Spotting AI slop · FeedBagel Help

Why bother detecting AI?

The open web has a flood problem. Content farms generate thousands of articles a day from the same handful of prompts, and a lot of them now show up in RSS feeds. They're grammatically fine, often mid-length, sometimes ranked highly by search engines. They just don't have anything in them.

We don't want to ban AI-assisted writing. People use AI to translate, summarise, and tidy up their drafts, and that's fine. What we want to demote is articles that read like nobody wrote them. That's what the detector flags.

How does the score work?

The detector runs over the article body and counts roughly 20 patterns that are known to appear more often in machine-written text than in human text. Each pattern contributes points; the points are summed and capped at 100. Higher = more AI-like.

It's a heuristic, not a model. There's no neural network making the call, no API in the loop. The advantage is you can read this page and see exactly what it's measuring, and we can adjust patterns as the AI writing style evolves. The trade-off is that a careful editor can polish AI output past our patterns. We treat the score as one signal among several, not a verdict.
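
The sum-and-cap scoring described above can be sketched roughly as follows. This is a minimal illustration, not the production detector: the `PATTERNS` table, the three regexes, the point values, and the `ai_score` name are all assumptions made up for the example, and only three of the patterns are shown.

```python
import re

# Hypothetical pattern table: (name, compiled regex, points per match).
# The point values are illustrative only, not FeedBagel's real weights.
PATTERNS = [
    ("hedging_filler",
     re.compile(r"\bit is (?:important to note|worth noting)\b", re.I), 4),
    ("formulaic_opener",
     re.compile(r"\bin today['\u2019]s fast-paced world\b", re.I), 6),
    ("vague_universal",
     re.compile(r"\ba (?:myriad|plethora|wide array) of\b", re.I), 3),
]

MAX_SCORE = 100  # the page states that the summed points are capped at 100


def ai_score(body: str) -> int:
    """Sum the points from every pattern match, then cap the total."""
    total = 0
    for _name, regex, points in PATTERNS:
        hits = len(regex.findall(body))
        total += hits * points
    return min(total, MAX_SCORE)
```

Because every rule is a plain regex with a fixed weight, a score can be explained by listing which patterns matched and how often, which is the readability advantage the heuristic approach is meant to offer.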

What patterns we look for

Roughly in order of how confidently each pattern points to AI:

The “delve” vocabulary cluster. Words like delve, showcase, underscore, pivotal, intricate, multifaceted, nuanced, comprehensive, meticulous, tapestry, landscape, leverage, navigate, foster. Kobak et al. (2024) analysed 14 million scientific abstracts and found these words spiked 5 to 25 times their pre-ChatGPT frequency after late 2022. The strongest single signal we have.
Discourse-marker overuse. AI loves connectives: moreover, furthermore, additionally, however, thus, therefore, in conclusion, overall. Humans use them sparingly. A study of news writing (Muñoz-Ortiz et al., 2023) found that LLM text leans on them far more heavily.
Hedging filler. “It is important to note”, “it is worth noting”, “it should be noted”, “keep in mind that”. Padding that adds no information.
Formulaic openers. “In today's fast-paced world”, “in the realm of”, “in the ever-evolving”, “when it comes to”. Boilerplate from a thousand SEO templates.
Formulaic closers. “In conclusion”, “overall”, “all in all”, “at the end of the day”.
Vague universals. “A myriad of”, “a plethora of”, “a wide array of”, “throughout history”, “navigating the complexities”. Quantifies without quantifying.
Participle openers. Sentences that start with Leveraging…, Utilizing…, Ensuring…, Empowering…. CMU researchers found LLMs over-use this construction.
Sentence-length uniformity. Humans vary sentence length wildly. AI text clusters around a smooth average. Low variance is a tell.
Low lexical diversity. Human writing reaches for varied vocabulary. AI text re-uses the same words. Measured as the ratio of unique words to total words.
Metaphor cluster. “Tapestry of life”, “journey through”, “the fabric of”, “stand the test of time”. Stock metaphors AI reaches for when asked to wrap up.
Em-dash density. Above eight em-dashes per thousand words, the article starts to be penalised; above fifteen, the penalty grows. Writers who genuinely love em-dashes sit around five per thousand; slop routinely runs 15–30. The threshold is set high enough that a human stylist won't get punished.
Dead giveaways. Strings like “as an AI language model”, “as of my last update”, “I cannot provide”. Rare, but when one shows up the article's score gets a hard floor of 85.
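
Several of the patterns above are simple measurements rather than phrase lists. The sketch below shows one plausible way to compute them. The function names, the naive sentence splitter, and the word tokenizer are assumptions for illustration; only the em-dash thresholds (8 and 15 per thousand words), the unique-to-total word ratio, and the floor of 85 come from the descriptions above.

```python
import re
import statistics

# Giveaway strings quoted in the list above, lower-cased for matching.
DEAD_GIVEAWAYS = (
    "as an ai language model",
    "as of my last update",
    "i cannot provide",
)


def words(text: str) -> list[str]:
    """Naive tokenizer (an assumption): runs of ASCII letters and apostrophes."""
    return re.findall(r"[A-Za-z']+", text.lower())


def sentence_length_stdev(text: str) -> float:
    """Spread of sentence lengths in words; low values mean uniform sentences."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(words(s)) for s in sentences]
    return statistics.pstdev(lengths) if lengths else 0.0


def type_token_ratio(text: str) -> float:
    """Lexical diversity: unique words divided by total words."""
    tokens = words(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def em_dashes_per_thousand(text: str) -> float:
    """Em-dash count normalised to a thousand words."""
    n_words = len(words(text))
    return text.count("\u2014") * 1000 / n_words if n_words else 0.0


def em_dash_tier(density: float) -> int:
    """0 below the first threshold, 1 above eight, 2 above fifteen."""
    if density > 15:
        return 2
    if density > 8:
        return 1
    return 0


def apply_giveaway_floor(score: int, text: str) -> int:
    """Raise the score to at least 85 if any giveaway string appears."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in DEAD_GIVEAWAYS):
        return max(score, 85)
    return score
```

The floor is applied with `max`, so an article that already scores above 85 keeps its higher score. How the variance and diversity measurements translate into points is not specified here, so the sketch stops at the raw numbers.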

What we deliberately don't check

Oxford commas. Folklore, not data.
Tone or topic. We don't care if the article is positive, negative, political, niche. The detector reads style, not content.

What does the score actually mean?

0–30: likely human. Few or none of the patterns above trip.
30–60: mixed. Some patterns trip. Could be lightly AI-assisted, could be a writer with a formal style. Don't read too much into it.
60–80: likely AI. Multiple strong patterns clustering. Worth a closer look.
80–100: very likely AI. Hits across most categories, or a direct AI giveaway.
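
The bands above can be expressed as a small lookup. Since the listed ranges share their endpoints (30, 60, and 80 each appear in two bands), this sketch assumes a boundary score belongs to the higher band; that choice, like the `score_band` name, is an assumption and not something this page specifies.

```python
def score_band(score: int) -> str:
    """Map a 0-100 detector score to its label.

    Assumption: a score exactly on a boundary (30, 60, 80) falls
    into the higher band.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "very likely AI"
    if score >= 60:
        return "likely AI"
    if score >= 30:
        return "mixed"
    return "likely human"
```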

How does this affect what I see?

The score feeds into the bagel rating: a high AI-likelihood pulls the bagel score down, and very high scores cap it at the Stale tier. It does not currently hide articles from feeds or tag pages. You can still read everything; the badge just tells you what you're looking at.

Sources

The pattern set is grounded in published work, not vibes. Headline references:

Kobak et al. (2024), Delving into ChatGPT usage in academic writing through excess vocabulary. The source of the “delve” cluster.
Liang et al. (Stanford HAI, 2024), Mapping the Increasing Use of LLMs in Scientific Papers. Corroborated the Kobak findings on arXiv and bioRxiv.
Muñoz-Ortiz et al. (2023), Contrasting Linguistic Patterns in Human and LLM-Generated News Text. The source for the discourse-marker, sentence-length, and lexical-diversity findings.
Reinhart et al. (CMU, 2024), Do LLMs write like humans? The source for present-participle clause overuse.