
March 31, 2025

Detecting AI-Generated Content: How to Spot the Bots 

We explore various methods to detect whether text is generated by LLMs.

As LLM output quality approaches, and in some cases surpasses, that of humans, more and more AI-generated content is flooding the internet. Distinguishing between what’s written by a human and what’s churned out by a chatbot is becoming increasingly crucial. Several algorithms exist for this task, but none of them is fully reliable. There are, however, telltale signs that AI is at work. In this blog post, we’ll explore these indicators and some advanced techniques used to detect AI-generated text.

General Telltale Signs of AI-Generated Content

  • Lack of personal touch: One interesting tell is that the output is extremely impersonal and lacks anecdotes; human-generated content reflects the subtle emotions and biases of its creator. For instance, when asked about its favorite books, a chatbot lists the benefits of various literary genres and notable books within each but doesn’t express a personal favorite.
  • Too much information: Another sign is that the output goes well beyond the original question, for example by supplying excessive background knowledge. Let’s say you ask about an apple. Instead of a simple response like, “An apple is a fruit,” a chatbot launches into a monologue describing its shape, colors, textures, and nutritional benefits. The unnecessary detail makes the answers feel oddly formal and detached.
  • Uniform sentence structure: AI often uses uniform sentence structures and wordings, and tends to repeat itself. You might see multiple sentences that begin with phrases like “One reason for this phenomenon is…” or “Another factor to consider is...”

ChatGPT Specific Signs

  • Overused phrases: ChatGPT overuses certain neutral words and phrases such as “tapestry,” “align seamlessly,” “leadership prowess,” or “commitment to continuous improvement.” If you catch these in an article, you might be looking at bot content.
  • Excessive wordiness: ChatGPT is much, much wordier than humans. If you see a simple point that is explained with multiple paragraphs or with lots of repetitive text, it is likely ChatGPT.
  • Neutral tone: ChatGPT text tends to be very uniform, with a neutral tone. People are emotional! Anything written by a person unconsciously conveys a point of view. ChatGPT writes “machine text” with no opinions or emotions.

While telltale signs can hint at AI-generated content, they aren’t foolproof. To tackle this problem more systematically, researchers have developed advanced detection methods like watermarking and LLM Binoculars.

Watermarking LLMs

LLMs generate one token at a time by probabilistic sampling: they assign every possible token a probability of being the next token in a sentence and then sample one at random according to those probabilities. Researchers proposed splitting the vocabulary into lists of green (allowed) and red (disallowed) tokens. The probabilities are tweaked to use more green tokens while still occasionally using red ones.

This creates a hidden pattern in the text that doesn’t change how it reads but makes AI-generated content detectable. By counting the number of green tokens, researchers can identify whether the text was written by an AI.
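The green/red scheme can be sketched in a toy form. This is an illustrative Python sketch, not the researchers’ implementation: the tiny vocabulary, the hash-based split, and the 50/50 green fraction are all assumptions here, and a real detector would operate on an actual tokenizer’s vocabulary.

```python
import hashlib
import math

def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    # Derive the green/red split from a hash of the previous token, so the
    # partition changes at every position but is reproducible by the detector.
    ranked = sorted(vocab, key=lambda t: hashlib.sha256((prev_token + t).encode()).hexdigest())
    return set(ranked[: int(len(ranked) * fraction)])

def z_score(tokens: list[str], vocab: list[str], fraction: float = 0.5) -> float:
    # Count how many tokens fall in their position's green list, then compare
    # against the count expected from unwatermarked text (a binomial null).
    hits = sum(tok in green_list(prev, vocab, fraction)
               for prev, tok in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    expected, var = n * fraction, n * fraction * (1 - fraction)
    return (hits - expected) / math.sqrt(var)
```

Because watermarked text lands in the green list at almost every position, its z-score sits far above the roughly zero score expected of human text, so a simple threshold (say, z above 4) flags it.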

LLM Binoculars

Current AI detectors often struggle with high error rates, but in “Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text,” researchers proposed a promising approach called LLM Binoculars. It uses “perplexity” (surprise), a measure of how surprising a piece of text is to a language model. Lower perplexity corresponds to more predictable text, while higher perplexity corresponds to more surprising text. As a general phenomenon, LLMs generate text with lower perplexity than humans do.
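Concretely, perplexity is the exponential of the average negative log-probability a model assigns to each token it observes. A minimal sketch (the probability lists below are made-up placeholders, not real model outputs):

```python
import math

def perplexity(token_probs: list[float]) -> float:
    # Perplexity = exp(mean negative log-likelihood per token).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# Predictable text: the model assigned high probability to every token.
low = perplexity([0.9, 0.8, 0.95])

# Surprising text: the model found several tokens unlikely.
high = perplexity([0.1, 0.05, 0.2])
```

A uniform guess over k choices at every step yields perplexity exactly k, which is why perplexity is often read as “the model’s effective number of choices per token.”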

However, classifying text as LLM-generated simply because it has low perplexity often fails. The researchers illustrate this with the “capybara problem”: if we give ChatGPT an unusual prompt like “Can you write a few sentences about a capybara that is an astrophysicist?” we get back high-perplexity answers about capybaras, because the content is far from typical English even though it is machine-generated.

To circumvent the capybara problem, the researchers classify text using a second feature called “cross-perplexity,” which measures how surprising a string is against a different baseline: the output an LLM would be expected to produce. Combining perplexity and cross-perplexity allowed the researchers to detect LLM-generated text with very high accuracy.
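The combined score can be sketched as a ratio: an observer model’s log-perplexity on the string, divided by the cross-perplexity between a “performer” model’s next-token distributions and the observer’s. This is a toy Python sketch under stated assumptions: the two-token distributions are fabricated placeholders, and a real implementation would run two actual LLMs (the paper pairs related models, such as a base model and its instruct-tuned variant).

```python
import math

def log_ppl(probs_of_observed: list[float]) -> float:
    # Average negative log-likelihood the observer model assigns to the
    # tokens that actually appear in the string.
    return -sum(math.log(p) for p in probs_of_observed) / len(probs_of_observed)

def cross_log_ppl(performer_dists: list[list[float]],
                  observer_dists: list[list[float]]) -> float:
    # At each position, score the performer's full next-token distribution
    # against the observer's probabilities (a per-position cross-entropy).
    total = 0.0
    for p_dist, q_dist in zip(performer_dists, observer_dists):
        total += -sum(p * math.log(q) for p, q in zip(p_dist, q_dist))
    return total / len(performer_dists)

def binoculars_score(probs_of_observed, performer_dists, observer_dists) -> float:
    # Low scores suggest machine text: the string is only about as surprising
    # to the observer as typical LLM output would be. Capybara-style prompts
    # raise both numerator and denominator, so the ratio stays informative.
    return log_ppl(probs_of_observed) / cross_log_ppl(performer_dists, observer_dists)
```

The intuition: human text is more surprising than an LLM’s own expected output (score above the threshold), while machine text, however exotic the topic, is not.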

Conclusion

As AI continues to blur the lines between machine- and human-written content, techniques like watermarking and LLM Binoculars become essential for maintaining trust and integrity in fields such as education, journalism, and literature.
