We explore various methods to detect whether text is generated by LLMs.
As LLM output quality approaches, and sometimes surpasses, human writing, more and more AI-generated content is flooding the internet. Distinguishing between what’s written by a human and what’s churned out by a chatbot is becoming increasingly crucial. Several algorithms exist for this task, but none of them is fully reliable. Still, there are telltale signs that can tip you off when AI is at work. In this blog post, we’ll explore these indicators and some advanced techniques used to detect AI-generated text.
While telltale signs can hint at AI-generated content, they aren’t foolproof. To tackle this problem more systematically, researchers have developed advanced detection methods like watermarking and LLM Binoculars.
LLMs generate one token at a time by probabilistic sampling: they assign every possible token a probability of being the next token in the sequence and then sample one at random according to those probabilities. For watermarking, researchers proposed splitting the vocabulary into lists of green (favored) and red (disfavored) tokens, with the split derived from the preceding token so a detector can reconstruct it later. The sampling probabilities are tweaked to prefer green tokens while still occasionally emitting red ones.
This creates a hidden pattern in the text that doesn’t change how it reads but makes AI-generated content detectable. By counting the fraction of green tokens, a detector can identify whether the text came from a watermarked model.
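A minimal pure-Python sketch of this idea, assuming a hash-based green-list split, a toy vocabulary, and an illustrative bias strength `delta` (these are simplifying assumptions, not the exact construction from the watermarking literature):

```python
import hashlib
import random

def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    # Seed an RNG with a hash of the previous token so the generator and
    # the detector derive the same green/red split (illustrative scheme).
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    return set(rng.sample(vocab, int(len(vocab) * fraction)))

def bias_probs(probs: dict[str, float], green: set[str], delta: float = 1.5) -> dict[str, float]:
    # Upweight green tokens and renormalize; red tokens remain possible,
    # just less likely -- a "soft" watermark that barely changes fluency.
    weighted = {t: p * (delta if t in green else 1.0) for t, p in probs.items()}
    total = sum(weighted.values())
    return {t: w / total for t, w in weighted.items()}

def green_fraction(tokens: list[str], vocab: list[str]) -> float:
    # Detector side: count how many tokens land in the green list seeded
    # by their predecessor. Watermarked text scores well above the
    # fraction expected from unwatermarked text.
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:])
               if tok in green_list(prev, vocab))
    return hits / max(len(tokens) - 1, 1)
```

Note that detection needs no access to the model itself, only to the seeding scheme that reproduces each green list.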
Current AI detectors often struggle with high error rates, but in “Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text,” researchers proposed a promising approach called LLM Binoculars. It relies on “perplexity,” a measure of how surprising a piece of text is to a language model. Low perplexity corresponds to predictable, conventional text, while high perplexity corresponds to surprising text. As a general phenomenon, LLMs generate text with lower perplexity than humans do.
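To make “surprise” concrete: given the probability a model assigned to each observed token, perplexity is the exponential of the average negative log-probability. A plain-Python sketch of this standard definition:

```python
import math

def perplexity(token_probs: list[float]) -> float:
    # Average negative log-probability of the observed tokens,
    # exponentiated. High probabilities -> low perplexity.
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that gives every token probability 0.5 is, on average,
# "choosing between 2 options" at each step:
perplexity([0.5, 0.5, 0.5, 0.5])  # -> 2.0
```

Confidently predicted text (probabilities near 1) yields perplexity near 1, which is why LLM output tends to score lower than human writing.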
However, classifying text as LLM-generated simply because it has low perplexity often fails. The researchers illustrate this with the “capybara problem”: if we give ChatGPT an unusual prompt like “Can you write a few sentences about a capybara that is an astrophysicist?” we get back high-perplexity answers, since sentences about astrophysicist capybaras are far from typical English.
To circumvent the capybara problem, the researchers score text using a second feature called “cross-perplexity,” which measures how surprising a string is relative to a new baseline: what an LLM would be expected to produce. Combining perplexity with cross-perplexity allowed the researchers to detect LLM-generated text with very high accuracy.
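Under simplifying assumptions, the idea can be sketched as a ratio: the observer model’s perplexity on the text, divided by its expected surprise under a second (“performer”) model’s next-token distributions. The per-position distribution dictionaries below are a toy stand-in for real model outputs, and the exact normalization differs from the paper’s:

```python
import math

def log_perplexity(token_probs: list[float]) -> float:
    # Observer's average surprise at the tokens that actually appeared.
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def log_cross_perplexity(performer_dists: list[dict[str, float]],
                         observer_dists: list[dict[str, float]]) -> float:
    # Expected observer surprise under the performer's next-token
    # distribution at each position (simplified cross-perplexity).
    total = 0.0
    for perf, obs in zip(performer_dists, observer_dists):
        total += -sum(p * math.log(obs[tok]) for tok, p in perf.items())
    return total / len(performer_dists)

def binoculars_score(token_probs: list[float],
                     performer_dists: list[dict[str, float]],
                     observer_dists: list[dict[str, float]]) -> float:
    # Low scores mean the text is unsurprising *relative to what an LLM
    # would produce* -- capybara prompts raise both numerator and
    # denominator, so the ratio stays informative.
    return log_perplexity(token_probs) / log_cross_perplexity(
        performer_dists, observer_dists)
```

Normalizing by cross-perplexity is what defuses the capybara problem: weird prompts inflate raw perplexity, but they inflate the LLM baseline by a similar amount.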
As AI continues to blur the lines between machine- and human-written content, innovations like this become essential for maintaining trust and integrity in fields such as education, journalism and literature.