Published 2026-04-03 | Updated 2026-04-04

Do AI detectors work?

Short answer: they detect some machine-like signals. They are not reliable enough to serve as standalone proof of anything. The longer answer is more useful — and more important if you are building a review process around detector output.

What "work" actually means

When people ask whether AI detectors work, they usually mean one of three things — and the answer is different for each:

  • "Can they detect AI writing?" — Yes, often. Most detectors reliably flag text that was generated wholesale by a language model and published without editing. The surface patterns are strong enough that even simple detectors catch them.
  • "Are they accurate?" — It depends on the text. Accuracy degrades on formal writing, edited drafts, second-language prose, hybrid workflows, and anything from a model the detector was not trained on. Published accuracy claims (95%, 98%) are measured on curated test sets and do not generalize to all text in all contexts.
  • "Can I trust the result enough to act on it?" — Not alone. A detector score is a triage signal. It tells you what to look at more closely. It does not tell you what to conclude. The gap between "this text triggered pattern flags" and "this person cheated" is enormous, and crossing it requires evidence a detector cannot provide.

What detectors are good at

Detectors are most reliable when the text is:

  • Generated in one pass, without significant editing.
  • Longer than roughly 150 words (scores on short samples are unstable).
  • From a model the detector was trained on or exposed to.
  • In a genre with high expected variation (personal essays, creative writing, informal blog posts) — where the flatness of AI output is most visible.

Under these conditions, false-positive rates are low and the signal is genuinely useful. The problem is that most real-world review scenarios do not look like this.

Where detectors fail

False positives — human text flagged as AI — are not rare edge cases. They are a predictable consequence of how detectors work. The features used to identify AI writing (low variation, formal vocabulary, structural regularity) appear naturally in:

  • Academic writing — constrained by conventions that compress variation.
  • Second-language writing — conservative word choices that overlap with AI defaults.
  • Heavily edited text — editing smooths out the quirks detectors rely on.
  • Legal, medical, and institutional prose — formal and repetitive by design.
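Sentence-length variation (sometimes called "burstiness") is one of the simplest features in this family, and it illustrates the problem. The sketch below is a toy, not any real detector's implementation, but it shows how uniform, formal prose naturally scores as "flat" in exactly the way machine output does:

```python
import re
import statistics

def sentence_length_variation(text: str) -> float:
    """Coefficient of variation of sentence lengths, in words.

    A crude stand-in for the 'burstiness' signal some detectors use:
    low values mean uniform sentence lengths, which reads as machine-like
    even when the text is human-written formal or institutional prose.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.stdev(lengths) / mean if mean else 0.0

# Formal, convention-bound prose: near-identical sentence lengths.
uniform = ("The court reviewed the filing. The parties submitted briefs. "
           "The judge issued a ruling.")
# Informal narrative prose: large swings in sentence length.
varied = ("I hesitated. Then, against every instinct I had built up over "
          "a decade of careful work, I sent the email anyway.")
# The uniform passage scores lower (flatter) than the varied one,
# even though both are human-written.
```

The point of the toy is the last comment: the legal-register passage is indistinguishable from "flat" machine output on this axis, which is why the genres listed above get flagged.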

False negatives — AI text that passes undetected — are equally real. Lightly edited AI drafts, hybrid human-AI workflows, and text from newer models can sail through detectors trained on older output. A clean score does not prove human authorship any more than a high score proves AI authorship.

The False Positive Hall of Fame documents specific cases. The limitations page explains which genres and conditions carry the highest false-positive risk.

Why two detectors give different answers

Different training data, different feature weights, different thresholds. One tool may emphasize rhythm. Another may emphasize vocabulary. A text with flat rhythm but varied vocabulary will score high on one and low on the other. This is normal. If two detectors disagree, the signal is ambiguous — treat it that way. For the full mechanism breakdown, see How AI Detectors Work.
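A toy sketch makes the disagreement concrete. The feature names, weights, and threshold below are all invented for illustration; real detectors use far richer features, but the mechanism is the same:

```python
def score(features: dict, weights: dict, threshold: float) -> bool:
    """Weighted sum of feature values; flag as AI-like above threshold."""
    total = sum(weights[k] * features[k] for k in weights)
    return total >= threshold

# One hypothetical text, scored on a 0-1 scale per feature:
# its rhythm is flat (AI-like) but its vocabulary is varied (human-like).
text_features = {"rhythm_flatness": 0.9, "vocab_uniformity": 0.2}

detector_a = {"rhythm_flatness": 0.8, "vocab_uniformity": 0.2}  # weights rhythm
detector_b = {"rhythm_flatness": 0.2, "vocab_uniformity": 0.8}  # weights vocabulary

flag_a = score(text_features, detector_a, threshold=0.5)  # 0.76 -> flagged
flag_b = score(text_features, detector_b, threshold=0.5)  # 0.34 -> not flagged
```

Same text, same threshold, opposite verdicts, and neither detector is "broken": they simply weigh the evidence differently.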

How to use detector output responsibly

If you are building a review process — for a classroom, a newsroom, a content team — these rules reduce the chance of a bad decision:

  1. Triage, not judgment. Use the score to decide what gets a closer look. Do not use it to decide the outcome.
  2. Read the flags. A detector that shows you what it reacted to is more useful than one that gives you a percentage and nothing else. The WROITER Diagnostic exposes pattern-level detail for this reason.
  3. Check context. Genre, revision history, whether the writer is working in a second language, whether the text was collaboratively edited. All of these affect what a score means.
  4. Keep provenance. Draft history, outlines, version notes, and timestamps are stronger evidence than any detector output. Build your process to collect them.
  5. Document your false positives. When you find a clear false positive, record it. Over time, your internal calibration will be more reliable than any published accuracy number.
  6. Conversation before accusation. The cost of a wrongful accusation — to a student, a writer, a colleague — is almost always higher than the cost of asking a question first.
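For teams that automate intake, the triage rule in step 1 can be sketched in code. Every field name and threshold below is illustrative, not a recommendation of specific values; the one design choice worth copying is that the function's outputs are next steps, never verdicts:

```python
from dataclasses import dataclass

@dataclass
class Submission:
    detector_score: float     # 0-1, from whichever detector you use
    word_count: int
    has_draft_history: bool   # provenance: drafts, outlines, version notes

def triage(sub: Submission, review_threshold: float = 0.7) -> str:
    """Map detector output to a next step, never to an outcome.

    Possible results are 'no_action' and 'human_review' only: the score
    can route a submission, but a person weighing context and provenance
    makes any actual decision. The 0.7 threshold is an arbitrary example.
    """
    if sub.word_count < 150:
        return "no_action"    # short samples give unstable scores
    if sub.detector_score < review_threshold:
        return "no_action"
    if sub.has_draft_history:
        return "no_action"    # provenance outweighs any detector score
    return "human_review"
```

Note that even the "human_review" branch leads to steps 2-6 above, not to an accusation.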

The bottom line

AI detectors work well enough to be useful as a triage tool. They do not work well enough to be trusted as a decision tool. The difference between those two things is the difference between a responsible review process and a dangerous one. If the stakes are low, a detector score is a reasonable starting point. If the stakes are high, it is the beginning of your investigation, not the end.