Method note

How AI writing detection works

Plain English first: the diagnostic looks for clusters of machine-like writing habits — stock phrases, flat rhythm, structural templates. Formal terms second: it combines phrase, structure, and rhythm signals into a bounded 0–100 score with transparent, individually inspectable flags.

The short version

You paste text. The diagnostic splits it into sentences, measures rhythm variation, checks for known phrase and structure patterns, and returns a score with notes explaining what triggered it. Everything runs in the browser. The minimum is 50 words; 150+ gives more stable results.
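Concretely, the rhythm pass might look like the sketch below. This is a minimal sketch, assuming a naive sentence splitter and one common burstiness definition (the Goh–Barabási index); the function and field names are illustrative, not the tool's actual code.

    // Minimal sketch of the rhythm metrics. The splitter and the
    // burstiness formula are assumptions, not WROITER's actual code.
    interface RhythmMetrics {
      wordCount: number;
      sentenceCount: number;
      avgSentenceLength: number;
      sentenceLengthStdDev: number;
      burstiness: number;
    }

    function rhythmMetrics(text: string): RhythmMetrics {
      // Naive split on terminal punctuation; a real splitter also
      // handles abbreviations, quotes, and ellipses.
      const sentences = text
        .split(/(?<=[.!?])\s+/)
        .map((s) => s.trim())
        .filter((s) => s.length > 0);
      const lengths = sentences.map((s) => s.split(/\s+/).length);
      const wordCount = lengths.reduce((a, b) => a + b, 0);
      const mean = wordCount / sentences.length;
      const variance =
        lengths.reduce((a, b) => a + (b - mean) ** 2, 0) / sentences.length;
      const sd = Math.sqrt(variance);
      return {
        wordCount,
        sentenceCount: sentences.length,
        avgSentenceLength: mean,
        sentenceLengthStdDev: sd,
        // Goh–Barabási index: -1 is perfectly regular, +1 is maximally
        // bursty. A flat, metronomic draft sits near -1.
        burstiness: (sd - mean) / (sd + mean),
      };
    }

Under this definition, near-zero sentence-length variation pushes burstiness toward -1, which is exactly the metronomic rhythm described below.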

The point is not the number. The point is that you can read the flags, find them in the text, and decide for yourself whether the tool is reacting to something real.

Signal families

The diagnostic currently tracks five pattern families; a minimal detection sketch follows the list. Each one is documented with examples and rewrite guidance in the Pattern Library.

  • AI-scented vocabulary — words like "delve," "utilize," "landscape," "moreover" that cluster unnaturally in generated drafts.
  • AI-typical phrases — templates that let a model keep talking before it says anything: "it is important to note," "in today's rapidly evolving," "a comprehensive overview."
  • Throat-clearing intros and formulaic conclusions — openings that announce what the text will do instead of doing it, and endings that summarize what the reader just read.
  • Metronomic rhythm — passages where sentence lengths barely vary, producing the smooth, even cadence detectors associate with generated prose.
  • Over-signposting and pivot crutches — "first / second / finally" ladders and "not just X, but Y" constructions repeated past the point of usefulness.
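
Mechanically, a pattern family is little more than a weighted, labeled list of expressions. The sketch below shows the shape; both entries and their patternId values are invented for illustration, and the real library (see the Pattern Library) is far larger.

    // Two invented entries showing the shape of a pattern family.
    interface PatternDef {
      patternId: string;
      label: string;
      severity: number; // weight used later when assembling the score
      regex: RegExp;
    }

    const PATTERNS: PatternDef[] = [
      {
        patternId: "phrase.important-to-note",
        label: "AI-typical phrase",
        severity: 2,
        regex: /\bit is important to note\b/gi,
      },
      {
        patternId: "vocab.delve",
        label: "AI-scented vocabulary",
        severity: 1,
        regex: /\bdelve\b/gi,
      },
    ];

    // Count hits for each pattern and keep only the ones that fired.
    function scan(text: string) {
      return PATTERNS.map((p) => ({
        patternId: p.patternId,
        label: p.label,
        severity: p.severity,
        count: (text.match(p.regex) ?? []).length,
      })).filter((f) => f.count > 0);
    }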

What the tool returns

Each run produces the following (a rough type sketch of the result shape appears after the list):

  • A slop score (0–100).
  • Five rhythm metrics: word count, sentence count, average sentence length, sentence-length standard deviation, and burstiness.
  • Flag-level detail for every triggered pattern: patternId, label, severity, count, and detectorNote.
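
Typed out, the result might look roughly like this. The flag field names come straight from the list above; the wrapper names (ScanResult, slopScore) are camelCase guesses, not a published API.

    // Rough sketch of the result shape; wrapper names are assumptions.
    interface Flag {
      patternId: string;    // stable identifier for the pattern
      label: string;        // human-readable name
      severity: number;     // weight class
      count: number;        // how many times it fired
      detectorNote: string; // what the detector was reacting to
    }

    interface ScanResult {
      slopScore: number; // bounded 0–100
      metrics: {
        wordCount: number;
        sentenceCount: number;
        avgSentenceLength: number;
        sentenceLengthStdDev: number;
        burstiness: number;
      };
      flags: Flag[];
    }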

The flags are the useful part. The score is a summary. If you only look at the score, you are using the tool wrong.

How the score is assembled

Each flag carries a severity weight, and repeated hits on the same pattern contribute more than a single occurrence. The weighted total is normalized into the 0–100 range so results stay comparable across different sample lengths.
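
A minimal sketch of that assembly, assuming linear severity-times-count weighting, per-length scaling, and a saturating curve (the actual weighting and normalization are not published, so treat every constant here as a placeholder):

    // Assumed scoring model, not the published one.
    type ScoredFlag = { severity: number; count: number };

    function slopScore(flags: ScoredFlag[], wordCount: number): number {
      // Repeated hits add linearly: severity times count.
      const raw = flags.reduce((sum, f) => sum + f.severity * f.count, 0);
      // Scale per 100 words so long samples are not punished for length.
      const density = (raw / wordCount) * 100;
      // Saturating map keeps the score bounded; 5 is a tuning constant.
      return Math.round(100 * (1 - Math.exp(-density / 5)));
    }

Under this model, many moderate flags and a few severe ones can land on the same number, which is why the next paragraph matters.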

In practice: a score of 65 might come from many moderate flags (a few stock phrases, mild rhythm flatness, some over-signposting) or from fewer but more severe ones (heavy phrase-template repetition across the whole sample). The flags tell you which case you are looking at. The number alone does not.

What this score means

A higher score means the sample overlaps more strongly with AI-typical surface patterns in phrasing, structure, and rhythm. Rough calibration:

  • 0–15: Minimal overlap. Few or no flags triggered. The text does not resemble common AI output at the surface level.
  • 16–40: Moderate overlap. Some patterns present — worth inspecting the flags, but many human-written texts land here, especially formal or heavily edited prose.
  • 41–70: Strong overlap. Multiple pattern families triggered. The text deserves close review, but context (genre, revision history, house style) still matters before drawing conclusions.
  • 71–100: Very strong overlap. Dense clustering of AI-typical patterns. If this is supposed to be original writing, something needs investigation — but investigation, not accusation.

These ranges are interpretive guidelines, not hard thresholds. A 42 in an academic abstract means something different from a 42 in a personal essay.
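
For reporting, the calibration table reduces to a soft band lookup. Only the labels and cutoffs below come from the list above; since they are guidelines rather than thresholds, any code using them should surface the flags alongside the band.

    // Interpretive bands from the calibration list; cutoffs are soft.
    function band(score: number): string {
      if (score <= 15) return "Minimal overlap";
      if (score <= 40) return "Moderate overlap";
      if (score <= 70) return "Strong overlap";
      return "Very strong overlap";
    }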

What this score does not mean

The score does not prove who wrote the text. It does not prove cheating. It does not prove intent. It should never be used as standalone evidence in any process that carries real consequences for a person. Formal prose, second-language writing, and carefully edited copy can all trigger elevated scores without any AI involvement. For the broader reliability case, read Do AI Detectors Work? and the False Positive Hall of Fame.

Why transparent flags matter

A detector that only gives you a percentage is asking you to trust it. WROITER shows the pattern notes because the whole point is that you should not trust any tool blindly — including this one. Verify the flags in the text. If you cannot find what the tool is reacting to, the score is not useful. That is also why the Limitations page sits next to the method page, not buried below it.

Reliability discipline

Interpret every score with context: genre, revision history, known failure modes. Compare against real-world examples in the False Positive Hall of Fame. Read How AI Detectors Work for the broader market framing. If the stakes are high, the detector is the beginning of your process, not the end of it.