Pattern Profile Specification v2.2

Status: Published
Effective date: 2026-05-28
Canonical URL: https://wroiter.com/method/spec/
Previous version: v1.0 (archived)
Maintained by: WROITER / 3AM Energy
License: CC BY 4.0
Implementation: v1.5.0

1. Introduction

This document specifies the WROITER Pattern Profile, a heuristic method for detecting surface-level patterns associated with AI-generated text. The specification defines the input requirements, detector inventory, output format, and known failure modes for version 2.2 of the method.

The diagnostic does not determine authorship. It measures overlap between a text sample and a documented set of structural, lexical, rhythmic, and rhetorical patterns that appear disproportionately in large-language-model output. It surfaces findings (what was detected, where, and how often) and stops short of compressing them into a single verdict number. See “Why we removed the score” for the rationale.

WROITER is the reference implementation of this specification. The specification is published under CC BY 4.0—anyone may implement, fork, or build on this method with attribution.

2. Definitions

Term	Definition
Sample	The text submitted for analysis (minimum 15 words).
Flag	A single triggered detector, with a severity level, a count, and a human-readable detail string.
Profile	The diagnostic’s user-facing output: per-detector flags, grouped by signal family, with counts and locations. The profile is the headline; there is no derived score.
Signal family	A category grouping detectors by the type of evidence they collect (six families: lexical, meta, structure, rhetoric, persona, style).
Severity	One of `high`, `medium`, or `low`, assigned per detector based on signal specificity.
Fragile pattern	A detector whose signal is unreliable in isolation. Surfaced as a marker on the flag; consumers (UI, MCP, API) decide how to weight isolated fragile-only findings.
Direct leak pattern	A detector whose signal is highly specific to AI output (assistant-persona phrasing, AI self-disclosure). Surfaced as a marker on the flag.

3. Input Requirements

Type: plain text string
Minimum length: 15 words
No preprocessing required; the diagnostic handles sentence splitting and normalization internally
Optimal sample length for stable results: 150 words or more
Samples of 15–99 words get partial coverage: the density and structure detectors (rhythm, uniformity, repetition, triads) need ~100+ words to fire, so a short-sample profile reflects the lexical and phrase-level detectors only. This is what makes checking short copy — titles, meta descriptions, ad lines — useful, but it is not a full structural read.
Pattern coverage is less reliable for samples under 100 words — a single triggered detector dominates the profile, and density-based detectors lack the sample to fire confidently
The detector inventory targets English-language surface patterns. Non-English input is not rejected, but most detectors will not fire meaningfully; word-count for non-whitespace-separated languages (CJK) is unreliable
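The word-count bands above can be sketched as a small classifier. This is an illustration in Python, not the engine's code: the function name, the tier labels, and the whitespace tokenization are all assumptions.

```python
def coverage_tier(text: str) -> str:
    """Classify a sample by the word-count bands in the input requirements."""
    # Whitespace tokenization is an assumption. It also shows why CJK input
    # (no spaces between words) can undercount badly.
    words = len(text.split())
    if words < 15:
        return "rejected"  # below the 15-word minimum
    if words < 100:
        return "partial"   # lexical and phrase-level detectors only
    if words < 150:
        return "full"      # density detectors can fire, results may be unstable
    return "optimal"       # 150 words or more
```

Under this sketch a CJK paragraph with no spaces counts as a single word and lands in the "rejected" tier, which matches the limitation described above.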

4. Detector Inventory

Thirty-three detectors are defined across six signal families. Each entry specifies: detector ID, label, signal family, detection logic, activation threshold, and severity. Fragile patterns are marked; their presence in isolation is a weaker signal than presence alongside corroborating detectors. Direct leak patterns are marked; their presence is highly specific to AI output regardless of co-occurrence.

4.1 Lexical Family

BANNED_WORDS

AI-Scented Vocabulary

Family: lexical Severity: medium

Checks for the presence of 38 vocabulary items disproportionately common in AI output:

delve, tapestry, landscape, multifaceted, pivotal, vibrant, foster, underscore, testament, intricate, groundbreaking, renowned, embark, navigate, realm, crucial, paramount, endeavor, holistic, synergy, leverage, utilize, robust, seamless, comprehensive, myriad, plethora, uncover, unveil, streamline, harness, empower, spearhead, bolster, catalyze, cornerstone, game-changer, cutting-edge

A subset of 12 terms is designated soft-banned (reduced signal weight, requiring higher density thresholds): landscape, vibrant, foster, navigate, crucial, leverage, utilize, robust, seamless, comprehensive, myriad, plethora.

Triggers when: ≥2 non-soft-banned terms present, OR ≥4 total terms with ≥1 non-soft-banned, OR ≥5 total terms at density >5 per 1000 words.

▪ Fragile pattern — dampened in isolation
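The three-branch trigger rule for BANNED_WORDS can be written as a single boolean expression. A minimal sketch in Python, assuming hits are counted per occurrence and that density is total hits per 1000 words; the function and parameter names are hypothetical.

```python
def banned_words_triggers(hard_hits: int, soft_hits: int, word_count: int) -> bool:
    """Evaluate the BANNED_WORDS trigger rule from hit counts.

    hard_hits: matches on non-soft-banned terms (e.g. "delve", "tapestry")
    soft_hits: matches on the 12 soft-banned terms (e.g. "robust", "leverage")
    """
    total = hard_hits + soft_hits
    # Density is assumed to mean total hits per 1000 words.
    density = total / word_count * 1000 if word_count else 0.0
    return (
        hard_hits >= 2
        or (total >= 4 and hard_hits >= 1)
        or (total >= 5 and density > 5)
    )
```

The third branch is the only way a sample containing nothing but soft-banned terms can trigger, and it needs both volume and density to do so.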

BANNED_PHRASES

AI-Typical Phrases

Family: lexical Severity: high

Checks for 37 stock phrase patterns, including: “in today’s rapidly evolving landscape,” “rapidly evolving landscape,” “it’s important to note,” “it is important to note,” “it’s worth noting,” “in the realm of,” “plays a key/crucial/vital/pivotal role,” “commitment to excellence,” “at the forefront,” “paves the way,” “a testament to,” “let’s dive/delve/explore/dig,” “at the end of the day,” “the bottom line is,” “it goes without saying,” “needless to say,” “stands as a,” “serves as a,” “whether you’re [X] or [Y],” “the good news is,” “here’s the thing,” “the reality is,” “there are [N] key/main/critical/important reasons/ways/factors/steps/benefits/strategies/tips/things.”

Triggers when: any one or more phrase patterns match.

The v1.6.1 gap-closure batch added template-grade hype and empty closing flourishes that fire on first hit: “paradigm shift,” “game-changer / game-changing,” “redefine the (very) fabric of,” “results speak for themselves,” “the rest is history,” “only time will tell.”

Intentional cross-family overlap. Three rhetorical-pivot patterns (“not just [X] but,” “isn’t just about,” “it’s not just”) also live in PIVOT_CRUTCH (rhetoric family). The same surface phrase contributes to both family counts; the corroboration is by design.

HEDGING

Hedging Language

Family: lexical Severity: medium

Detects epistemic hedge constructions across 13 pattern groups, including: “it could be argued,” “one might say/argue/suggest,” “it seems that/like,” “it appears that/to,” “arguably,” “it’s possible/plausible/conceivable that,” “to some extent,” “in some ways,” “it’s worth considering/mentioning/pointing out,” “this suggests that,” “this could/might/may indicate/suggest/imply.”

Triggers when: ≥2 matches.

▪ Fragile pattern

EMPTY_INTENSIFIERS

Empty Intensifiers

Family: lexical Severity: medium

Detects overuse of adverbial intensifiers that add emphasis without meaning:

incredibly, extremely, remarkably, exceptionally, undeniably, undoubtedly, absolutely, fundamentally, essentially, particularly, significantly, profoundly, tremendously, vastly, staggering, tirelessly, seamlessly, effortlessly, relentlessly

The last five (staggering, tirelessly, seamlessly, effortlessly, relentlessly, added in v1.6.x) are vibe-inflation words carried as a density-only signal — a single occurrence stays clean.

Triggers when: ≥3 matches AND density >3 per 1000 words.

TRANSITION_OVERUSE

Transition Word Overuse

Family: lexical Severity: medium

Counts formal transition words:

However, Moreover, Nevertheless, Furthermore, Consequently, Additionally, Nonetheless, Therefore, Thus, Hence, Accordingly, Meanwhile, Subsequently, Alternatively, Conversely

Triggers when: ≥4 matches AND density >4 per 1000 words.

▪ Fragile pattern
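A count-plus-density detector of this kind might look like the following. It is a sketch in Python, not the engine's regex: the pattern, the whitespace word count, and the function name are assumptions. Matching is case-sensitive, as described for TRANSITION_OVERUSE in §4.7.

```python
import re

TRANSITIONS = [
    "However", "Moreover", "Nevertheless", "Furthermore", "Consequently",
    "Additionally", "Nonetheless", "Therefore", "Thus", "Hence",
    "Accordingly", "Meanwhile", "Subsequently", "Alternatively", "Conversely",
]
# No IGNORECASE flag: lowercase mid-clause uses ("however briefly") are skipped.
_PATTERN = re.compile(r"\b(?:" + "|".join(TRANSITIONS) + r")\b")

def transition_overuse(text: str) -> tuple[int, bool]:
    """Return (match count, triggered) for the >=4 matches AND >4 per 1000 rule."""
    matches = len(_PATTERN.findall(text))
    words = len(text.split())  # whitespace word count is an assumption
    density = matches / words * 1000 if words else 0.0
    return matches, matches >= 4 and density > 4
```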

ABSOLUTISM_OVERUSE (added in engine v1.6.2)

Absolutism Overuse

Family: lexical Severity: low

Flags repetition of a single universal quantifier or totality word:

every, each, all, always, never, none, completely, fully, entirely, totally

Repetition-based, not presence-based — one “every” is fine; the signal is hammering the same one. Calibrated against a 168-document human-prose corpus where no document repeats a single absolute more than 7 times. Total absolute density does not separate AI text from good writing (good personal essays run denser in absolutes than the document that motivated this detector); single-word repetition does.

Triggers when: one listed word occurs ≥8 times AND at density >2 per 1000 words (the density floor spares very long documents whose count is actually sparse).

4.2 Meta Family

META_INTRO

Throat-Clearing Intro

Family: meta Severity: high

Scans the first paragraph (or first 300 characters) for openings that announce the text’s own intent rather than making a concrete point. Article types include article, post, guide, piece, blog, essay. Action verbs include explore, discuss, examine, look at, dive into, delve into, cover. Patterns include: “In this [article/post/guide/piece/blog/essay] ...,” “We’ll/We will explore/discuss/examine/cover...,” “This [article/post/guide] will explore/discuss/examine/cover...,” “Let’s/Let us dive/delve/explore/dig in/into/deep.”

Allows up to 40 characters of preamble before the templated phrase, so “Hi there! In this article we will explore...” still triggers, but the same phrase placed after a preamble longer than 40 characters does not.

Triggers when: pattern matches in the opening block.

META_OUTRO

Formulaic Conclusion

Family: meta Severity: medium

Scans the final third of the text for stock summary markers, including: “In conclusion,” “In summary,” “In closing,” “To sum up,” “To summarize,” “To conclude,” “To wrap up,” “The key takeaway is,” “The main takeaway is,” “The key point is,” “The main point is.”

Triggers when: pattern matches in the final third of the sample.

ESSAY_THESIS_ANNOUNCEMENT

Essay-Thesis Announcement

Family: meta Severity: high

Detects the school-essay template: an explicit Introduction: heading at the start of the opening block paired with a thesis-announcement sentence. Document types accepted: essay, paper, article, discussion, analysis. Announcement verbs accepted: will, aims to, seeks to, examines, explores, discusses, analyzes, compares, considers, evaluates. Plus standalone phrasings: “the purpose of this [essay/paper/article],” “this paper argues,” “this essay argues.”

Triggers when: both the heading and the thesis-announcement are present in the opening block.

▪ Fragile pattern

OVER_SIGNPOST

Over-Signposting

Family: meta Severity: medium

Counts explicit sequence markers: First/Firstly, Second/Secondly, Third/Thirdly, Finally, Additionally, Furthermore, Moreover.

Matching is case-sensitive — see §4.7 Implementation notes.

Triggers when: ≥3 matches.

MICRO_SUMMARY

Compulsive Micro-Summaries

Family: meta Severity: medium

Detects mid-text restatement markers across three pattern classes:

Standalone openers (matched with trailing comma): “Overall,” “Taken together,” “All in all,” “In essence,” “At its core,” “Put simply,” “Simply put.”
Takeaway constructions: “The key takeaway is,” “The main takeaway is,” “The central takeaway is,” “The important takeaway is” (with optional intermediate “here”).
What-this-shows constructions: “What this means is,” “What this shows is,” “What this demonstrates is,” “What this illustrates is.”

Triggers when: ≥2 matches.

ANSWER_SCAFFOLDING

Answer Scaffolding

Family: meta Severity: medium

Detects helper-style response framing.

Opening cues (matched at the very start of the text): “Certainly!” “Sure!” “Absolutely!” “Of course!” “Okay, here…” “Okay, let’s…” “Okay, I’ll…” “Here’s…” “Here are…” “Below is.”
Internal cues (matched anywhere): “I’ll/I will consider:” “based on a few/several criteria,” “the following criteria/items/steps/points/components/elements,” “missing the following,” “to implement this,” “to help you decide,” “described in a few sentences each.”

The flag count is capped at 3 by design. When a single block contains several clustered internal cues, the cap prevents one helper-shaped paragraph from dominating the profile.

Note: “let me/let’s break this down” was previously listed here as an internal cue. In engine v1.3.x it is matched by ASSISTANT_PERSONA (it reads as chatbot voice rather than answer structure); the spec is now aligned with the implementation.

Triggers when: ≥1 match.

▪ Fragile pattern

TELEGRAPHED_REVEAL (added in engine v1.5.0)

Telegraphed Reveal

Family: meta Severity: medium

Detects label-as-frame constructions and pre-digestion prefaces that tell the reader what to think before showing the facts.

Label-as-frame (matched at sentence start, requires the colon): “The distinction:” “The takeaway:” “The truth:” “The reality:” “The contradiction:” “The insight:” “The catch:” “The key point/insight/takeaway:” “The important thing:” “The interesting part:”
Pre-digestion prefaces (matched anywhere): “Here’s the thing,” “Here’s what no one’s saying,” “Here’s what nobody tells you,” “Here’s what matters,” “Here’s what I think/see,” “Here’s what’s interesting/telling/revealing,” “this matters because,” “the contradiction is,” “the key insight/takeaway/point [here] is.”

Source: the anti-AI-writing working notes — present the facts and stop. Don’t preface them with meta-commentary; the reader can connect contradictions on their own, and pre-chewing patronizes.

Triggers when: ≥1 match.

▪ Fragile pattern

4.3 Structure Family

UNIFORM_RHYTHM

Metronomic Rhythm

Family: structure Severity: high

Computes word counts per sentence, then counts uniform windows—consecutive runs of 4 sentences where the longest is within 3 words of the shortest. Also computes burstiness (sentence-length standard deviation ÷ mean). Text with >35% dialogue-like sentences (opening with a quotation mark) is partially exempted.

Triggers when: ≥2 uniform windows AND burstiness <0.35 AND dialogue-sentence ratio <0.35.

▪ Fragile pattern
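The window count and burstiness calculation can be sketched from a list of per-sentence word counts. This Python illustration assumes overlapping sliding windows and a population standard deviation; the spec does not state either, and the function name is hypothetical.

```python
import statistics

def uniform_rhythm(sentence_lengths: list[int], dialogue_ratio: float = 0.0):
    """Return (uniform_windows, burstiness, triggered)."""
    if not sentence_lengths:
        return 0, 0.0, False
    # A window is 4 consecutive sentences whose longest and shortest
    # differ by at most 3 words. Overlapping windows are assumed.
    windows = 0
    for i in range(len(sentence_lengths) - 3):
        chunk = sentence_lengths[i:i + 4]
        if max(chunk) - min(chunk) <= 3:
            windows += 1
    mean = statistics.fmean(sentence_lengths)
    # Burstiness = standard deviation / mean (population form assumed).
    burstiness = statistics.pstdev(sentence_lengths) / mean if mean else 0.0
    triggered = windows >= 2 and burstiness < 0.35 and dialogue_ratio < 0.35
    return windows, burstiness, triggered
```

For example, five sentences of 12, 13, 14, 12, and 13 words give two uniform windows and a burstiness near 0.06, so the sketch fires unless the dialogue ratio reaches 0.35.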

PARA_UNIFORMITY

Uniform Paragraph Length

Family: structure Severity: low

Counts sentences per paragraph; computes mean and standard deviation across all paragraphs. Requires ≥4 paragraphs to activate.

Triggers when: ≥4 paragraphs AND (std dev ÷ mean) <0.20 AND mean sentence-count between 2.5 and 5.0.

▪ Fragile pattern

SEGMENTED_EXPOSITORY_BLOCKS

Segmented Expository Blocks

Family: structure Severity: low

Detects neatly chunked exposition: multiple medium-length paragraphs, no dialogue, no academic citation tail (APA-style inline citations or a References section).

Triggers when: ≥3 paragraphs AND average paragraph word count 55–220 AND ≥3 paragraphs individually in the 55–220 word range AND no dialogue-opening paragraphs AND no citation tail.

▪ Fragile pattern

SECTION_LABEL_SCAFFOLDING

Section Label Scaffolding

Family: structure Severity: medium

Detects explicit structural labels left in the prose: Introduction:, Conclusion:, Body Paragraph 1:, Abstract:, Section 2:.

Triggers when: ≥1 match.

▪ Fragile pattern

OPENER_REPETITION

Sentence Opener Repetition

Family: structure Severity: medium

Extracts the first two words of each sentence and counts repetitions. A set of common structural openers (“it is,” “in the,” “we are,” “there are,” etc.) is treated as exempt from the ≥5 rule and requires ≥7 repetitions before triggering.

Triggers when: ≥5 sentences share a non-exempt opener pattern AND (≥2 distinct repeated openers OR top opener appears ≥5 times).

▪ Fragile pattern

PASSIVE_OVERUSE

Passive Voice Overuse

Family: structure Severity: low

Counts passive constructions: auxiliary verbs (is/are/was/were/been/being/gets/got) followed by a past participle.

Triggers when: ≥6 sentences in sample AND ≥4 passive constructions AND passive rate >35%.

▪ Fragile pattern

COLON_LIST

Colon-List Pattern

Family: structure Severity: low

Detects the enumeration pattern after a colon, including: “X: A, B, and C,” “X: A, B, or C,” “X: A and B,” and equivalent two-or-more-item forms. Both and and or connectors are accepted.

Triggers when: ≥2 instances.

▪ Fragile pattern

OUTLINE_LIST_FORMAT

Outline/List Response Format

Family: structure Severity: low

Counts numbered list markers (1.) and unordered list markers (* or -).

Triggers when: ≥3 numbered markers, OR ≥4 bullet markers, OR ≥2 numbered and ≥2 bullet markers.

▪ Fragile pattern
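The marker-counting rule can be sketched with two line-anchored patterns. This is a Python illustration; the exact regexes (markers at line start, followed by whitespace) are assumptions rather than the engine's patterns.

```python
import re

# Markers are assumed to sit at the start of a line, after optional indentation.
_NUMBERED = re.compile(r"^[ \t]*\d+\.[ \t]+", re.MULTILINE)
_BULLET = re.compile(r"^[ \t]*[*-][ \t]+", re.MULTILINE)

def outline_list_format(text: str) -> bool:
    """Apply the >=3 numbered, >=4 bullet, or >=2 of each rule."""
    numbered = len(_NUMBERED.findall(text))
    bullets = len(_BULLET.findall(text))
    return numbered >= 3 or bullets >= 4 or (numbered >= 2 and bullets >= 2)
```

Anchoring at line start keeps hyphens inside prose ("well-known") from being counted as bullets.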

LABELED_LIST_FORMAT

Labeled List Formatting

Family: structure Severity: low

Detects list items that begin with a bolded mini-heading (**Term**) or a title-case label (Term:).

Triggers when: ≥2 labeled list items.

▪ Fragile pattern

RULE_OF_THREE

Compulsive Triads

Family: structure Severity: low

Detects three-item comma-separated enumerations of the form A, B, and C.

Each enumeration item is capped at 1–2 words by design. The cap suppresses false positives from longer comma sequences that aren’t true triads — without it the detector fires on most descriptive lists; with it, only tight rhetorical triads (“identifiable, mechanical, and surgically correctable”) qualify. Matching is case-sensitive on and — see §4.7 Implementation notes.

Triggers when: ≥4 triads AND density >3 per 1000 words.

▪ Fragile pattern
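The 1–2 word cap and the case-sensitive connector can be illustrated with a small regex. This Python sketch is an approximation, not the engine's pattern: the item character class, the optional serial comma, and the function name are assumptions.

```python
import re

# Each item is one or two words (the cap described above).
_ITEM = r"[A-Za-z'-]+(?: [A-Za-z'-]+)?"
# Case-sensitive on "and": no IGNORECASE flag.
_TRIAD = re.compile(rf"\b{_ITEM}, {_ITEM},? and {_ITEM}\b")

def rule_of_three(text: str) -> tuple[int, bool]:
    """Return (triad count, triggered) for the >=4 triads AND >3 per 1000 rule."""
    triads = len(_TRIAD.findall(text))
    words = len(text.split())  # whitespace word count is an assumption
    density = triads / words * 1000 if words else 0.0
    return triads, triads >= 4 and density > 3
```

A list whose middle item runs to three or more words ("the old man, his younger brother, and their very tall cousin") does not match, which is the false-positive suppression the cap is meant to provide.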

SUBORDINATE_REPETITION

Subordinate Clause Repetition

Family: structure Severity: low

Detects sentences that open with subordinate clause starters: while, although, despite, even though, given that, considering that, whereas, notwithstanding.

Triggers when: ≥4 subordinate-opening sentences AND density >4 per 1000 words.

▪ Fragile pattern

4.4 Rhetoric Family

PIVOT_CRUTCH

Pivot Crutch

Family: rhetoric Severity: medium

Detects rhetorical inversion templates across three structural forms:

Single-sentence form: “it’s not just [X] but [Y],” “it’s not only [X] but [Y],” “it’s not merely [X] but [Y],” “it’s not about [X] but [Y],” “isn’t just about,” “this isn’t just about.” The first variant accepts a phrase of 1–50 characters between the modifier and but.
Bare-pivot form (added in engine v1.4.1): “This/It isn’t [X]. It’s/It is/This is [Y]” and “This/It is not [X]. It’s/It is/This is [Y]” — the same rhetorical inversion split across two sentences instead of joined by but. The first-sentence body is bounded to 1–80 non-period characters.
Escalation form (added in engine v1.5.0): “[X] doesn’t just [Y], it [Z]” / “[X] doesn’t only [Y], they [Z]” / “[X] don’t merely [Y], that [Z]” — abstract-subject inversion where the second clause uses an it/they/that/this pronoun referring back to the abstract noun. Manufactures stakes through a fake “wait, there’s more” reveal. The regex requires the second-clause pronoun to be it/they/that/this; legitimate parallel constructions about specific people (“she doesn’t just walk, she runs”) use he/she/we and intentionally don’t fire.

Triggers when: ≥1 match.

Intentional cross-family overlap. Three single-sentence-form patterns also appear in BANNED_PHRASES (lexical family). The same surface phrase contributes to both family counts by design.
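The pronoun gate in the escalation form can be sketched as follows. This is a Python illustration and not the engine's regex: the 1–80 character bound on the first clause and the apostrophe handling are assumptions.

```python
import re

# "[X] doesn't just [Y], it [Z]": the second clause must open with
# it/they/that/this. Openers such as he/she/we are deliberately left out.
_ESCALATION = re.compile(
    r"\b(?:doesn|don)['’]t (?:just|only|merely) [^,.!?]{1,80}, "
    r"(?:it|they|that|this)\b",
    re.IGNORECASE,
)

def has_escalation_pivot(text: str) -> bool:
    """True if the text contains at least one escalation-form pivot."""
    return _ESCALATION.search(text) is not None
```

With this sketch, "The update doesn't just fix bugs, it rewrites the rules." matches, while "She doesn't just walk, she runs." does not.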

WEASEL_ATTRIBUTION

Vague Weasel Attributions

Family: rhetoric Severity: medium

Detects unsourced expert and study attributions across nine pattern classes, including: “experts say/agree/believe/note/suggest/warn/recommend/point out,” “many/some/several/numerous experts/researchers/scholars/analysts/observers/commentators say/agree/believe/argue/suggest,” “researchers have found/note/suggest/argue/believe,” “studies show/suggest/indicate/demonstrate/reveal,” “some/many argue/believe/say/contend/maintain that,” “observers note/say/point out,” “critics argue/say/contend/maintain,” “according to experts/researchers/analysts,” “it is widely/generally/commonly believed/accepted/known/recognized/understood/acknowledged.”

Triggers when: ≥2 matches.

FALSE_BALANCE

Both-Sides-ism

Family: rhetoric Severity: medium

Detects reflexive balance framing across five pattern classes, including: “on one hand / on the other hand,” “while some/many/certain/numerous argue/believe/contend/maintain,” “(but/yet/however,) others argue/believe/suggest/counter/disagree,” “there are (valid) arguments/points/merits on both sides / for and against,” “proponents/supporters... while critics/opponents/detractors/skeptics.”

Triggers when: ≥2 matches.

MANUFACTURED_DRAMA (added in engine v1.5.0)

Manufactured Drama

Family: rhetoric Severity: medium

Detects discovery/burial framing verbs that stage a routine fact as a dramatic reveal, including: “buried in the changelog / docs / fine print / footnotes / terms / announcement / post / release notes / spec,” “quietly slipped / added / launched / released / introduced / removed / deprecated / shipped / landed / rolled out / pushed / dropped / published” (same with “silently”), “snuck in / into / past,” “hidden in the changelog / docs / fine print / footnotes / details / spec / release notes,” “under the radar,” “while no one was watching / looking.”

Source: the anti-AI-writing working notes — when content is already interesting, framing it as a discovery story signals that the writer doesn’t trust the material. The reader sees the stagecraft and discounts the underlying claim.

Triggers when: ≥1 match.

▪ Fragile pattern

4.5 Persona Family

ASSISTANT_PERSONA

Assistant Persona Leakage

Family: persona Severity: high

Detects chatbot voice artifacts, including: “great question,” “that’s a great question,” “good/excellent question,” “glad you asked,” “I’d/I would be happy to,” “I’m/I am happy to help,” “let me/let’s break this/it down,” “let’s unpack/explore this,” “hope this helps,” “feel free to ask/reach out/contact/let me know,” “don’t hesitate to,” “I hope this/that helps/answers/clarifies,” “let me explain/walk you through,” “here’s a/here’s what I [breakdown/summary/overview].”

Note: “let me break this/it down” was previously listed under ANSWER_SCAFFOLDING internal cues in v1.0. The pattern reads as chatbot-voice (this detector) rather than answer-structure; v2.0 aligns the spec with the engine, which has always matched it here.

Triggers when: ≥1 match.

● Direct leak pattern — highly specific to unedited AI output, bypasses isolation dampening

AI_DISCLAIMER

AI Self-Disclosure

Family: persona Severity: high

Detects AI identity language, including: “as an AI / as an AI language model,” “as a language model,” “as of my [last/latest] [knowledge/training] update/cutoff/data,” “I cannot/can’t/don’t access/have access to real-time/current/live [information],” “my training data,” “my knowledge cutoff,” “I’m/I am [just/only] [an] AI/artificial intelligence/language model/chatbot/virtual assistant,” “I was trained.”

Triggers when: ≥1 match.

● Direct leak pattern

CREDIBILITY_THEATER (added in engine v1.5.0)

Credibility Theater

Family: persona Severity: high

Detects permission-asking framings and credibility-asserting labels — phrases that announce the writing as honest, direct, or candid rather than demonstrating those properties through the content itself.

Pattern classes include: “let me be direct / honest / frank / candid / straight / clear,” “let me tell you,” “I’ll be straight / honest / direct / candid / real / frank / clear (with you),” “to be honest / candid / frank / direct / clear” (when followed by punctuation), “if I’m being honest / direct / candid / real / frank / clear,” “real talk,” “no-nonsense,” “the honest truth / answer / assessment / version,” “the real story / deal / truth / talk,” “let’s be / let’s get real / honest / direct / frank / clear.”

Source: the anti-AI-writing working notes — “Was the article prior to that dishonest? Why did you feel the need to say the next part is honest?” Real honest writing doesn’t announce itself; the reader infers it from substance. The label only advertises the gap.

Triggers when: ≥1 match.

● Direct leak pattern — bypasses isolation dampening alongside ASSISTANT_PERSONA and AI_DISCLAIMER

SELF_CONGRATULATION (added in engine v1.6.3)

Self-Congratulation

Family: persona Severity: medium

Flags praise pointed at the product or its maker instead of value delivered to the reader — sycophancy redirected. Lexical tells: naming a value as a Principle (“the transparency principle,” “our commitment to…”), superiority-by-strawman (“just another [X],” “unlike most,” “not your typical”), self-applied virtue adjectives (“an honest account,” “a candid assessment”), and deserve/owe framing (“readers deserve,” “we owe it to”). A we/you sentence-opener skew (text leading with “we/our” far more than “you”) is also tracked, but only counts when a lexical tell is present — first-person memoirs are legitimately we-heavy, so the skew alone does not fire.

Triggers when: ≥2 lexical tells, OR ≥1 tell AND ≥4 “we/our” sentence openers with a we/(we+you) ratio ≥0.7. Calibrated to zero false positives on the 168-document human-prose corpus.

Distinct from CREDIBILITY_THEATER, which flags the writer asserting their own honesty to the reader (“let me be honest”); this detector flags praise of the work itself.
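The two-branch trigger for SELF_CONGRATULATION reduces to a short predicate once the counts are known. A minimal sketch in Python; the function and parameter names are hypothetical, and extraction of tells and sentence openers is out of scope here.

```python
def self_congratulation_triggers(tells: int, we_openers: int, you_openers: int) -> bool:
    """Apply the trigger rule to pre-computed counts.

    tells: lexical tells found
    we_openers / you_openers: sentences opening with "we/our" or "you"
    """
    if tells >= 2:
        return True
    total = we_openers + you_openers
    ratio = we_openers / total if total else 0.0
    # The opener skew only counts when at least one lexical tell is present.
    return tells >= 1 and we_openers >= 4 and ratio >= 0.7
```

A we-heavy memoir with no lexical tells returns False regardless of its ratio, matching the exemption described above.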

4.6 Style Family

COPULA_AVOIDANCE

Fancy Verb Substitution

Family: style Severity: low

Detects substitution of basic copulas with over-elevated alternatives where “is” or “are” would be natural:

serves as, stands as, acts as, functions as, operates as, works as, doubles as, remains as

Whitespace between the verb and as is matched with \s+, so line-wrapped occurrences (“serves\nas”) trigger correctly.

Triggers when: ≥3 matches.

▪ Fragile pattern
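The whitespace-tolerant match can be sketched as below. This is a Python illustration; case-insensitive matching is assumed from §4.7, and the function name is hypothetical.

```python
import re

_COPULA = re.compile(
    r"\b(?:serves|stands|acts|functions|operates|works|doubles|remains)\s+as\b",
    re.IGNORECASE,
)

def copula_avoidance(text: str) -> tuple[int, bool]:
    """Return (match count, triggered) for the >=3 matches rule."""
    hits = len(_COPULA.findall(text))
    return hits, hits >= 3
```

Because `\s+` matches newlines as well as spaces, "serves\nas" is counted the same way as "serves as".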

4.7 Implementation notes

A few engine details that affect how detectors fire in practice:

Case-sensitive matching for sentence-position-sensitive patterns. Three detectors use case-sensitive regex (OVER_SIGNPOST, TRANSITION_OVERUSE, and the and connector inside RULE_OF_THREE). These patterns are most reliably AI-typical when they appear at sentence start with conventional capitalization. Lowercase sentence-internal occurrences of the same words are usually grammatical (e.g. “...and finally...” mid-clause) rather than scaffolding markers. Other lexical detectors use case-insensitive matching (/gi).
Underscore-prefixed flag properties. The engine attaches engine-internal evidence to flags using an underscore prefix (e.g. _phrases, _assistantHits, _disclaimerHits, _pivots, _sectionLabels). These properties power the revision engine’s location resolution and are not part of the public spec contract. External integrations should treat them as opaque and not depend on their shape.
Engine-internal helpers shared by multiple detectors. Sentence and paragraph splitting are shared utilities — every sentence-based detector reads the same splitter output, so a fix to the splitter (such as the v1.3.6 abbreviation-aware split) propagates uniformly across the inventory.
English-only inventory. All 33 detectors target English-language surface patterns. The diagnostic does not reject non-English input, but it will not produce a meaningful profile for it. Word-count behavior for non-whitespace-separated languages (CJK) is unreliable, and such input may fail the 15-word minimum even on substantial paragraphs.

5. Calibration Harness

The detector thresholds in §4 are calibrated against a published corpus of human and AI samples. The harness lives at site/tests/diagnostic-calibration.mjs. Outputs land in shared/diagnostic/calibration/: the assembled corpus (diagnostic-calibration-corpus.json), per-item analysis results (diagnostic-calibration-results.json), and a human-readable report (diagnostic-calibration-report.md) that backs /method/calibration/.

5.1 Corpus assembly

The corpus is assembled from public sources at first run and cached locally afterward.

Human samples are drawn from:

Project Gutenberg literature — classic novels (Pride and Prejudice, Jane Eyre, Moby-Dick, The Adventures of Sherlock Holmes, and Tolstoy / Dostoevsky / Verne titles). 3 passages each, sampled from the body of each work to skip front matter.
Wikinews articles — modern journalistic prose, pulled via the Wikinews API.
HuggingFace Reddit TIFU dataset — modern long-form internet writing.

AI samples are drawn from HuggingFace datasets covering the three frontier model families:

ShareGPT-GPT4 (shibing624/sharegpt_gpt4) — GPT-4 conversational outputs.
Gryphe Claude-3.5-Sonnet SlimOrca (ChaoticNeutrals/Gryphe-Claude_Sonnet-3.5-SlimOrca-140k-ShareGPT) — Claude 3.5 Sonnet outputs.
PJMixers-Dev WildChat Gemini-2.0-Flash (PJMixers-Dev/allenai_WildChat-1M-gemini-2.0-flash-exp-ShareGPT) — Gemini 2.0 Flash outputs.

All sources are version-tagged (CORPUS_VERSION = "2026-04-05" at the time of this spec). Refreshing the corpus is a deliberate version bump, not silent drift.

5.2 Quality filters

Every sample passes through layered filters before entering the corpus:

English long-form check — minimum 90 words, ASCII letter ratio ≥85% (filters CJK and translated archives that didn’t decode cleanly), common-English stopword ratio ≥8% (catches OCR garbage and machine-translation residue).
Lead-matter stripping — Gutenberg-style paragraphs are filtered to drop chapter / book / part / essay / section headings, roman-numeral-only fragments (^[IVXLCDM0-9 .-]{1,20}$), bracketed metadata (e.g. [Illustration]), and front-matter prose before the first content paragraph.
Length-windowed trimming — every sample is trimmed to a contiguous paragraph window in the 140–420 word range. Samples too short to form a window get a fallback token-cap at 420 words; samples too long are truncated to the first window.
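The English long-form check can be approximated in a few lines. This is a Python sketch under stated assumptions, not the harness code: the stopword list is an illustrative subset, the ASCII letter ratio is taken over non-whitespace characters, and the function name is hypothetical.

```python
# Illustrative subset only. The harness's actual stopword list is not given here.
STOPWORDS = {"the", "of", "and", "to", "a", "in", "is", "it", "that", "was"}

def passes_english_longform(text: str) -> bool:
    """Apply the 90-word, 85% ASCII-letter, and 8% stopword thresholds."""
    words = text.split()
    if len(words) < 90:
        return False
    visible = [c for c in text if not c.isspace()]
    ascii_letters = sum(1 for c in visible if c.isascii() and c.isalpha())
    # Denominator (non-whitespace characters) is an assumption.
    if ascii_letters / len(visible) < 0.85:
        return False
    stop_hits = sum(1 for w in words if w.lower().strip(".,;:!?\"'") in STOPWORDS)
    return stop_hits / len(words) >= 0.08
```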

The filters are deliberately conservative — they prefer to exclude marginal samples over admitting them. The v1.3.4 audit (Pass 2 finding fr009) confirmed the harness is structurally sound.

5.3 Deterministic hashing and per-group stratification

The corpus is split into train and holdout sets in a way that is both stratified (every source group is represented in both sets in proportion to its size) and deterministic (the same corpus produces the same split, so calibration runs are reproducible across machines).

The split uses an FNV-1a 32-bit hash (stableHash, with FNV-1a’s standard 2166136261 offset and 16777619 prime) keyed on groupKey:sampleId. Within each group (e.g. human:literature, human:wikinews, human:reddit-tifu, ai:gpt, ai:claude, ai:gemini), samples are ranked by their hash and the top fraction is assigned to holdout. The choice is deterministic without depending on Math.random or system clocks.
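The hash and ranking step can be sketched as below. This is a Python illustration, not the harness's stableHash: hashing UTF-8 bytes, ranking highest hash first, and breaking ties by sample ID are assumptions, and rank_group is a hypothetical name.

```python
FNV_OFFSET = 2166136261  # FNV-1a 32-bit offset basis
FNV_PRIME = 16777619     # FNV-1a 32-bit prime

def stable_hash(key: str) -> int:
    """FNV-1a 32-bit: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET
    for byte in key.encode("utf-8"):
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the value in 32 bits
    return h

def rank_group(group_key: str, sample_ids: list[str]) -> list[str]:
    """Order a group's samples by hash of 'groupKey:sampleId', highest first."""
    return sorted(
        sample_ids,
        key=lambda sid: (stable_hash(f"{group_key}:{sid}"), sid),
        reverse=True,
    )
```

The ranking depends only on the keys, so shuffling the input list or running on another machine produces the same order.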

5.4 Holdout allocation

HOLDOUT_RATIO = 0.25 (25%). The per-group rule:

Groups of fewer than 3 samples → 0 holdout (group goes entirely to train; too small to split meaningfully).
Groups of exactly 3 → 1 holdout (the largest holdout that still leaves ≥2 train samples).
Groups of 4 or more → round(N × 0.25) holdout, clamped so the train set retains at least 2 samples.

This guarantees that every group of three or more samples contributes to both sides of the split, preventing one source from dominating either train or holdout.
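The per-group rule can be written as one small function. A sketch in Python; round-half-up is assumed here (as JavaScript's Math.round behaves for positive values), since Python's built-in round() rounds halves to even and would give different counts for some group sizes. The function name is hypothetical.

```python
import math

HOLDOUT_RATIO = 0.25

def holdout_count(group_size: int, ratio: float = HOLDOUT_RATIO) -> int:
    """Number of samples a group sends to holdout."""
    if group_size < 3:
        return 0  # too small to split
    if group_size == 3:
        return 1
    rounded = math.floor(group_size * ratio + 0.5)  # round half up (assumed)
    return min(rounded, group_size - 2)             # keep at least 2 in train
```

For example, a group of 10 gives floor(2.5 + 0.5) = 3 holdout samples under round-half-up.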

5.5 Outputs

The harness produces three artifacts under shared/diagnostic/calibration/:

diagnostic-calibration-corpus.json — the assembled, filtered, version-tagged corpus.
diagnostic-calibration-results.json — per-item diagnostic results (flags, score, metrics) plus aggregated metrics at the historical legacy threshold of 45, the current production threshold (MEDIUM_SLOP_THRESHOLD = 8), and a train-tuned recommended threshold.
diagnostic-calibration-report.md — human-readable summary used to render /method/calibration/: holdout size and per-group allocation, threshold-by-threshold precision / recall / F1 / human-false-positive rate, and the top AI false negatives by source.

Note on threshold artifacts: per the v2.0 no-score decision, the user-facing UI no longer surfaces a score, so the “production threshold” line in the calibration report is a historical and engineering reference rather than a customer-facing classification line. Calibration still uses thresholded metrics for tuning the detector inventory; the diagnostic UI still ships the underlying detectors.

5.6 Network dependencies and cold-start cost

The first harness run fetches all source data over HTTP from Project Gutenberg, the Wikinews API, and the HuggingFace dataset CDN (Reddit TIFU and the three AI ShareGPT-style datasets). Subsequent runs read from the cached shared/diagnostic/calibration/diagnostic-calibration-corpus.json and incur no network calls.

For continuous integration, the cached corpus is the production artifact. CI does not re-bootstrap from upstream. Refreshing the corpus to a new CORPUS_VERSION requires a manual re-bootstrap on a development machine and a fresh commit of the cached artifacts (a known limitation per audit fr008). If upstream URLs change, the corpus has to be re-bootstrapped before calibration can run; this is rare but has happened with the HuggingFace dataset CDN.

6. Output Format

A diagnostic result consists of:

{
  "version": "1.4.0",
  "slopScore": 0,
  "metrics": {
    "wordCount": 612,
    "sentenceCount": 41,
    "avgSentenceLength": 14.9,
    "sentenceLengthStdDev": 4.2,
    "burstiness": 0.283
  },
  "flags": [
    {
      "patternId": "BANNED_PHRASES",
      "label": "AI-Typical Phrases",
      "severity": "high",
      "count": 3,
      "detail": "Found templates: \"in today's rapidly evolving landscape\", ...",
      "detectorNote": "Template phrases are a common signal in detector lexical models."
    }
  ],
  "disclaimer": "This diagnostic identifies structural and lexical patterns associated with AI-typical writing. It does not determine authorship."
}

6.1 Fields

version — the reference implementation (engine) version, not the spec version.
slopScore — deprecated as of spec v2.0 (engine v1.4.0). The 0–100 score is no longer surfaced in the WROITER UI. The field remains in the JSON output during the deprecation window for backward compatibility and will be removed in a future major release. See Why we removed the score for the rationale.
metrics — text-level summary statistics: wordCount, sentenceCount, avgSentenceLength, sentenceLengthStdDev, and burstiness (sentence-length standard deviation ÷ mean).
flags — array of triggered detectors, sorted by severity (high → medium → low). Each flag carries patternId, label, severity, count, detail (human-readable evidence grounded in the source text), and detectorNote (diagnostic relevance of the pattern). Implementations may attach engine-internal properties to flags using an underscore prefix (e.g. _phrases, _assistantHits); these are not part of the public spec contract and should not be relied on by external integrations.
disclaimer — boilerplate text reminding consumers that the diagnostic does not establish authorship.
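
The `metrics` and `flags` fields can be sketched as follows. This is an illustration, not the reference implementation. It assumes a naive punctuation-based sentence splitter (the engine uses its own shared splitter utilities), whitespace word counting, and population standard deviation. Both function names are hypothetical.

```python
import re
import statistics

SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def text_metrics(text):
    """Compute the summary statistics listed under `metrics`."""
    # Assumption: sentences end at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    word_count = sum(lengths)
    mean = word_count / len(lengths) if lengths else 0.0
    # Assumption: population (not sample) standard deviation.
    std_dev = statistics.pstdev(lengths) if lengths else 0.0
    return {
        "wordCount": word_count,
        "sentenceCount": len(lengths),
        "avgSentenceLength": mean,
        "sentenceLengthStdDev": std_dev,
        # Burstiness is standard deviation divided by mean.
        "burstiness": std_dev / mean if mean else 0.0,
    }


def public_flags(flags):
    """Drop underscore-prefixed engine internals and sort high to low."""
    cleaned = [
        {k: v for k, v in flag.items() if not k.startswith("_")}
        for flag in flags
    ]
    return sorted(cleaned, key=lambda f: SEVERITY_ORDER[f["severity"]])
```

An external integration that passes flags through something like `public_flags` only ever sees the documented properties, so it stays unaffected if engine-internal fields change.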

6.2 Profile assembly

The user-facing profile is derived from the flags array. The reference implementation groups flags by signal family, sums their count values, and renders a per-family breakdown plus a per-flag drilldown. The diagnostic block format used by downstream tools (the revision engine and any future MCP / browser-extension consumers) is { instances_total, by_family, flags_count } — a count of facts, not a derived metric.
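
A minimal sketch of the grouping step follows. It assumes `by_family` is a mapping from family name to summed count and that the family for each detector comes from a lookup table. The `FAMILY_BY_PATTERN` table and the `"unknown"` fallback are hypothetical, and the table lists only the detectors whose family is named in the version notes of this document.

```python
from collections import defaultdict

# Partial, illustrative lookup (families as stated in the version notes).
FAMILY_BY_PATTERN = {
    "ABSOLUTISM_OVERUSE": "lexical",
    "SELF_CONGRATULATION": "persona",
    "CREDIBILITY_THEATER": "persona",
    "TELEGRAPHED_REVEAL": "meta",
    "MANUFACTURED_DRAMA": "rhetoric",
}


def diagnostic_block(flags, family_by_pattern=FAMILY_BY_PATTERN):
    """Assemble { instances_total, by_family, flags_count } from flags."""
    by_family = defaultdict(int)
    for flag in flags:
        # Assumption: unmapped detectors fall into an "unknown" bucket.
        family = family_by_pattern.get(flag["patternId"], "unknown")
        by_family[family] += flag["count"]
    return {
        "instances_total": sum(by_family.values()),
        "by_family": dict(by_family),
        "flags_count": len(flags),
    }
```

Every value in the block is a plain count, so two consumers that receive the same flags array will always derive the same block.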

7. Known Failure Modes

7.1 False Positives — Human Text Incorrectly Flagged

Academic and institutional prose — constrained vocabulary, formal structure, and low rhythm variation overlap with detector signals. Risk is highest for UNIFORM_RHYTHM, BANNED_WORDS, TRANSITION_OVERUSE, and PASSIVE_OVERUSE.
Second-language writing — non-native writers tend toward safe, common phrasings that share surface features with AI output.
Heavily edited text — multiple rounds of editing flatten stylistic variation and raise the rhythm and vocabulary signal density above the baseline of the original draft.
Canonical and historical texts — older formal registers occasionally match detector patterns by coincidence. Documented instances: False Positive Hall of Fame.
Short samples (<100 words) — a single triggered pattern can dominate the profile disproportionately. Density-based detectors lack the sample to fire confidently. Short-sample findings should be treated as preliminary.

7.2 False Negatives — AI Text Not Flagged

Selectively edited AI drafts — targeted editing of the specific patterns tracked here suppresses flags without changing underlying authorship.
Style-transfer prompting — models prompted to write in a specific human voice suppress many surface patterns this method detects.
Hybrid authorship — human-outlined, AI-drafted text (or vice versa) may not trigger enough patterns to surface a meaningful profile.

7.3 Genre-Specific Unreliability

Genre	FP risk	Most affected detectors
Legal prose	High	`UNIFORM_RHYTHM`, `PASSIVE_OVERUSE`, `BANNED_WORDS`
Product copy	High	`BANNED_PHRASES`, `OVER_SIGNPOST`
Academic abstracts	High	`UNIFORM_RHYTHM`, `TRANSITION_OVERUSE`, `PASSIVE_OVERUSE`
Personal essays	Low	All detectors most reliable here
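
A consumer that knows the genre of a sample could use the table above to mark which triggered detectors deserve extra caution. The sketch below is one hypothetical way to do that. The `fp_caveats` helper is not part of the specification, and the genre labels are simply the row names from the table.

```python
# Most-affected detectors per high-risk genre, copied from the table above.
HIGH_FP_DETECTORS = {
    "Legal prose": {"UNIFORM_RHYTHM", "PASSIVE_OVERUSE", "BANNED_WORDS"},
    "Product copy": {"BANNED_PHRASES", "OVER_SIGNPOST"},
    "Academic abstracts": {
        "UNIFORM_RHYTHM",
        "TRANSITION_OVERUSE",
        "PASSIVE_OVERUSE",
    },
    # Personal essays: low FP risk, so no detectors are singled out.
    "Personal essays": set(),
}


def fp_caveats(genre, flags):
    """Return patternIds of triggered flags with elevated FP risk in a genre."""
    risky = HIGH_FP_DETECTORS.get(genre, set())
    return [f["patternId"] for f in flags if f["patternId"] in risky]
```

The helper only annotates findings and does not suppress them, which keeps the profile itself unchanged.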

8. What the Profile Does Not Establish

The profile does not identify the author of a text.
A high instance count does not prove AI generation.
A zero or low count does not prove human authorship.
The diagnostic should not be used as the sole basis for disciplinary action, public accusation, or irreversible decisions affecting individuals.

For safe review policy guidance, see Limitations and False Positives.

9. Versioning

The specification version is independent of the reference implementation version. The current reference implementation is engine v1.6.3.

When detection logic, signal thresholds, the detector inventory, or the output schema change materially, the specification version increments. Previous version documents remain accessible at their archive URLs.

Spec version	Date	Implementation	Notes
2.2 (current)	2026-06-02	1.6.3	Two detectors added: `ABSOLUTISM_OVERUSE` (lexical — repetition of a single universal quantifier) and `SELF_CONGRATULATION` (persona — praise aimed at the product/maker, with a we/you subject-ratio gated behind a lexical tell). Detector inventory header count updated 31 → 33. Confidence for `HEDGING` and `EMPTY_INTENSIFIERS` made instance-aware (medium/low) rather than a per-detector constant. Implementation-version column reconciled — it had lagged at 1.5.0 while the engine shipped 1.6.0–1.6.3.
2.1	2026-05-28	1.5.0	Three new detectors added from the anti-AI-writing working notes: `CREDIBILITY_THEATER` (persona / direct leak — "let me be direct", "real talk", "the honest truth"), `TELEGRAPHED_REVEAL` (meta — "The takeaway:", "Here’s the thing", "this matters because"), `MANUFACTURED_DRAMA` (rhetoric — "buried in the changelog", "quietly slipped", "snuck in"). `PIVOT_CRUTCH` extended with the escalation form ("[X] doesn’t just Y, it Z"). Detector inventory header count updated 28 → 31.
2.0	2026-05-27	1.4.0	Doc renamed Slop Score Specification → Pattern Profile Specification. Scoring algorithm and interpretation bands removed (score no longer surfaced in the UI; JSON field deprecated). Detector inventory header count corrected (22 → 28). Definitions section reframed around the profile rather than the score. Detector entries expanded from illustrative example sets to enumerative coverage (HEDGING, META_INTRO, META_OUTRO, ESSAY_THESIS_ANNOUNCEMENT, MICRO_SUMMARY, ANSWER_SCAFFOLDING, COLON_LIST, PIVOT_CRUTCH, WEASEL_ATTRIBUTION, FALSE_BALANCE, ASSISTANT_PERSONA, AI_DISCLAIMER); engine-internal constraints documented (RULE_OF_THREE 1–2 word cap, COPULA_AVOIDANCE line-wrap matching, ANSWER_SCAFFOLDING count cap). New §4.7 Implementation notes documenting case-sensitivity intent, underscore-prefixed flag properties, shared splitter utilities, and English-only scope. New §5 Calibration Harness documenting the calibration corpus, deterministic hashed stratification, holdout allocation, output artifacts, and network dependencies.
1.0	2026-04-13	1.3.4	Initial publication. 22 detectors, 6 signal families, scoring algorithm with 0–100 output. Archived.

This specification is published under CC BY 4.0. Reference implementation copyright WROITER / 3AM Energy.