1. Introduction
This document specifies the WROITER Pattern Profile, a heuristic method for detecting surface-level patterns associated with AI-generated text. The specification defines the input requirements, detector inventory, output format, and known failure modes for version 2.1 of the method.
The diagnostic does not determine authorship. It measures overlap between a text sample and a documented set of structural, lexical, rhythmic, and rhetorical patterns that appear disproportionately in large-language-model output. It surfaces findings — what was detected, where, how often — and stops short of compressing them into a single verdict number. See Why we removed the score for the rationale.
WROITER is the reference implementation of this specification. The specification is published under CC BY 4.0—anyone may implement, fork, or build on this method with attribution.
2. Definitions
| Term | Definition |
|---|---|
| Sample | The text submitted for analysis (minimum 50 words). |
| Flag | A single triggered detector, with a severity level, a count, and a human-readable detail string. |
| Profile | The diagnostic’s user-facing output: per-detector flags, grouped by signal family, with counts and locations. The profile is the headline; there is no derived score. |
| Signal family | A category grouping detectors by the type of evidence they collect (six families: lexical, meta, structure, rhetoric, persona, style). |
| Severity | One of high, medium, or low, assigned per detector based on signal specificity. |
| Fragile pattern | A detector whose signal is unreliable in isolation. Surfaced as a marker on the flag; consumers (UI, MCP, API) decide how to weight isolated fragile-only findings. |
| Direct leak pattern | A detector whose signal is highly specific to AI output (assistant-persona phrasing, AI self-disclosure). Surfaced as a marker on the flag. |
3. Input Requirements
- Type: plain text string
- Minimum length: 50 words
- No preprocessing required; the diagnostic handles sentence splitting and normalization internally
- Optimal sample length for stable results: 150 words or more
- Pattern coverage is less reliable for samples under 100 words — a single triggered detector dominates the profile, and density-based detectors lack the sample to fire confidently
- The detector inventory targets English-language surface patterns. Non-English input is not rejected, but most detectors will not fire meaningfully; word-count for non-whitespace-separated languages (CJK) is unreliable
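The length thresholds above can be read as tiers. The following is a minimal sketch, assuming words are whitespace-separated tokens; the tier names and the `classifySample` helper are hypothetical and not part of the spec.

```javascript
// Thresholds taken from the input requirements above; tier names are hypothetical.
const MIN_WORDS = 50;
const LOW_COVERAGE_BELOW = 100;
const STABLE_FROM = 150;

// Assumption: a word is a whitespace-separated token. This undercounts
// CJK text, which is the unreliability the requirements call out.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function classifySample(text) {
  const words = countWords(text);
  if (words < MIN_WORDS) return { words, tier: "rejected" };
  if (words < LOW_COVERAGE_BELOW) return { words, tier: "preliminary" };
  if (words < STABLE_FROM) return { words, tier: "usable" };
  return { words, tier: "stable" };
}
```

A consumer could use the tier to decide whether to show a short-sample caveat next to the profile.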
4. Detector Inventory
Thirty-one detectors are defined across six signal families. Each entry specifies: detector ID, label, signal family, detection logic, activation threshold, and severity. Fragile patterns are marked; their presence in isolation is a weaker signal than presence alongside corroborating detectors. Direct leak patterns are marked; their presence is highly specific to AI output regardless of co-occurrence.
4.1 Lexical Family
Checks for the presence of 38 vocabulary items disproportionately common in AI output:
A subset of 12 terms is designated soft-banned (reduced signal weight, requiring higher density thresholds): landscape, vibrant, foster, navigate, crucial, leverage, utilize, robust, seamless, comprehensive, myriad, plethora.
Checks for 30 stock phrase patterns, including: “in today’s rapidly evolving landscape,” “rapidly evolving landscape,” “it’s important to note,” “it is important to note,” “it’s worth noting,” “in the realm of,” “plays a key/crucial/vital/pivotal role,” “commitment to excellence,” “at the forefront,” “paves the way,” “a testament to,” “let’s dive/delve/explore/dig,” “at the end of the day,” “the bottom line is,” “it goes without saying,” “needless to say,” “stands as a,” “serves as a,” “whether you’re [X] or [Y],” “the good news is,” “here’s the thing,” “the reality is,” “there are [N] key/main/critical/important reasons/ways/factors/steps/benefits/strategies/tips/things.”
Intentional cross-family overlap. Three rhetorical-pivot patterns (“not just [X] but,” “isn’t just about,” “it’s not just”) also live in PIVOT_CRUTCH (rhetoric family). The same surface phrase contributes to both family counts; the corroboration is by design.
Detects epistemic hedge constructions across 13 pattern groups, including: “it could be argued,” “one might say/argue/suggest,” “it seems that/like,” “it appears that/to,” “arguably,” “it’s possible/plausible/conceivable that,” “to some extent,” “in some ways,” “it’s worth considering/mentioning/pointing out,” “this suggests that,” “this could/might/may indicate/suggest/imply.”
Detects overuse of adverbial intensifiers that add emphasis without meaning:
Counts formal transition words:
4.2 Meta Family
Scans the first paragraph (or first 300 characters) for openings that announce the text’s own intent rather than making a concrete point. Article types include article, post, guide, piece, blog, essay. Action verbs include explore, discuss, examine, look at, dive into, delve into, cover. Patterns include: “In this [article/post/guide/piece/blog/essay] ...,” “We’ll/We will explore/discuss/examine/cover...,” “This [article/post/guide] will explore/discuss/examine/cover...,” “Let’s/Let us dive/delve/explore/dig in/into/deep.”
Allows up to 40 characters of preamble before the templated phrase, so “Hi there! In this article we will explore...” still triggers but “Well anyway, after a long preamble, in this article...” does not.
Scans the final third of the text for stock summary markers, including: “In conclusion,” “In summary,” “In closing,” “To sum up,” “To summarize,” “To conclude,” “To wrap up,” “The key takeaway is,” “The main takeaway is,” “The key point is,” “The main point is.”
Detects the school-essay template: an explicit Introduction: heading at the start of the opening block paired with a thesis-announcement sentence. Document types accepted: essay, paper, article, discussion, analysis. Announcement verbs accepted: will, aims to, seeks to, examines, explores, discusses, analyzes, compares, considers, evaluates. Plus standalone phrasings: “the purpose of this [essay/paper/article],” “this paper argues,” “this essay argues.”
Counts explicit sequence markers: First/Firstly, Second/Secondly, Third/Thirdly, Finally, Additionally, Furthermore, Moreover.
Matching is case-sensitive — see §4.7 Implementation notes.
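As an illustration of the case-sensitivity rule, here is a minimal sketch that counts the sequence markers listed above with a regex that omits the case-insensitive flag. The function name is hypothetical, and the engine's real pattern may differ (for example, it may also anchor on sentence position).

```javascript
// Marker list copied from the sequence-marker entry above.
// No "i" flag: only the capitalized forms count.
const SIGNPOST_RE =
  /\b(?:Firstly|First|Secondly|Second|Thirdly|Third|Finally|Additionally|Furthermore|Moreover)\b/g;

function countSignposts(text) {
  const matches = text.match(SIGNPOST_RE);
  return matches ? matches.length : 0;
}
```

With this pattern, "First, we plan. Second, we build." contributes two hits, while a lowercase mid-clause "and finally" contributes none.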
Detects mid-text restatement markers across three pattern classes:
- Standalone openers (matched with trailing comma): “Overall,” “Taken together,” “All in all,” “In essence,” “At its core,” “Put simply,” “Simply put.”
- Takeaway constructions: “The key takeaway is,” “The main takeaway is,” “The central takeaway is,” “The important takeaway is” (with optional intermediate “here”).
- What-this-shows constructions: “What this means is,” “What this shows is,” “What this demonstrates is,” “What this illustrates is.”
Detects helper-style response framing.
- Opening cues (matched at the very start of the text): “Certainly!” “Sure!” “Absolutely!” “Of course!” “Okay, here…” “Okay, let’s…” “Okay, I’ll…” “Here’s…” “Here are…” “Below is.”
- Internal cues (matched anywhere): “I’ll/I will consider:” “based on a few/several criteria,” “the following criteria/items/steps/points/components/elements,” “missing the following,” “to implement this,” “to help you decide,” “described in a few sentences each.”
The flag count is capped at 3 by design. When a single block contains several clustered internal cues, the cap prevents one helper-shaped paragraph from dominating the profile.
Note: “let me/let’s break this down” was previously listed here as an internal cue. In engine v1.3.x it is matched by ASSISTANT_PERSONA (it reads as chatbot voice rather than answer structure); the spec is now aligned with the implementation.
Detects label-as-frame constructions and pre-digestion prefaces that tell the reader what to think before showing the facts.
- Label-as-frame (matched at sentence start, requires the colon): “The distinction:” “The takeaway:” “The truth:” “The reality:” “The contradiction:” “The insight:” “The catch:” “The key point/insight/takeaway:” “The important thing:” “The interesting part:”
- Pre-digestion prefaces (matched anywhere): “Here’s the thing,” “Here’s what no one’s saying,” “Here’s what nobody tells you,” “Here’s what matters,” “Here’s what I think/see,” “Here’s what’s interesting/telling/revealing,” “this matters because,” “the contradiction is,” “the key insight/takeaway/point [here] is.”
Source: the anti-AI-writing working notes — present the facts and stop. Don’t preface them with meta-commentary; the reader can connect contradictions on their own, and pre-chewing patronizes.
4.3 Structure Family
Computes word counts per sentence, then counts uniform windows—consecutive runs of 4 sentences where the longest is within 3 words of the shortest. Also computes burstiness (sentence-length standard deviation ÷ mean). Text with >35% dialogue-like sentences (opening with a quotation mark) is partially exempted.
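The two rhythm measurements described above can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: the naive sentence splitter stands in for the engine's abbreviation-aware one, windows are assumed to slide one sentence at a time, and the standard deviation is assumed to be the population form. The dialogue exemption is not modeled.

```javascript
// Assumption: sentences end at ., !, or ? followed by whitespace.
function sentenceLengths(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map((s) => s.split(/\s+/).length);
}

// Counts sliding windows of `size` consecutive sentences whose longest and
// shortest members differ by at most `tolerance` words.
function countUniformWindows(lengths, size = 4, tolerance = 3) {
  let windows = 0;
  for (let i = 0; i + size <= lengths.length; i++) {
    const slice = lengths.slice(i, i + size);
    if (Math.max(...slice) - Math.min(...slice) <= tolerance) windows++;
  }
  return windows;
}

// Burstiness as defined above: standard deviation divided by mean.
function burstiness(lengths) {
  if (lengths.length === 0) return 0;
  const mean = lengths.reduce((a, b) => a + b, 0) / lengths.length;
  if (mean === 0) return 0;
  const variance =
    lengths.reduce((a, b) => a + (b - mean) ** 2, 0) / lengths.length;
  return Math.sqrt(variance) / mean;
}
```

Low burstiness combined with several uniform windows is the shape this detector looks for; either number alone says less.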
Counts sentences per paragraph; computes mean and standard deviation across all paragraphs. Requires ≥4 paragraphs to activate.
Detects neatly chunked exposition: multiple medium-length paragraphs, no dialogue, no academic citation tail (APA-style inline citations or a References section).
Detects explicit structural labels left in the prose: Introduction:, Conclusion:, Body Paragraph 1:, Abstract:, Section 2:.
Extracts the first two words of each sentence and counts repetitions. A set of common structural openers (“it is,” “in the,” “we are,” “there are,” etc.) is exempted and requires ≥7 repetitions rather than ≥5 before triggering.
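The two-tier threshold can be sketched as below. The exempted-opener set is a hypothetical subset built from the examples above, and lowercasing plus punctuation stripping before counting is an assumption about normalization.

```javascript
// Hypothetical subset of the exempted structural openers named above.
const COMMON_OPENERS = new Set(["it is", "in the", "we are", "there are"]);

// Takes pre-split sentences and returns the openers that reach their
// threshold: 5 repetitions normally, 7 for exempted common openers.
function repeatedOpeners(sentences) {
  const counts = new Map();
  for (const sentence of sentences) {
    const words = sentence.trim().toLowerCase().split(/\s+/).filter(Boolean);
    if (words.length < 2) continue;
    // Assumption: punctuation is stripped before openers are compared.
    const opener = `${words[0]} ${words[1]}`.replace(/[^\p{L}\p{N}' ]/gu, "");
    counts.set(opener, (counts.get(opener) || 0) + 1);
  }
  const hits = [];
  for (const [opener, count] of counts) {
    const threshold = COMMON_OPENERS.has(opener) ? 7 : 5;
    if (count >= threshold) hits.push({ opener, count });
  }
  return hits;
}
```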
Counts passive constructions: auxiliary verbs (is/are/was/were/been/being/gets/got) followed by a past participle.
Detects the enumeration pattern after a colon, including: “X: A, B, and C,” “X: A, B, or C,” “X: A and B,” and equivalent two-or-more-item forms. Both “and” and “or” connectors are accepted.
Counts numbered list markers (1.) and unordered list markers (* or -).
Detects list items that begin with a bolded mini-heading (**Term**) or a title-case label (Term:).
Detects three-item comma-separated enumerations of the form A, B, and C.
Each enumeration item is capped at 1–2 words by design. The cap suppresses false positives from longer comma sequences that aren’t true triads: without it the detector fires on most descriptive lists; with it, only tight rhetorical triads (“identifiable, mechanical, and surgically correctable”) qualify. Matching is case-sensitive on the “and” connector; see §4.7 Implementation notes.
Detects sentences that open with subordinate clause starters: while, although, despite, even though, given that, considering that, whereas, notwithstanding.
4.4 Rhetoric Family
Detects rhetorical inversion templates across three structural forms:
- Single-sentence form: “it’s not just [X] but [Y],” “it’s not only [X] but [Y],” “it’s not merely [X] but [Y],” “it’s not about [X] but [Y],” “isn’t just about,” “this isn’t just about.” The first variant accepts a phrase of 1–50 characters between the modifier and but.
- Bare-pivot form (added in engine v1.4.1): “This/It isn’t [X]. It’s/It is/This is [Y]” and “This/It is not [X]. It’s/It is/This is [Y]” — the same rhetorical inversion split across two sentences instead of joined by but. The first-sentence body is bounded to 1–80 non-period characters.
- Escalation form (added in engine v1.5.0): “[X] doesn’t just [Y], it [Z]” / “[X] doesn’t only [Y], they [Z]” / “[X] don’t merely [Y], that [Z]” — abstract-subject inversion where the second clause uses an it/they/that/this pronoun referring back to the abstract noun. Manufactures stakes through a fake “wait, there’s more” reveal. The regex requires the second-clause pronoun to be it/they/that/this; legitimate parallel constructions about specific people (“she doesn’t just walk, she runs”) use he/she/we and intentionally don’t fire.
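The pronoun restriction in the escalation form can be illustrated with a small regex. This is a sketch, not the engine's pattern: the 1–80 character bound on the first clause and the acceptance of both straight and curly apostrophes are assumptions.

```javascript
// Second-clause pronoun must be it/they/that/this, as described above.
// Assumption: the first clause is 1-80 characters with no comma or period.
const ESCALATION_RE =
  /\b(?:doesn|don)['’]t\s+(?:just|only|merely)\s+[^,.]{1,80},\s+(?:it|they|that|this)\b/i;

function hasEscalationPivot(sentence) {
  return ESCALATION_RE.test(sentence);
}
```

Because the alternation excludes he/she/we, a sentence such as “she doesn’t just walk, she runs” does not match, which mirrors the intent stated in the escalation-form entry.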
Intentional cross-family overlap. Three single-sentence-form patterns also appear in BANNED_PHRASES (lexical family). The same surface phrase contributes to both family counts by design.
Detects unsourced expert and study attributions across nine pattern classes, including: “experts say/agree/believe/note/suggest/warn/recommend/point out,” “many/some/several/numerous experts/researchers/scholars/analysts/observers/commentators say/agree/believe/argue/suggest,” “researchers have found/note/suggest/argue/believe,” “studies show/suggest/indicate/demonstrate/reveal,” “some/many argue/believe/say/contend/maintain that,” “observers note/say/point out,” “critics argue/say/contend/maintain,” “according to experts/researchers/analysts,” “it is widely/generally/commonly believed/accepted/known/recognized/understood/acknowledged.”
Detects reflexive balance framing across five pattern classes, including: “on one hand / on the other hand,” “while some/many/certain/numerous argue/believe/contend/maintain,” “(but/yet/however,) others argue/believe/suggest/counter/disagree,” “there are (valid) arguments/points/merits on both sides / for and against,” “proponents/supporters... while critics/opponents/detractors/skeptics.”
Detects discovery/burial framing verbs that stage a routine fact as a dramatic reveal, including: “buried in the changelog / docs / fine print / footnotes / terms / announcement / post / release notes / spec,” “quietly slipped / added / launched / released / introduced / removed / deprecated / shipped / landed / rolled out / pushed / dropped / published” (same with “silently”), “snuck in / into / past,” “hidden in the changelog / docs / fine print / footnotes / details / spec / release notes,” “under the radar,” “while no one was watching / looking.”
Source: the anti-AI-writing working notes — when content is already interesting, framing it as a discovery story signals that the writer doesn’t trust the material. The reader sees the stagecraft and discounts the underlying claim.
4.5 Persona Family
Detects chatbot voice artifacts, including: “great question,” “that’s a great question,” “good/excellent question,” “glad you asked,” “I’d/I would be happy to,” “I’m/I am happy to help,” “let me/let’s break this/it down,” “let’s unpack/explore this,” “hope this helps,” “feel free to ask/reach out/contact/let me know,” “don’t hesitate to,” “I hope this/that helps/answers/clarifies,” “let me explain/walk you through,” “here’s a/here’s what I [breakdown/summary/overview].”
Note: “let me break this/it down” was previously listed under ANSWER_SCAFFOLDING internal cues in v1.0. The pattern reads as chatbot-voice (this detector) rather than answer-structure; v2.0 aligns the spec with the engine, which has always matched it here.
Detects AI identity language, including: “as an AI / as an AI language model,” “as a language model,” “as of my [last/latest] [knowledge/training] update/cutoff/data,” “I cannot/can’t/don’t access/have access to real-time/current/live [information],” “my training data,” “my knowledge cutoff,” “I’m/I am [just/only] [an] AI/artificial intelligence/language model/chatbot/virtual assistant,” “I was trained.”
Detects permission-asking framings and credibility-asserting labels — phrases that announce the writing as honest, direct, or candid rather than demonstrating those properties through the content itself.
Pattern classes, including: “let me be direct / honest / frank / candid / straight / clear,” “let me tell you,” “I’ll be straight / honest / direct / candid / real / frank / clear (with you),” “to be honest / candid / frank / direct / clear” (when followed by punctuation), “if I’m being honest / direct / candid / real / frank / clear,” “real talk,” “no-nonsense,” “the honest truth / answer / assessment / version,” “the real story / deal / truth / talk,” “let’s be / let’s get real / honest / direct / frank / clear.”
Source: the anti-AI-writing working notes — “Was the article prior to that dishonest? Why did you feel the need to say the next part is honest?” Real honest writing doesn’t announce itself; the reader infers it from substance. The label only advertises the gap.
ASSISTANT_PERSONA and AI_DISCLAIMER
4.6 Style Family
Detects substitution of basic copulas with over-elevated alternatives where “is” or “are” would be natural:
Whitespace between the verb and “as” is matched with \s+, so line-wrapped occurrences (“serves\nas”) trigger correctly.
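A short sketch of why \s+ matters for wrapped text. The verb list is a hypothetical subset: “serves” comes from the example above and “stands” from the “stands as a” stock phrase in the lexical family; the full verb list is not given in this document.

```javascript
// \s+ matches spaces, tabs, and newlines, so "serves\nas" still matches.
const COPULA_RE = /\b(?:serves|stands)\s+as\b/gi;

function countCopulaAvoidance(text) {
  return (text.match(COPULA_RE) || []).length;
}
```

A literal single space in the pattern would miss any occurrence that a hard line wrap splits across two lines.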
4.7 Implementation notes
A few engine details that affect how detectors fire in practice:
- Case-sensitive matching for sentence-position-sensitive patterns. Three detectors use case-sensitive regex (OVER_SIGNPOST, TRANSITION_OVERUSE, and the “and” connector inside RULE_OF_THREE). These patterns are most reliably AI-typical when they appear at sentence start with conventional capitalization. Lowercase sentence-internal occurrences of the same words are usually grammatical (e.g. “...and finally...” mid-clause) rather than scaffolding markers. Other lexical detectors use case-insensitive matching (/gi).
- Underscore-prefixed flag properties. The engine attaches engine-internal evidence to flags using an underscore prefix (e.g. _phrases, _assistantHits, _disclaimerHits, _pivots, _sectionLabels). These properties power the revision engine’s location resolution and are not part of the public spec contract. External integrations should treat them as opaque and not depend on their shape.
- Engine-internal helpers shared by multiple detectors. Sentence and paragraph splitting are shared utilities: every sentence-based detector reads the same splitter output, so a fix to the splitter (such as the v1.3.6 abbreviation-aware split) propagates uniformly across the inventory.
- English-only inventory. All 31 detectors target English-language surface patterns. The diagnostic does not reject non-English input, but it will not produce a meaningful profile for it. Word-count behavior for non-whitespace-separated languages (CJK) is unreliable and may fail the 50-word minimum even on substantial paragraphs.
5. Calibration Harness
The detector thresholds in §4 are calibrated against a published corpus of human and AI samples. The harness lives at site/tests/diagnostic-calibration.mjs. Outputs land in shared/diagnostic/calibration/: the assembled corpus (diagnostic-calibration-corpus.json), per-item analysis results (diagnostic-calibration-results.json), and a human-readable report (diagnostic-calibration-report.md) that backs /method/calibration/.
5.1 Corpus assembly
The corpus is assembled from public sources at first run and cached locally afterward.
Human samples are drawn from:
- Project Gutenberg literature — classic novels (Pride and Prejudice, Jane Eyre, Moby-Dick, The Adventures of Sherlock Holmes, and Tolstoy / Dostoevsky / Verne titles). 3 passages each, sampled from the body of each work to skip front matter.
- Wikinews articles — modern journalistic prose, pulled via the Wikinews API.
- HuggingFace Reddit TIFU dataset — modern long-form internet writing.
AI samples are drawn from HuggingFace datasets covering the three frontier model families:
- ShareGPT-GPT4 (shibing624/sharegpt_gpt4): GPT-4 conversational outputs.
- Gryphe Claude-3.5-Sonnet SlimOrca (ChaoticNeutrals/Gryphe-Claude_Sonnet-3.5-SlimOrca-140k-ShareGPT): Claude 3.5 Sonnet outputs.
- PJMixers-Dev WildChat Gemini-2.0-Flash (PJMixers-Dev/allenai_WildChat-1M-gemini-2.0-flash-exp-ShareGPT): Gemini 2.0 Flash outputs.
All sources are version-tagged (CORPUS_VERSION = "2026-04-05" at the time of this spec). Refreshing the corpus is a deliberate version bump, not silent drift.
5.2 Quality filters
Every sample passes through layered filters before entering the corpus:
- English long-form check: minimum 90 words, ASCII letter ratio ≥85% (filters CJK and translated archives that didn’t decode cleanly), common-English stopword ratio ≥8% (catches OCR garbage and machine-translation residue).
- Lead-matter stripping: Gutenberg-style paragraphs are filtered to drop chapter / book / part / essay / section headings, roman-numeral-only fragments (^[IVXLCDM0-9 .-]{1,20}$), bracketed metadata (e.g. [Illustration]), and front-matter prose before the first content paragraph.
- Length-windowed trimming: every sample is trimmed to a contiguous paragraph window in the 140–420 word range. Samples too short to form a window get a fallback token-cap at 420 words; samples too long are truncated to the first window.
The filters are deliberately conservative — they prefer to exclude marginal samples over admitting them. The v1.3.4 audit (Pass 2 finding fr009) confirmed the harness is structurally sound.
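The English long-form check and the numeral-fragment filter can be sketched as below. Several details are assumptions made for illustration: the stopword list is a hypothetical subset, the ASCII ratio is taken over non-whitespace characters, and punctuation is stripped from tokens before the stopword lookup. Only the three thresholds and the fragment regex come from the filter list above.

```javascript
// Hypothetical stopword subset; the harness's actual list is not given here.
const STOPWORDS = new Set([
  "the", "of", "and", "to", "a", "in", "is", "it", "that", "was", "for", "on", "with", "as",
]);

function passesEnglishLongForm(text) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length < 90) return false;

  // Assumption: ratio of ASCII letters to all non-whitespace characters.
  const nonSpace = text.replace(/\s/g, "");
  const asciiLetters = (nonSpace.match(/[A-Za-z]/g) || []).length;
  if (asciiLetters / nonSpace.length < 0.85) return false;

  // Assumption: tokens are stripped to a-z before the stopword lookup.
  const stopwordHits = words.filter((w) =>
    STOPWORDS.has(w.replace(/[^a-z]/g, ""))
  ).length;
  return stopwordHits / words.length >= 0.08;
}

// Regex copied from the lead-matter stripping rule above.
const NUMERAL_FRAGMENT_RE = /^[IVXLCDM0-9 .-]{1,20}$/;

function isNumeralFragment(paragraph) {
  return NUMERAL_FRAGMENT_RE.test(paragraph.trim());
}
```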
5.3 Deterministic hashing and per-group stratification
The corpus is split into train and holdout sets in a way that is both stratified (every source group is represented in both sets in proportion to its size) and deterministic (the same corpus produces the same split, so calibration runs are reproducible across machines).
The split uses an FNV-1a 32-bit hash (stableHash, with FNV-1a’s standard 2166136261 offset and 16777619 prime) keyed on groupKey:sampleId. Within each group (e.g. human:literature, human:wikinews, human:reddit-tifu, ai:gpt, ai:claude, ai:gemini), samples are ranked by their hash and the top fraction is assigned to holdout. The choice is deterministic without depending on Math.random or system clocks.
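For readers unfamiliar with FNV-1a, the hash and the ranking step can be sketched as follows. The constants are the standard ones named above; hashing the UTF-8 bytes of the key, reading “top” as highest hash first, and the `splitGroup` helper itself are assumptions made for this illustration.

```javascript
// FNV-1a, 32-bit. Assumption: the key is hashed as UTF-8 bytes.
function stableHash(key) {
  let hash = 2166136261; // offset basis
  for (const byte of new TextEncoder().encode(key)) {
    hash ^= byte;
    hash = Math.imul(hash, 16777619); // FNV prime, 32-bit wraparound multiply
  }
  return hash >>> 0; // reinterpret as unsigned 32-bit
}

// Ranks one group's sample IDs by hash and assigns the first
// `holdoutCount` to holdout. Hypothetical helper.
// Assumption: highest hash ranks first; ties break on the ID.
function splitGroup(groupKey, sampleIds, holdoutCount) {
  const ranked = [...sampleIds].sort((a, b) => {
    const diff = stableHash(`${groupKey}:${b}`) - stableHash(`${groupKey}:${a}`);
    if (diff !== 0) return diff;
    return a < b ? -1 : a > b ? 1 : 0;
  });
  return {
    holdout: ranked.slice(0, holdoutCount),
    train: ranked.slice(holdoutCount),
  };
}
```

Because the ranking depends only on the key strings, the input order of `sampleIds` does not affect which samples land in holdout.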
5.4 Holdout allocation
HOLDOUT_RATIO = 0.25 (25%). The per-group rule:
- Groups of fewer than 3 samples → 0 holdout (group goes entirely to train; too small to split meaningfully).
- Groups of exactly 3 → 1 holdout (the minimum that leaves ≥2 train samples).
- Groups of 4 or more → round(N × 0.25) holdout, clamped so the train set retains at least 2 samples.
This guarantees that every group of three or more samples contributes to both sides of the split, preventing one source from dominating either train or holdout.
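The per-group rule translates into a small function. This is a sketch: `MIN_TRAIN` is a hypothetical name for the two-sample floor, and reading round() as JavaScript's Math.round (which rounds .5 upward) is an assumption.

```javascript
const HOLDOUT_RATIO = 0.25;
const MIN_TRAIN = 2; // hypothetical name for the train-set floor

function holdoutCount(groupSize) {
  if (groupSize < 3) return 0; // too small to split
  if (groupSize === 3) return 1; // leaves exactly 2 train samples
  // Assumption: round() is Math.round, so 2.5 rounds to 3.
  return Math.min(
    Math.round(groupSize * HOLDOUT_RATIO),
    groupSize - MIN_TRAIN
  );
}
```

For example, a group of 4 yields 1 holdout sample, a group of 10 yields 3 under Math.round, and a group of 100 yields 25.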
5.5 Outputs
The harness produces three artifacts under shared/diagnostic/calibration/:
- diagnostic-calibration-corpus.json: the assembled, filtered, version-tagged corpus.
- diagnostic-calibration-results.json: per-item diagnostic results (flags, score, metrics) plus aggregated metrics at the historical legacy threshold of 45, the current production threshold (MEDIUM_SLOP_THRESHOLD = 8), and a train-tuned recommended threshold.
- diagnostic-calibration-report.md: human-readable summary used to render /method/calibration/, covering holdout size and per-group allocation, threshold-by-threshold precision / recall / F1 / human-false-positive rate, and the top AI false negatives by source.
Note on threshold artifacts: per the v2.0 no-score decision, the user-facing UI no longer surfaces a score, so the “production threshold” line in the calibration report is a historical and engineering reference rather than a customer-facing classification line. Calibration still uses thresholded metrics for tuning the detector inventory; the diagnostic UI still ships the underlying detectors.
5.6 Network dependencies and cold-start cost
The first harness run fetches all source data over HTTP from Project Gutenberg, the Wikinews API, and the HuggingFace dataset CDN (Reddit TIFU and the three AI ShareGPT-style datasets). Subsequent runs read from the cached shared/diagnostic/calibration/diagnostic-calibration-corpus.json and incur no network calls.
For continuous integration, the cached corpus is the production artifact. CI does not re-bootstrap from upstream. Refreshing the corpus to a new CORPUS_VERSION requires a manual re-bootstrap on a development machine and a fresh commit of the cached artifacts (a known limitation per audit fr008). If upstream URLs change, the corpus has to be re-bootstrapped before calibration can run; this is rare but not unprecedented in the HuggingFace dataset CDN.
6. Output Format
A diagnostic result consists of:
6.1 Fields
- version: the reference implementation (engine) version, not the spec version.
- slopScore: deprecated as of spec v2.0 (engine v1.4.0). The 0–100 score is no longer surfaced in the WROITER UI. The field remains in the JSON output during the deprecation window for backward compatibility and will be removed in a future major release. See Why we removed the score for the rationale.
- metrics: text-level summary statistics: wordCount, sentenceCount, avgSentenceLength, sentenceLengthStdDev, and burstiness (sentence-length standard deviation ÷ mean).
- flags: array of triggered detectors, sorted by severity (high → medium → low). Each flag carries patternId, label, severity, count, detail (human-readable evidence grounded in the source text), and detectorNote (diagnostic relevance of the pattern). Implementations may attach engine-internal properties to flags using an underscore prefix (e.g. _phrases, _assistantHits); these are not part of the public spec contract and should not be relied on by external integrations.
- disclaimer: boilerplate text reminding consumers that the diagnostic does not establish authorship.
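The severity ordering of the flags array can be sketched as a comparator. The `sortFlagsBySeverity` helper is hypothetical; only the high → medium → low order comes from the field description.

```javascript
const SEVERITY_RANK = { high: 0, medium: 1, low: 2 };

// Returns a new array ordered high → medium → low.
// Array.prototype.sort is stable in current engines, so flags of equal
// severity keep their input order.
function sortFlagsBySeverity(flags) {
  return [...flags].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]
  );
}
```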
6.2 Profile assembly
The user-facing profile is derived from the flags array. The reference implementation groups flags by signal family, sums their count values, and renders a per-family breakdown plus a per-flag drilldown. The diagnostic block format used by downstream tools (the revision engine and any future MCP / browser-extension consumers) is { instances_total, by_family, flags_count } — a count of facts, not a derived metric.
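A minimal sketch of assembling the diagnostic block from a flags array. The documented flag fields do not include a family, so the lookup table here is a hypothetical stand-in (its five entries use family assignments stated elsewhere in this document), and treating by_family as a family-to-count map is an assumption about the block's shape.

```javascript
// Hypothetical lookup from patternId to signal family.
const FAMILY_BY_PATTERN = {
  BANNED_PHRASES: "lexical",
  TELEGRAPHED_REVEAL: "meta",
  PIVOT_CRUTCH: "rhetoric",
  MANUFACTURED_DRAMA: "rhetoric",
  CREDIBILITY_THEATER: "persona",
};

function buildDiagnosticBlock(flags) {
  const by_family = {};
  let instances_total = 0;
  for (const flag of flags) {
    // Unmapped detectors fall into an "unknown" bucket in this sketch.
    const family = FAMILY_BY_PATTERN[flag.patternId] || "unknown";
    by_family[family] = (by_family[family] || 0) + flag.count;
    instances_total += flag.count;
  }
  return { instances_total, by_family, flags_count: flags.length };
}
```

The result only counts what was detected; nothing in it is weighted or normalized, which keeps it a count of facts and not a score.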
7. Known Failure Modes
7.1 False Positives — Human Text Incorrectly Flagged
- Academic and institutional prose: constrained vocabulary, formal structure, and low rhythm variation overlap with detector signals. Risk is highest for UNIFORM_RHYTHM, BANNED_WORDS, TRANSITION_OVERUSE, and PASSIVE_OVERUSE.
- Second-language writing: non-native writers tend toward safe, common phrasings that share surface features with AI output.
- Heavily edited text: multiple rounds of editing flatten stylistic variation and raise the rhythm and vocabulary signal density above the baseline of the original draft.
- Canonical and historical texts: older formal registers occasionally match detector patterns by coincidence. Documented instances: False Positive Hall of Fame.
- Short samples (<100 words): a single triggered pattern can dominate the profile disproportionately. Density-based detectors lack the sample to fire confidently. Short-sample findings should be treated as preliminary.
7.2 False Negatives — AI Text Not Flagged
- Selectively edited AI drafts — targeted editing of the specific patterns tracked here suppresses flags without changing underlying authorship.
- Style-transfer prompting — models prompted to write in a specific human voice suppress many surface patterns this method detects.
- Hybrid authorship — human-outlined, AI-drafted text (or vice versa) may not trigger enough patterns to surface a meaningful profile.
7.3 Genre-Specific Unreliability
| Genre | FP risk | Most affected detectors |
|---|---|---|
| Legal prose | High | UNIFORM_RHYTHM, PASSIVE_OVERUSE, BANNED_WORDS |
| Product copy | High | BANNED_PHRASES, OVER_SIGNPOST |
| Academic abstracts | High | UNIFORM_RHYTHM, TRANSITION_OVERUSE, PASSIVE_OVERUSE |
| Personal essays | Low | All detectors most reliable here |
8. What the Profile Does Not Establish
- The profile does not identify the author of a text.
- A high instance count does not prove AI generation.
- A zero or low count does not prove human authorship.
- The diagnostic should not be used as the sole basis for disciplinary action, public accusation, or irreversible decisions affecting individuals.
For safe review policy guidance, see Limitations and False Positives.
9. Versioning
The specification version is independent of the reference implementation version. The current reference implementation is engine v1.5.0.
When detection logic, signal thresholds, the detector inventory, or the output schema change materially, the specification version increments. Previous version documents remain accessible at their archive URLs.
| Spec version | Date | Implementation | Notes |
|---|---|---|---|
| 2.1 (current) | 2026-05-28 | 1.5.0 | Three new detectors added from the anti-AI-writing working notes: CREDIBILITY_THEATER (persona / direct leak — "let me be direct", "real talk", "the honest truth"), TELEGRAPHED_REVEAL (meta — "The takeaway:", "Here’s the thing", "this matters because"), MANUFACTURED_DRAMA (rhetoric — "buried in the changelog", "quietly slipped", "snuck in"). PIVOT_CRUTCH extended with the escalation form ("[X] doesn’t just Y, it Z"). Detector inventory header count updated 28 → 31. |
| 2.0 | 2026-05-27 | 1.4.0 | Doc renamed Slop Score Specification → Pattern Profile Specification. Scoring algorithm and interpretation bands removed (score no longer surfaced in the UI; JSON field deprecated). Detector inventory header count corrected (22 → 28). Definitions section reframed around the profile rather than the score. Detector entries expanded from illustrative example sets to enumerative coverage (HEDGING, META_INTRO, META_OUTRO, ESSAY_THESIS_ANNOUNCEMENT, MICRO_SUMMARY, ANSWER_SCAFFOLDING, COLON_LIST, PIVOT_CRUTCH, WEASEL_ATTRIBUTION, FALSE_BALANCE, ASSISTANT_PERSONA, AI_DISCLAIMER); engine-internal constraints documented (RULE_OF_THREE 1–2 word cap, COPULA_AVOIDANCE line-wrap matching, ANSWER_SCAFFOLDING count cap). New §4.7 Implementation notes documenting case-sensitivity intent, underscore-prefixed flag properties, shared splitter utilities, and English-only scope. New §5 Calibration Harness documenting the calibration corpus, deterministic hashed stratification, holdout allocation, output artifacts, and network dependencies. |
| 1.0 | 2026-04-13 | 1.3.4 | Initial publication. 22 detectors, 6 signal families, scoring algorithm with 0–100 output. Archived. |
This specification is published under CC BY 4.0. Reference implementation copyright WROITER / 3AM Energy.