# Slop Score Specification v1.0

**Status:** Published  
**Effective date:** 2026-04-13  
**Canonical URL:** https://wroiter.com/method/spec/  
**Maintained by:** WROITER / 3AM Energy  
**License:** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)  
**Reference implementation:** [wroiter.com/diagnostic/](https://wroiter.com/diagnostic/)

---

## 1. Introduction

This document specifies the Slop Diagnostic — a heuristic method for detecting surface-level patterns associated with AI-generated text. The specification defines the input requirements, detector inventory, scoring algorithm, output format, and known failure modes for version 1.0 of the method.

The diagnostic does not determine authorship. It measures overlap between a text sample and a documented set of structural, lexical, rhythmic, and rhetorical patterns that appear disproportionately in large-language-model output.

WROITER is the reference implementation of this specification. The specification is published under CC BY 4.0 — anyone may implement, fork, or build on this method with attribution.

---

## 2. Definitions

- **Sample** — the text submitted for analysis (minimum 50 words)
- **Flag** — a single triggered detector, with a severity level, a count, and a human-readable detail string
- **Score** — a normalized integer 0–100 representing aggregate pattern density
- **Signal family** — a category grouping detectors by the type of evidence they collect
- **Severity** — one of `high`, `medium`, or `low`, assigned per detector based on signal specificity
- **Fragile pattern** — a detector whose signal is unreliable in isolation; subject to score dampening when no corroborating detectors fire
- **Direct leak pattern** — a detector whose signal is highly specific to AI output and triggers a score bonus independent of co-occurrence

---

## 3. Input Requirements

- Type: plain text string
- Minimum length: 50 words
- No preprocessing required; the diagnostic handles sentence splitting and normalization internally
- Optimal sample length for stable results: 150 words or more
- Score reliability degrades for samples under 100 words
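
The length thresholds above can be sketched as a small pre-check. This is an illustration, not the reference implementation: the names `SampleCheck` and `check_sample` are hypothetical, and counting words with whitespace splitting is an assumption, since this section does not define tokenization.

```python
from dataclasses import dataclass

MIN_WORDS = 50        # hard minimum from this section
RELIABLE_WORDS = 100  # below this, score reliability degrades
OPTIMAL_WORDS = 150   # recommended length for stable results


@dataclass
class SampleCheck:
    word_count: int
    accepted: bool   # meets the 50-word minimum
    reliable: bool   # at least 100 words
    optimal: bool    # at least 150 words


def check_sample(text: str) -> SampleCheck:
    """Classify a sample by length. Whitespace word counting is an assumption."""
    n = len(text.split())
    return SampleCheck(n, n >= MIN_WORDS, n >= RELIABLE_WORDS, n >= OPTIMAL_WORDS)
```

A caller might reject samples where `accepted` is false and attach a caveat to results where `reliable` is false.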

---

## 4. Detector Inventory

Twenty-eight detectors are defined across six signal families. Each entry specifies: detector ID, label, signal family, detection logic, activation threshold, and severity.

### 4.1 Lexical Family

**`BANNED_WORDS` — AI-Scented Vocabulary** · severity: `medium`

Checks for the presence of 38 vocabulary items disproportionately common in AI output:

> delve, tapestry, landscape, multifaceted, pivotal, vibrant, foster, underscore, testament, intricate, groundbreaking, renowned, embark, navigate, realm, crucial, paramount, endeavor, holistic, synergy, leverage, utilize, robust, seamless, comprehensive, myriad, plethora, uncover, unveil, streamline, harness, empower, spearhead, bolster, catalyze, cornerstone, game-changer, cutting-edge

A subset of 12 terms is designated **soft-banned** (reduced signal weight): *landscape, vibrant, foster, navigate, crucial, leverage, utilize, robust, seamless, comprehensive, myriad, plethora.*

Triggers when: ≥2 non-soft-banned terms are present, OR ≥4 total terms with ≥1 non-soft-banned, OR ≥5 total terms at density >5 per 1000 words.

Fragile pattern.
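
The three trigger clauses can be read as a single boolean expression. The sketch below assumes that `hard` and `soft` are occurrence counts (not distinct-term counts) and that density is computed over all matches. Neither point is stated above, and the function name is hypothetical.

```python
def banned_words_triggers(hard: int, soft: int, word_count: int) -> bool:
    """Evaluate the three BANNED_WORDS trigger clauses.

    hard: matches among the 26 non-soft-banned terms
    soft: matches among the 12 soft-banned terms
    Assumption: counts are occurrences, and density uses total matches.
    """
    total = hard + soft
    density = total / word_count * 1000 if word_count else 0.0
    return (
        hard >= 2
        or (total >= 4 and hard >= 1)
        or (total >= 5 and density > 5)
    )
```

Under this reading, the third clause only matters for samples that contain soft-banned terms alone: any sample with ≥5 total terms and at least one non-soft-banned term already satisfies the second clause.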

---

**`BANNED_PHRASES` — AI-Typical Phrases** · severity: `high`

Checks for 27 stock phrase patterns, including:

> "in today's rapidly evolving landscape," "it's important to note," "it's worth noting," "plays a key/crucial/vital/pivotal role," "let's dive/delve/explore/dig," "at the end of the day," "it goes without saying," "whether you're [X] or [Y]," "there are [N] key reasons/ways/steps/benefits/strategies"

Triggers when: ≥1 phrase pattern matches.

---

**`HEDGING` — Hedging Language** · severity: `medium`

Detects epistemic hedge constructions: *"it could be argued," "one might say/argue/suggest," "it seems that," "arguably," "it's possible that," "to some extent," "this suggests that," "this could indicate."*

Triggers when: ≥2 matches.

Fragile pattern.

---

**`EMPTY_INTENSIFIERS` — Empty Intensifiers** · severity: `medium`

Detects overuse of adverbial intensifiers that add emphasis without meaning:

> incredibly, extremely, remarkably, exceptionally, undeniably, undoubtedly, absolutely, fundamentally, essentially, particularly, significantly, profoundly, tremendously, vastly

Triggers when: ≥3 matches AND density >3 per 1000 words.
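
A count-plus-density threshold like this one might be checked as follows. This is a minimal sketch assuming whole-word, case-insensitive matching and whitespace word counting. The function name is hypothetical, and the reference implementation may match differently.

```python
import re

INTENSIFIERS = [
    "incredibly", "extremely", "remarkably", "exceptionally", "undeniably",
    "undoubtedly", "absolutely", "fundamentally", "essentially", "particularly",
    "significantly", "profoundly", "tremendously", "vastly",
]
# Whole-word, case-insensitive match on any listed intensifier (an assumption).
PATTERN = re.compile(r"\b(?:" + "|".join(INTENSIFIERS) + r")\b", re.IGNORECASE)


def empty_intensifiers(text: str) -> tuple[int, float, bool]:
    """Return (matches, density per 1000 words, triggered)."""
    words = len(text.split())
    matches = len(PATTERN.findall(text))
    density = matches / words * 1000 if words else 0.0
    return matches, density, matches >= 3 and density > 3
```

The density condition means that three intensifiers trigger the detector in a 99-word sample but not in a sample of roughly 1,000 words.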

---

**`TRANSITION_OVERUSE` — Transition Word Overuse** · severity: `medium`

Counts formal transition words:

> However, Moreover, Nevertheless, Furthermore, Consequently, Additionally, Nonetheless, Therefore, Thus, Hence, Accordingly, Meanwhile, Subsequently, Alternatively, Conversely

Triggers when: ≥4 matches AND density >4 per 1000 words.

Fragile pattern.

---

### 4.2 Meta Family

**`META_INTRO` — Throat-Clearing Intro** · severity: `high`

Scans the first paragraph (or first 300 characters) for openings that announce the text's own intent rather than making a concrete point: *"In this article/guide/post," "We will explore/discuss/examine," "Let's dive in," "This guide will cover."*

Triggers when: pattern matches in the opening block.

---

**`META_OUTRO` — Formulaic Conclusion** · severity: `medium`

Scans the final third of the text for stock summary markers: *"In conclusion," "In summary," "To sum up," "To wrap up," "The key takeaway is."*

Triggers when: pattern matches in the final third.

---

**`ESSAY_THESIS_ANNOUNCEMENT` — Essay-Thesis Announcement** · severity: `high`

Detects the school-essay template: an explicit "Introduction:" heading paired with a thesis-announcement sentence (*"This essay will explore…," "The purpose of this paper is…," "This paper argues…"*).

Triggers when: both conditions are present in the opening block.

Fragile pattern.

---

**`OVER_SIGNPOST` — Over-Signposting** · severity: `medium`

Counts explicit sequence markers: *First/Firstly, Second/Secondly, Third/Thirdly, Finally, Additionally, Furthermore, Moreover.*

Triggers when: ≥3 matches.

---

**`MICRO_SUMMARY` — Compulsive Micro-Summaries** · severity: `medium`

Detects mid-text restatement markers: *"Overall," "Taken together," "In essence," "At its core," "Put simply," "Simply put," "The key takeaway here is," "What this shows is."*

Triggers when: ≥2 matches.

---

**`ANSWER_SCAFFOLDING` — Answer Scaffolding** · severity: `medium`

Detects helper-style response framing. Opening cues: *"Certainly!" "Sure!" "Absolutely!" "Here's a breakdown," "Below is."* Internal cues: *"let me break this down," "based on a few criteria," "the following criteria," "to help you decide."*

Triggers when: ≥1 match.

Fragile pattern.

---

### 4.3 Structure Family

**`UNIFORM_RHYTHM` — Metronomic Rhythm** · severity: `high`

Computes word counts per sentence, then counts **uniform windows**: runs of 4 consecutive sentences in which the longest is within 3 words of the shortest. Also computes burstiness (sentence-length standard deviation ÷ mean). Text in which 35% or more of the sentences are dialogue-like never triggers this detector.

Triggers when: ≥2 uniform windows AND burstiness <0.35 AND dialogue-sentence ratio <0.35.

Fragile pattern.
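
The window count and burstiness measure can be sketched as below. Several details are assumptions rather than part of this specification: sentences are split naively on terminal punctuation, windows slide one sentence at a time and may overlap, and the standard deviation is the population form. The dialogue ratio is taken as an input because dialogue detection is not defined here. All function names are hypothetical.

```python
import re
from statistics import mean, pstdev


def sentence_lengths(text: str) -> list[int]:
    """Word count per sentence, using a naive split on ., !, ? (an assumption)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def uniform_windows(lengths: list[int], size: int = 4, spread: int = 3) -> int:
    """Count sliding windows of `size` sentences whose max - min <= spread."""
    return sum(
        1
        for i in range(len(lengths) - size + 1)
        if max(lengths[i:i + size]) - min(lengths[i:i + size]) <= spread
    )


def burstiness(lengths: list[int]) -> float:
    """Population standard deviation of sentence length divided by the mean."""
    return pstdev(lengths) / mean(lengths) if lengths else 0.0


def uniform_rhythm(lengths: list[int], dialogue_ratio: float = 0.0) -> bool:
    """Apply the three-part UNIFORM_RHYTHM trigger."""
    return (
        uniform_windows(lengths) >= 2
        and burstiness(lengths) < 0.35
        and dialogue_ratio < 0.35
    )
```

With sentence lengths `[10, 11, 12, 10, 11]`, both four-sentence windows are uniform and burstiness is low, so the detector fires unless the dialogue ratio reaches 0.35.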

---

**`PARA_UNIFORMITY` — Uniform Paragraph Length** · severity: `low`

Counts sentences per paragraph; computes mean and standard deviation across all paragraphs.

Triggers when: ≥4 paragraphs AND (std dev ÷ mean) <0.20 AND mean sentence-count between 2.5 and 5.0.

Fragile pattern.
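
As a rough illustration, the trigger reduces to a coefficient-of-variation check over per-paragraph sentence counts. The sketch assumes a population standard deviation and treats the 2.5–5.0 range as inclusive. Both choices, and the function name, are assumptions.

```python
from statistics import mean, pstdev


def para_uniformity(sentences_per_paragraph: list[int]) -> bool:
    """Apply the PARA_UNIFORMITY trigger to per-paragraph sentence counts."""
    if len(sentences_per_paragraph) < 4:
        return False
    avg = mean(sentences_per_paragraph)
    # Coefficient of variation: standard deviation divided by the mean.
    cv = pstdev(sentences_per_paragraph) / avg if avg else 0.0
    return cv < 0.20 and 2.5 <= avg <= 5.0
```

Paragraphs of 3, 3, 4, and 3 sentences trigger the detector, while four paragraphs of 6 sentences each do not, because the mean falls outside the range.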

---

**`SEGMENTED_EXPOSITORY_BLOCKS` — Segmented Expository Blocks** · severity: `low`

Detects neatly chunked exposition: multiple medium-length paragraphs, no dialogue, no academic citation tail (APA-style inline citations or a References section).

Triggers when: ≥3 paragraphs AND average paragraph word count 55–220 AND ≥3 paragraphs individually in the 55–220 word range AND no dialogue-opening paragraphs AND no citation tail.

Fragile pattern.

---

**`SECTION_LABEL_SCAFFOLDING` — Section Label Scaffolding** · severity: `medium`

Detects explicit structural labels left in the prose: *Introduction:, Conclusion:, Body Paragraph 1:, Abstract:, Section 2:.*

Triggers when: ≥1 match.

Fragile pattern.

---

**`OPENER_REPETITION` — Sentence Opener Repetition** · severity: `medium`

Extracts the first two words of each sentence and counts repetitions. Common structural openers (e.g., *"it is," "in the," "we are," "there are"*) are held to a higher bar: such an opener counts as repeated only once it reaches ≥7 repetitions.

Triggers when: ≥5 sentences share a non-exempt opener pattern AND (≥2 distinct repeated openers OR top opener appears ≥5 times).

Fragile pattern.

---

**`PASSIVE_OVERUSE` — Passive Voice Overuse** · severity: `low`

Counts passive constructions: auxiliary verbs (is/are/was/were/been/being/gets/got) followed by a past participle.

Triggers when: ≥6 sentences in sample AND ≥4 passive constructions AND passive rate >35%.

Fragile pattern.

---

**`COLON_LIST` — Colon-List Pattern** · severity: `low`

Detects the enumeration pattern `X: A, B, and C.`

Triggers when: ≥2 instances.

Fragile pattern.

---

**`OUTLINE_LIST_FORMAT` — Outline/List Response Format** · severity: `low`

Counts numbered list markers (`1.`) and unordered list markers (`*` or `-`).

Triggers when: ≥3 numbered markers, OR ≥4 bullet markers, OR ≥2 numbered and ≥2 bullet markers.

Fragile pattern.
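
One way to count the markers is with line-anchored regular expressions. The patterns below are assumptions: they require the marker at the start of a line, optionally indented, and followed by whitespace. The reference implementation may use different rules, and the function name is hypothetical.

```python
import re

# Marker must start a line (optionally indented) and be followed by a space or tab.
NUMBERED = re.compile(r"^[ \t]*\d+\.[ \t]+", re.MULTILINE)  # e.g. "1. item"
BULLET = re.compile(r"^[ \t]*[*-][ \t]+", re.MULTILINE)     # e.g. "* item", "- item"


def outline_list_format(text: str) -> bool:
    """Apply the OUTLINE_LIST_FORMAT trigger to raw text."""
    numbered = len(NUMBERED.findall(text))
    bullets = len(BULLET.findall(text))
    return numbered >= 3 or bullets >= 4 or (numbered >= 2 and bullets >= 2)
```

Requiring whitespace after the marker keeps a line that opens with bold text (`**Term**`) from being counted as a bullet.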

---

**`LABELED_LIST_FORMAT` — Labeled List Formatting** · severity: `low`

Detects list items that begin with a bolded mini-heading (`**Term**`) or title-case label (`Term:`).

Triggers when: ≥2 labeled list items.

Fragile pattern.

---

**`RULE_OF_THREE` — Compulsive Triads** · severity: `low`

Detects three-item comma-separated enumerations of the form `A, B, and C`.

Triggers when: ≥4 triads AND density >3 per 1000 words.

Fragile pattern.

---

**`SUBORDINATE_REPETITION` — Subordinate Clause Repetition** · severity: `low`

Detects sentences that open with subordinate clause starters: *while, although, despite, even though, given that, considering that, whereas, notwithstanding.*

Triggers when: ≥4 subordinate-opening sentences AND density >4 per 1000 words.

Fragile pattern.

---

### 4.4 Rhetoric Family

**`PIVOT_CRUTCH` — Pivot Crutch** · severity: `medium`

Detects the rhetorical inversion template: *"it's not just [X] but [Y]," "isn't just about," "this isn't just about."*

Triggers when: ≥1 match.

---

**`WEASEL_ATTRIBUTION` — Vague Weasel Attributions** · severity: `medium`

Detects unsourced expert and study attributions: *"experts say," "many researchers believe," "studies show," "it is widely accepted," "observers note," "critics argue."*

Triggers when: ≥2 matches.

---

**`FALSE_BALANCE` — Both-Sides-ism** · severity: `medium`

Detects reflexive balance framing: *"on one hand...on the other hand," "while some argue...others believe," "proponents...while critics/opponents/skeptics."*

Triggers when: ≥2 matches.

---

### 4.5 Persona Family

**`ASSISTANT_PERSONA` — Assistant Persona Leakage** · severity: `high`

Detects chatbot voice artifacts: *"Great question!" "I'd be happy to help," "I'm happy to help," "Let me break this down," "Hope this helps," "Feel free to ask," "Don't hesitate to reach out," "Here's a breakdown."*

Triggers when: ≥1 match.

**Direct leak pattern.** Presence activates a co-occurrence bonus regardless of other detectors.

---

**`AI_DISCLAIMER` — AI Self-Disclosure** · severity: `high`

Detects AI identity language: *"as an AI," "as a language model," "as of my knowledge cutoff," "I cannot access real-time information," "my training data," "I was trained."*

Triggers when: ≥1 match.

**Direct leak pattern.**

---

### 4.6 Style Family

**`COPULA_AVOIDANCE` — Fancy Verb Substitution** · severity: `low`

Detects substitution of basic copulas with over-elevated alternatives where "is" or "are" would be natural:

> serves as, stands as, acts as, functions as, operates as, works as, doubles as, remains as

Triggers when: ≥3 matches.

Fragile pattern.

---

## 5. Scoring Algorithm

### 5.1 Severity Weights

| Severity | Base weight |
|----------|-------------|
| `high`   | 22          |
| `medium` | 12          |
| `low`    | 6           |

### 5.2 Per-Flag Contribution

For each triggered flag:

```
contribution = weight[severity] × min(count, 4)
```

Count is capped at 4 to prevent a single high-frequency pattern from dominating the score.
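
The formula translates directly into code. The constant and function names below are illustrative; the weights come from the table in Section 5.1.

```python
WEIGHTS = {"high": 22, "medium": 12, "low": 6}
COUNT_CAP = 4


def contribution(severity: str, count: int) -> int:
    """Per-flag contribution: severity weight times the capped match count."""
    return WEIGHTS[severity] * min(count, COUNT_CAP)
```

A `high` flag with 3 matches contributes 66, and the same flag with 9 matches contributes 88, the maximum for any single flag.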

### 5.3 Isolation Adjustments

Applied only when no direct leak pattern is present in the flag set:

| Condition | Multiplier |
|-----------|------------|
| Single flag, fragile pattern | 0.35 |
| Single flag, non-fragile pattern | 0.55 |
| Single signal family, fragile pattern | 0.70 |
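
The table can be sketched as a selection function. The `Flag` type and its fields are hypothetical. The third row is interpreted here as "two or more flags, all from one signal family, all fragile", which is an assumed reading rather than something the table states. When no row applies, the sketch returns a neutral multiplier of 1.0.

```python
from dataclasses import dataclass


@dataclass
class Flag:
    pattern_id: str
    family: str
    fragile: bool = False
    direct_leak: bool = False


def isolation_multiplier(flags: list[Flag]) -> float:
    """Pick the Section 5.3 multiplier, or 1.0 when no row applies."""
    # Adjustments are skipped entirely when a direct leak pattern is present.
    if not flags or any(f.direct_leak for f in flags):
        return 1.0
    if len(flags) == 1:
        return 0.35 if flags[0].fragile else 0.55
    # Assumed reading of row three: several flags, one family, all fragile.
    if len({f.family for f in flags}) == 1 and all(f.fragile for f in flags):
        return 0.70
    return 1.0
```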

### 5.4 Co-Occurrence Bonus

```
bonus = (any direct leak pattern ? 8 : 0)
      + max(0, distinct_signal_families − 1) × 6
      + max(0, total_flags − 2) × 3
```

The family-diversity and flag-count terms reward the corroboration of evidence across independent signal types.
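
A minimal sketch of the bonus, assuming the caller passes one family name per triggered flag. The function and parameter names are illustrative.

```python
def co_occurrence_bonus(families: list[str], has_direct_leak: bool) -> int:
    """Bonus from Section 5.4; `families` holds one family name per triggered flag."""
    leak_bonus = 8 if has_direct_leak else 0
    family_bonus = max(0, len(set(families)) - 1) * 6
    flag_bonus = max(0, len(families) - 2) * 3
    return leak_bonus + family_bonus + flag_bonus
```

Three flags from three different families, with no direct leak, earn (3 − 1) × 6 + (3 − 2) × 3 = 15 bonus points. A single flag earns a bonus only if it is a direct leak pattern.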

### 5.5 Final Score

```
score = clamp(round(sum_of_contributions + bonus), 0, 100)
```
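
The final step can be sketched as follows. The formula above does not say where the Section 5.3 isolation multiplier enters, so this sketch assumes it scales the summed contributions before the bonus is added. That placement and the function name are assumptions.

```python
def final_score(contributions: list[int], bonus: int, multiplier: float = 1.0) -> int:
    """Combine contributions and bonus, then round and clamp to 0..100.

    Assumption: the isolation multiplier applies to the summed
    contributions only, not to the bonus.
    """
    raw = sum(contributions) * multiplier + bonus
    return max(0, min(100, round(raw)))
```

For example, contributions of 22, 22, and 12 with a bonus of 15 yield 71. Under the assumed placement, a single fragile `medium` flag with one match yields round(12 × 0.35) = 4.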

### 5.6 Interpretation Bands

| Score range | Interpretation |
|-------------|----------------|
| 0–7   | No significant AI-typical pattern density |
| 8–29  | Low — some patterns present |
| 30–100 | High — substantial AI-typical pattern density |

These thresholds are informational. Score bands should be interpreted in the context of sample length, genre, and any known false-positive risk factors (see Section 7).
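
The bands map to a simple lookup. The short labels returned here (`none`, `low`, `high`) are illustrative names, not identifiers defined by this specification.

```python
def interpretation_band(score: int) -> str:
    """Map a 0..100 score to its Section 5.6 band (labels are illustrative)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be in 0..100")
    if score <= 7:
        return "none"
    if score <= 29:
        return "low"
    return "high"
```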

---

## 6. Output Format

A diagnostic result consists of:

```json
{
  "score": 0,
  "version": "1.3.4",
  "flags": [
    {
      "patternId": "BANNED_PHRASES",
      "label": "AI-Typical Phrases",
      "severity": "high",
      "count": 3,
      "detail": "Found templates: \"in today's rapidly evolving landscape\", ...",
      "detectorNote": "Template phrases are a common signal in detector lexical models."
    }
  ]
}
```

Flags are returned sorted by severity (high → medium → low). The `detail` field contains human-readable evidence grounded in the source text. The `detectorNote` field explains the diagnostic relevance of the pattern.

The `version` field reflects the reference implementation version, not the specification version.
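
A consumer or alternative implementation might assemble the result as sketched below. The function name `build_result` is hypothetical, and the flag dictionaries are assumed to carry the fields shown in the example above.

```python
import json

SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def build_result(score: int, flags: list[dict], version: str) -> str:
    """Serialize a result with flags ordered high, then medium, then low."""
    # sorted() is stable, so flags of equal severity keep their input order.
    ordered = sorted(flags, key=lambda f: SEVERITY_ORDER[f["severity"]])
    return json.dumps(
        {"score": score, "version": version, "flags": ordered},
        ensure_ascii=False,
        indent=2,
    )
```

The specification does not define an ordering among flags of the same severity. This sketch preserves input order in that case.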

---

## 7. Known Failure Modes

### False Positives — Human Text Incorrectly Scored High

- **Academic and institutional prose** — constrained vocabulary, formal structure, and low rhythm variation overlap with detector signals. Risk is highest for `UNIFORM_RHYTHM`, `BANNED_WORDS`, `TRANSITION_OVERUSE`, and `PASSIVE_OVERUSE`.
- **Second-language writing** — non-native writers tend toward safe, common phrasings that share surface features with AI output.
- **Heavily edited text** — multiple rounds of editing flatten stylistic variation and can raise rhythm and vocabulary scores above the baseline of the original draft.
- **Canonical and historical texts** — older formal registers occasionally match detector patterns by coincidence. Known instances are documented at [wroiter.com/blog/false-positive-hall-of-fame/](https://wroiter.com/blog/false-positive-hall-of-fame/).
- **Short samples (< 100 words)** — a single triggered pattern can dominate the score disproportionately. The isolation adjustments in Section 5.3 partially mitigate this, but short-sample scores should be treated as preliminary.

### False Negatives — AI Text Incorrectly Scored Low

- **Selectively edited AI drafts** — targeted editing of the specific patterns tracked here produces lower scores without changing underlying authorship.
- **Style-transfer prompting** — models prompted to write in a specific human voice suppress many surface patterns this method detects.
- **Hybrid authorship** — human-outlined, AI-drafted text (or vice versa) may not trigger enough patterns to reach a meaningful score threshold.

### Genre-Specific Unreliability

| Genre | Risk | Affected detectors |
|-------|------|--------------------|
| Legal prose | High FP | `UNIFORM_RHYTHM`, `PASSIVE_OVERUSE`, `BANNED_WORDS` |
| Product copy | High FP | `BANNED_PHRASES`, `OVER_SIGNPOST` |
| Academic abstracts | High FP | `UNIFORM_RHYTHM`, `TRANSITION_OVERUSE`, `PASSIVE_OVERUSE` |
| Personal essays | Low FP | All detectors are most reliable in this genre |

---

## 8. What the Score Does Not Establish

- The score does not identify the author of a text.
- A high score does not prove AI generation.
- A low score does not prove human authorship.
- The diagnostic should not be used as the sole basis for disciplinary action, public accusation, or irreversible decisions affecting individuals.

For guidance on safe review policies, see [wroiter.com/method/limitations/](https://wroiter.com/method/limitations/).

---

## 9. Versioning

The specification version is independent of the reference implementation version. Reference implementation v1.3.4, current at the time of this specification's publication, implements this specification.

When detection logic, signal thresholds, or scoring weights change materially, the specification version increments. Previous version documents remain accessible at their original URLs.

| Spec version | Date | Implementation version | Notes |
|---|---|---|---|
| 1.0 | 2026-04-13 | 1.3.4 | Initial publication. 28 detectors, 6 signal families. |

---

*This specification is published under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/). Reference implementation copyright WROITER / 3AM Energy. To report errors or suggest amendments, open an issue at the project repository.*
