How AI Detection Works: A Technical Guide (2026)


How AI detectors actually decide your text is AI — perplexity, burstiness, classifiers, and why the same paragraph can score 4% on one tool and 97% on another.

AI detection is a statistics problem dressed up as a verdict. When Turnitin returns “82% AI,” the underlying calculation is a probability score from a classifier — not a measurement, not a confession. Understanding what the classifier actually measures is the difference between treating a flag as proof and treating it as a signal.

This guide walks through the math, the architecture, the failure modes, and how the same paragraph can score 4% on one detector and 97% on another. It is the foundation post for our AI detection cluster — everything else in the cluster assumes the model described here.

Figure: StealthZero AI detector results panel showing overall AI probability and per-sentence highlighting.

Which StealthZero model to use against which detector

Detector choice drives model choice. F.R.I.D.A.Y is fine-tuned against the latest GPTZero model; Jarvis-Cohera and Jarvis-Max hit 100% Turnitin bypass in internal testing; Sentinel-Lite and Sentinel-Max are the SEO-targeted family.

| Detector / use case | Use this model |
| --- | --- |
| Latest GPTZero (fine-tuned) | F.R.I.D.A.Y |
| Turnitin (100% bypass, internal testing) | Jarvis-Cohera or Jarvis-Max |
| SEO content (blog, web copy) | Sentinel-Lite or Sentinel-Max |
| General AI detection (Free tier) | Origin (may need multiple passes for strict detectors) |
| Quality + tone control | Jarvis-Cohera |

Origin (Free) bypasses general AI detection, but for strict detectors use the matching fine-tuned model: F.R.I.D.A.Y for GPTZero, Jarvis-Cohera or Jarvis-Max for Turnitin.

Detector benchmarks and StealthZero coverage

StealthZero runs two in-house detectors (E.D.I.T.H and Sentrio v2) and bundles four third-party detectors into Proof Reports. Sentrio v2 ships four modes and enforces a 100-word minimum. Free tier covers 600 scans per month.

  • E.D.I.T.H (Shield-Lite): calibrated to match real-world Turnitin scores, no minimum word count
  • Sentrio v2: four modes (Standard, Aggressive, Multilingual, Scholar), 100-word minimum, claims 99%+ accuracy
  • Proof Reports: Turnitin + GPTZero + Winston + Copyleaks (4 detectors per report)
  • Pricing: $2.80 single Proof Report, $12.60 5-pack (10% off), $22.40 10-pack (20% off)
  • Free tier: 600 scans/month; Pro and Premium: unlimited (fair use)
  • Liang et al. 2023 (arXiv:2304.02819) measured false-positive rates above 60% for ESL writers across multiple GPT detectors

Weber-Wulff et al. 2023 (Int J Educ Integr 19:26) benchmarked 14 detection tools and found none reached the accuracy needed to be considered reliable in academic integrity workflows — most tools either over-flagged human writing or missed machine-paraphrased AI text.

What does an AI detector actually compute?

An AI detector computes a probability — usually 0-100% — that a passage was machine-generated, by scoring statistical features (perplexity, burstiness, stylistic uniformity) against a trained classifier. It does not match against a corpus of known AI output.

Every commercial AI detector — GPTZero, Originality.ai, Winston, Copyleaks, Turnitin’s AI indicator, StealthZero’s E.D.I.T.H and Sentrio engines — does some version of the same three-step pipeline:

  1. Feature extraction. Tokenize the text and compute statistical features for each sentence and the document as a whole.
  2. Reference scoring. Compare those features against what a reference language model expects from “typical AI” versus “typical human” writing.
  3. Classification. Feed the features through a trained classifier that outputs a probability between 0 and 1.

The output you see (84% AI, Likely human, Mixed content) is a thresholded view of that probability. The two features that do most of the work are perplexity and burstiness.
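The three steps above can be sketched in a few lines. This is a toy illustration, not any vendor's pipeline: the feature set is reduced to sentence-length statistics, and the `REFERENCE` values, `WEIGHTS`, and `BIAS` are made-up numbers standing in for a reference model and a trained classifier.

```python
import math
import re
from statistics import mean, pstdev


def extract_features(text: str) -> dict[str, float]:
    """Step 1: split into sentences and compute simple document-level features."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    return {"mean_len": mean(lengths), "len_spread": pstdev(lengths)}


# Step 2 (assumed values): (center, scale) of each feature for "typical AI" text.
REFERENCE = {"mean_len": (18.0, 6.0), "len_spread": (3.0, 2.0)}


def score_against_reference(features: dict[str, float]) -> dict[str, float]:
    """Step 2: express each feature as a scaled distance from the AI-typical center."""
    return {k: (v - REFERENCE[k][0]) / REFERENCE[k][1] for k, v in features.items()}


# Step 3 (assumed weights): a logistic classifier over the reference scores.
WEIGHTS = {"mean_len": -0.2, "len_spread": -1.0}
BIAS = 0.5


def classify(scores: dict[str, float]) -> float:
    """Step 3: squash a weighted sum into a probability between 0 and 1."""
    z = BIAS + sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def detect(text: str) -> float:
    return classify(score_against_reference(extract_features(text)))
```

With these placeholder numbers, text whose sentences are all the same length scores higher than text that mixes very short and very long sentences. A real detector learns its weights from labeled data and uses far more features.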

Perplexity: how surprising is the next word?

Perplexity is the exponential of the cross-entropy between the text and a reference language model. In plain English: it measures how surprised a language model would be by each word in the document, given the words that came before.

  • Low perplexity means the model expected those words. The text follows statistically common patterns.
  • High perplexity means the model didn’t expect those words. The text takes routes the model considers unlikely.
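As a minimal sketch of that definition: given the probability a reference model assigned to each token, perplexity is the exponential of the average negative log-probability. The probability lists below are invented for illustration; a real detector would read them from its reference model.

```python
import math


def perplexity(token_probs: list[float]) -> float:
    """exp of the mean negative natural-log probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical per-token probabilities from a reference model.
predictable = [0.5, 0.5, 0.5, 0.5]    # every token was a coin flip for the model
surprising = [0.5, 0.01, 0.2, 0.05]   # several tokens the model did not expect

print(perplexity(predictable))  # about 2.0
print(perplexity(surprising))   # noticeably higher
```

A perplexity of 2 means the model was, on average, as uncertain as if it were choosing between two equally likely tokens at each step.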

Large language models are trained to minimize perplexity on their own training data. When ChatGPT writes “In conclusion, it is important to consider,” the model is choosing high-probability next tokens at every step. The output is, by construction, low-perplexity.

Humans don’t optimize for predictability. We use idiosyncratic phrases, drop articles, write fragments, and abruptly switch registers. Our average sentence-level perplexity, measured against the same reference model, tends to be meaningfully higher.

The catch: not all humans write the same way. Formal academic writing, business writing, technical documentation, ESL writing, and template-heavy genres (cover letters, lab reports, press releases) all sit closer to the AI distribution. That is the source of most false positives.

Burstiness: variance across sentences

Burstiness is the variance in sentence-level complexity. The GPTZero team formalized the metric in their 2023 paper and most detectors have adopted some version of it since.

The intuition is direct. Human writing varies. We write a long sentence with three clauses, then a short one. Then a one-word sentence. Like this. Then we ramble for two sentences before snapping back into something tight. The variance is high.

Default AI output is much smoother. Sentence lengths cluster around a similar length. Clause structure stays consistent. Paragraphs march in step. Even without measuring word-level perplexity, a detector can pick up the rhythm difference.

Here is the contrast in text. Both paragraphs describe the same idea.

Low burstiness (AI default):

The experiment did not produce the expected results. The research team had invested significant time in preparation. They had verified all variables carefully. They had sought advice from multiple experts. Nevertheless, the outcomes were different from predictions.

High burstiness (human):

The experiment failed. Completely failed. We had spent six months prepping it, triple-checking every variable, consulting with experts across three continents — and still, when the moment came, the numbers came back wrong in a way that none of us had predicted.

The first paragraph has near-uniform sentence length. The second has a two-word sentence next to a 36-word one. To a burstiness classifier, the second looks human even before any word-level scoring runs.
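The contrast can be measured directly. The sketch below uses sentence length in words as a stand-in for sentence-level complexity, and the coefficient of variation (standard deviation divided by mean) as the burstiness score. Both choices are assumptions made for illustration; production detectors typically measure variance in per-sentence perplexity instead.

```python
import re
from statistics import mean, pstdev

LOW_BURST = (
    "The experiment did not produce the expected results. The research team had "
    "invested significant time in preparation. They had verified all variables "
    "carefully. They had sought advice from multiple experts. Nevertheless, the "
    "outcomes were different from predictions."
)
HIGH_BURST = (
    "The experiment failed. Completely failed. We had spent six months prepping it, "
    "triple-checking every variable, consulting with experts across three continents "
    "\u2014 and still, when the moment came, the numbers came back wrong in a way that "
    "none of us had predicted."
)


def sentence_lengths(text: str) -> list[int]:
    """Word count per sentence; hyphenated words count once."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", s)) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length (assumed proxy metric)."""
    lengths = sentence_lengths(text)
    return pstdev(lengths) / mean(lengths)


print(sentence_lengths(LOW_BURST))   # [8, 9, 6, 7, 7]
print(sentence_lengths(HIGH_BURST))  # [3, 2, 36]
print(burstiness(LOW_BURST) < burstiness(HIGH_BURST))  # True
```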

Stylometric and lexical features

On top of perplexity and burstiness, detectors compute a handful of other features:

  • N-gram frequency distributions. Which 2-, 3-, and 4-word sequences appear, and how often.
  • Function-word frequencies. AI overuses certain connectors (“furthermore”, “additionally”, “moreover”, “however”).
  • Punctuation patterns. Em-dash density, comma usage, semicolon frequency.
  • Sentence-opening patterns. Whether sentences start with the same constructions repeatedly.
  • Vocabulary richness. Type-token ratio, hapax legomena ratio, average word length.
  • Coherence and cohesion scores. How tightly ideas link across sentences and paragraphs.

These features feed the classifier alongside the perplexity and burstiness signals. A document scoring high on five out of six features will get flagged even if one feature looks borderline.
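A few of these features are simple enough to compute by hand. The sketch below covers vocabulary richness and connector frequency only. The connector list is the one quoted above, and dividing the hapax count by total tokens is one of several conventions, chosen here as an assumption.

```python
import re
from collections import Counter

CONNECTORS = {"furthermore", "additionally", "moreover", "however"}


def lexical_features(text: str) -> dict[str, float]:
    """Type-token ratio, hapax ratio, average word length, connector rate."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(words)
    hapaxes = sum(1 for c in counts.values() if c == 1)  # words used exactly once
    return {
        "type_token_ratio": len(counts) / len(words),
        "hapax_ratio": hapaxes / len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "connector_rate": sum(counts[w] for w in CONNECTORS) / len(words),
    }


print(lexical_features("However, the cat sat. Moreover, the cat ran."))
```

In that eight-word example, two of the words are connectors, so `connector_rate` comes out at 0.25.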

How does the AI detector classifier decide?

The classifier aggregates sentence-level probability scores into a document-level probability, typically as an average weighted by passage length. Most detectors highlight sentences above a per-sentence threshold and report the probability that the document is AI-generated.
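One plausible reading of that aggregation is a word-count-weighted average of per-sentence probabilities. The function names and the 0.7 highlight threshold below are hypothetical; vendors do not publish their exact formulas.

```python
def document_probability(sentences: list[tuple[str, float]]) -> float:
    """Length-weighted mean of per-sentence AI probabilities."""
    weights = [len(text.split()) for text, _ in sentences]
    total = sum(weights)
    if total == 0:
        raise ValueError("no words to weight by")
    return sum(w * p for w, (_, p) in zip(weights, sentences)) / total


def highlighted(sentences: list[tuple[str, float]], threshold: float = 0.7) -> list[str]:
    """Sentences at or above the (assumed) per-sentence highlight threshold."""
    return [text for text, p in sentences if p >= threshold]


# Hypothetical per-sentence scores.
scored = [("one two three four", 0.9), ("five six", 0.3)]
print(document_probability(scored))  # about 0.7: the longer sentence dominates
print(highlighted(scored))           # ['one two three four']
```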

The classifier is where the score is born. Most modern detectors run one of two architectures:

1. Fine-tuned transformer classifiers

The detector takes a pre-trained transformer (RoBERTa is common; some use DeBERTa or model-specific architectures) and fine-tunes it on a corpus of labeled human and AI text. At inference time, the model outputs a probability score directly.

OpenAI’s now-retired AI Text Classifier worked this way. So do most of the second-generation detectors (Winston, Copyleaks, Originality.ai’s current model, Turnitin’s AI indicator). They are end-to-end neural classifiers with the statistical features baked into the training objective rather than hand-computed.
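To make "outputs a probability score directly" concrete: a two-class classifier head typically emits one raw score (logit) per class, and a softmax turns those into probabilities. The logit values below are invented; the sketch shows only the final conversion, not the transformer itself.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the classes (human, ai).
human_p, ai_p = softmax([-1.2, 2.3])
print(f"P(ai) = {ai_p:.3f}")

print(softmax([0.0, 0.0]))  # [0.5, 0.5]: equal logits mean no preference
```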

2. Hybrid statistical + neural

GPTZero’s published architecture combines the perplexity-and-burstiness statistical pipeline with a neural model for the final decision. Their public model description on the GPTZero site says: “Our AI detection model contains 7 components that process text to determine if it was written by AI.”

The hybrid approach is more interpretable — you can show users a perplexity number per sentence — at the cost of being slightly less accurate on borderline cases than a pure end-to-end classifier.

Thresholds and confidence

Both architectures output a probability between 0 and 1. The label you see (AI, Human, Mixed) is decided by a threshold the vendor sets. GPTZero’s free tier defaults to around 0.50; their Advanced Scan uses different cutoffs. Turnitin’s AI indicator only shows a percentage; institutions decide internally what counts as actionable.

This is why two detectors can disagree on the same paragraph by 90 percentage points. They are using:

  • A different reference model for perplexity.
  • A different training corpus for the classifier.
  • A different threshold for the human-readable label.

A sentence that scores 0.34 in GPTZero’s classifier might score 0.92 in Copyleaks’. Both are “right” relative to their own training distribution. Neither is measuring an objective property of the text.
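The effect of the threshold alone is easy to demonstrate. In the sketch below, the same probability gets two different labels under two cutoff settings. The cutoff values are hypothetical and do not correspond to any vendor's published configuration.

```python
def label(probability: float, ai_cutoff: float, human_cutoff: float) -> str:
    """Map a classifier probability to a human-readable label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= ai_cutoff:
        return "AI"
    if probability <= human_cutoff:
        return "Human"
    return "Mixed"


# Hypothetical cutoffs: one tuned to avoid false positives, one tuned for recall.
LENIENT = {"ai_cutoff": 0.80, "human_cutoff": 0.40}
STRICT = {"ai_cutoff": 0.50, "human_cutoff": 0.20}

print(label(0.62, **LENIENT))  # Mixed
print(label(0.62, **STRICT))   # AI
```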

Why do detectors disagree, and what do you do about it?

Detectors disagree because they train on different corpora and weight features differently: GPTZero trains on consumer ChatGPT samples, Copyleaks on multilingual content, Originality.ai on commercial publishing. Cross-detector variance regularly exceeds 50 percentage points on the same paragraph.

The disagreement is real. Three months of testing across the StealthZero detection stack consistently shows that any given paragraph will get different scores from GPTZero, Winston, Copyleaks, and Originality.ai. Sometimes the spread is 5 percentage points. Sometimes it is 80.

Three reasons the spread happens:

  1. Training data drift. Each vendor trains on different AI output. GPTZero is heavy on ChatGPT samples. Copyleaks emphasizes multilingual training. Originality.ai focuses on commercial content. When you submit a paragraph, you are asking “does this look like the AI text we trained on?” and the answer depends on what they trained on.
  2. Reference model choice. Perplexity is computed against a reference LM. GPTZero, Originality.ai, and Winston each use their own. The same sentence has different perplexity values under different reference models.
  3. Classifier thresholds. The decision boundary between “human” and “AI” is a tuning parameter. Vendors optimize it for different goals — some minimize false positives, some maximize recall on raw AI output.

The operational answer: never trust a single detector for a high-stakes decision. If your work is being evaluated, check it against the detector your evaluator uses, then verify against a multi-detector report so you can see the full disagreement.

This is what the StealthZero Proof Report is built for. The report runs four detectors in one pass (Turnitin-parity, GPTZero, Winston, and Copyleaks) and shows you what each of them would say. If three of the four come back clean, your output is robust to detector choice. If one detector is the only one flagging high, you know the flag is detector-specific.
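Reading a multi-detector result comes down to two numbers: the spread between the lowest and highest score, and how many detectors cross a flagging line. A minimal sketch, with invented scores and an assumed 50% flagging line:

```python
def summarize(scores: dict[str, float], flag_at: float = 50.0) -> dict:
    """Report the spread across detectors and which ones flag the text."""
    if not scores:
        raise ValueError("need at least one detector score")
    spread = max(scores.values()) - min(scores.values())
    flagging = sorted(name for name, pct in scores.items() if pct >= flag_at)
    if not flagging:
        verdict = "clean on every detector"
    elif len(flagging) == 1:
        verdict = "detector-specific flag"
    else:
        verdict = "flagged by multiple detectors"
    return {"spread": spread, "flagging": flagging, "verdict": verdict}


# Hypothetical percent-AI scores for one paragraph.
report = summarize(
    {"Turnitin-parity": 4.0, "GPTZero": 12.0, "Winston": 9.0, "Copyleaks": 97.0}
)
print(report)
```

Here the spread is 93 points and only one detector crosses the line, which is the detector-specific case.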

Why are false positives a structural problem?

False positives are structural because formal, ESL, and technical writing share statistical patterns with AI output — there is no clean line between them. Liang et al. (Stanford, 2023, arXiv:2304.02819) found GPT detectors misclassified TOEFL essays as AI over 50% of the time.

The most-cited research on AI-detector false positives is Liang et al., “GPT detectors are biased against non-native English writers,” published in Patterns in July 2023 (preprint: arXiv:2304.02819). The study ran TOEFL essays written by non-native English speakers through seven AI detectors: GPTZero, Originality.ai, Crossplag, Sapling, ZeroGPT, Quill.org, and OpenAI’s now-retired classifier.

The headline finding: the detectors flagged more than half of the TOEFL essays as AI-generated, even though they were entirely human-written. GPTZero misclassified the highest share. When the same students’ essays were rewritten to use more sophisticated vocabulary (raising perplexity), false positive rates dropped dramatically.

The mechanism is exactly the one described above. Non-native English writers tend to use:

  • More common word choices (lower perplexity)
  • More uniform sentence structure (lower burstiness)
  • Higher rates of formal connectors (“furthermore”, “in conclusion”)

That is the same statistical fingerprint AI is trained to produce. The detector cannot tell whether the low perplexity comes from a model or a careful learner.

The same vulnerability applies to:

  • Technical and scientific writing (formal conventions reduce perplexity)
  • Legal writing (template-driven, repetitive structure)
  • Heavily edited content (professional polish smooths variance)
  • Translated text (translation tends to regress toward predictability)
  • Standardized formats (cover letters, lab reports, press releases)

None of these are AI. All of them score AI-like.

How long does text need to be for AI detection to work?

AI detection becomes reliable above roughly 250 words and unreliable below 100 — the classifier needs enough sentences to estimate perplexity and burstiness stably. StealthZero’s Sentrio v2 enforces a 100-word minimum; E.D.I.T.H has no minimum but is less reliable on short text.

Detection accuracy scales with input length. Under roughly 100-150 words, no detector is reliable. The signal — perplexity variance across sentences, burstiness across paragraphs — needs enough sample size to stabilize.

StealthZero’s Sentrio engine enforces a hard 100-word minimum at the API level for exactly this reason. Submitting 30-word snippets returns an HTTP 400 with a “minimum 100 words required” error. E.D.I.T.H, the balanced engine, will run on shorter input but its output confidence drops sharply.
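A length gate of this kind is a few lines of validation in front of the classifier. The function below is a hypothetical sketch of the idea, not StealthZero's API code; the response shape and the whitespace-based word count are assumptions.

```python
MIN_WORDS = 100


def validate_submission(text: str, min_words: int = MIN_WORDS) -> tuple[int, dict]:
    """Return an (HTTP status, body) pair; reject input below the word minimum."""
    word_count = len(text.split())
    if word_count < min_words:
        return 400, {
            "error": f"minimum {min_words} words required",
            "word_count": word_count,
        }
    return 200, {"word_count": word_count}


print(validate_submission("word " * 30))   # rejected with status 400
print(validate_submission("word " * 100))  # accepted with status 200
```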

GPTZero, Winston, and Copyleaks all accept shorter input, but their public documentation notes that their accuracy claims are based on text longer than 250 words. When you see “99.98% accuracy” in marketing copy, the underlying test is almost always run on documents at least that long.

For a typical use case — checking a 500-word essay, a 1,200-word article, a 3,000-word paper — length is not an issue. For a 50-word LinkedIn post, every detector will hedge and you should not over-interpret the number.

How do the major detection engines compare?

The major detection engines (Turnitin, GPTZero, Winston, Copyleaks, Originality.ai) all measure perplexity and burstiness but disagree on weighting and training data. StealthZero’s Proof Reports bundle Turnitin + GPTZero + Winston + Copyleaks (4 detectors) in one PDF for cross-detector verification.

These are the engines you will see referenced across the AI detection cluster. The claims are theirs; we link to their pricing or homepage for verification.

GPTZero

  • Claims 99% accuracy on their homepage hero stat
  • Claims 17 million users (hero stat) — footer paragraph says “over 10 million”; both figures are theirs
  • Free tier: 10,000 words/month, 3 Advanced Scans
  • Premium: $12.99/mo billed annually (300,000 words/mo)
  • Founded: January 2023, per their homepage footer
  • Pricing captured 2026-05-28 — see GPTZero pricing

GPTZero pioneered the perplexity-plus-burstiness framing in consumer detection. Our full breakdown of their pipeline lives in How GPTZero Works.

Winston AI

  • Claims 99.98% accuracy (“the only AI detector with a 99.98% accuracy rate” — their homepage)
  • Claims 10M+ users
  • Free tier: 2,000 credits over 14 days, then expires
  • Essential: $10/mo billed annually ($120/year, 80,000 credits/mo)
  • Advanced: $16/mo annual ($192/year, 200,000 credits/mo)
  • Pricing captured 2026-05-28 — see Winston pricing

The 99.98% number is the most aggressive accuracy claim in the category. Independent testing has not corroborated it; the figure comes from Winston’s own internal benchmarks. Detail in our Winston AI review.

Originality.ai

  • Claims to be “the Most Accurate AI Detector” based on studies they cite themselves
  • Claims a patented AI checker (the patent is real and linked from their homepage)
  • Pricing: 1 credit = 100 words; Pro is $12.95/mo billed annually (2,000 credits/mo); Pay-as-you-go is $30 one-time for 3,000 credits
  • Pricing captured 2026-05-28 — see Originality.ai pricing

Originality.ai targets the commercial content / SEO market rather than academia. Our deep dive: Originality.ai review.

Copyleaks

  • Claims over 99% accuracy with an asterisk: “Accuracy rating is based on internal testing of the English language datasets.”
  • Founded: 2015 (per their homepage; AI detection added later)
  • Personal — AI Detection only: $13.99/mo billed annually ($16.99/mo monthly)
  • Pro — AI Detection only: $74.99/mo billed annually ($99.99/mo monthly)
  • Credit unit: 1 credit = 250 words (very different from Originality.ai’s 100 words/credit)
  • Pricing captured 2026-05-28 — see Copyleaks pricing

Full comparison: Copyleaks vs GPTZero.
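Because the credit units differ, the same document costs a different number of credits on each service. A quick conversion, using the words-per-credit figures listed above and assuming partial credits round up (the vendors' actual rounding rules are not stated here):

```python
import math

# Words covered by one credit, as listed in the pricing notes above.
WORDS_PER_CREDIT = {"Originality.ai": 100, "Copyleaks": 250}


def credits_needed(word_count: int, vendor: str) -> int:
    """Credits consumed by one scan, assuming partial credits round up."""
    return math.ceil(word_count / WORDS_PER_CREDIT[vendor])


for vendor in WORDS_PER_CREDIT:
    print(vendor, credits_needed(1200, vendor))
# Originality.ai 12
# Copyleaks 5
```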

Turnitin

  • Institutional pricing only — Turnitin does not publish consumer pricing
  • Students access through their school’s license; it is not a personal subscription
  • Claims 16,000+ institutions as customers (per their homepage)
  • Founded: 1998 per their About page

Turnitin’s AI detector is bundled into existing Feedback Studio and Similarity licenses. If your professor uses Turnitin, you cannot buy Turnitin directly to verify your own work — which is one reason StealthZero exports a Turnitin-parity Proof Report: same scoring view your instructor sees, available before submission.

StealthZero’s own detectors

StealthZero ships two detection engines, both first-party:

  • E.D.I.T.H (Shield-Lite): Balanced calibration designed to match real-world Turnitin behavior. No minimum word count. Default on the detector tool.
  • Sentrio v2: Stricter proprietary detector with four selectable modes — Standard, Aggressive, Multilingual, and Scholar. Requires a 100-word minimum.

The Free plan includes 600 scans per month at 20 scans per day. Pro ($19.99/mo) and Premium ($29.99/mo) ship with unlimited scans under fair-use, plus monthly Proof Report credits that aggregate Turnitin-parity, GPTZero, Winston, and Copyleaks into one PDF. Full pricing: StealthZero pricing.

How do you test your writing before someone else does?

Test your writing before submission by running it through StealthZero’s free E.D.I.T.H detector or generating a Proof Report ($2.80 single, included on paid plans). Sentrio v2 Scholar mode (100-word minimum) is the strictest academic check.

The cheapest insurance against a misfired detector is to run the same check yourself before submission. Three options that cost nothing:

  1. GPTZero free tier. 10,000 words/month. Good if your evaluator uses GPTZero specifically.
  2. StealthZero free detector. 600 scans/month at 20/day. E.D.I.T.H engine, no signup beyond an email. Try it.
  3. Both, then compare. If they agree, the score is robust. If they disagree, you know your text sits on a detector boundary.

For a high-stakes submission — a thesis, a published article, a client deliverable — a multi-detector Proof Report ($2.80 single, $12.60 for 5-pack on StealthZero) gives you what your evaluator would see across four detectors in one document.

What do you do when you get flagged?

When you get flagged: preserve your draft and version history immediately, gather supporting evidence, read your institution’s appeal policy, and prepare a calm written timeline. Most institutions resolve flags at the instructor conversation step.

If you have been falsely flagged on human-written work, the playbook is:

  1. Don’t panic and don’t sign anything yet. False positive rates above 50% on essays by non-native English writers are documented in peer-reviewed work (Liang et al., 2023).
  2. Pull your version history. Google Docs, Microsoft Word, and most editors keep revision histories. A document with 47 revisions across three days is much harder to explain as “AI dump.”
  3. Request a second-tool check. If your school uses Turnitin, ask whether they will accept a GPTZero or Copyleaks cross-check. Many will, given how public the detector-bias issue has become.
  4. Cite Liang 2023 in your appeal. The paper is peer-reviewed, Stanford-authored, and published in Patterns. It is the strongest single citation against blanket detector trust.

If you used AI assistance and need to humanize the output before submission, the StealthZero humanizer is built specifically to neutralize the signals described in this post — flatten predictable n-grams, restore burstiness, replace AI-typical connectors. The Cohera sub-model achieves 100% bypass on internal testing across all four detectors in the Proof Report.

Sadasivan et al. 2023 (arXiv:2303.11156) showed that even the strongest AI text detectors degrade toward random-chance accuracy under light paraphrasing attacks, suggesting a theoretical ceiling on reliable detection of high-quality AI text.

Frequently Asked Questions

How does an AI detector decide my text is AI?

Most detectors score two signals: perplexity (how predictable each next word is to a reference language model) and burstiness (how much sentence length and complexity vary across the document). A classifier — usually a fine-tuned transformer — combines those signals with stylometric features to output a probability score per sentence and per document.

Why does the same paragraph score 4% AI on one detector and 97% on another?

Each detector uses a different reference model, different training data, and a different classifier threshold. A sentence that has low perplexity against GPTZero's reference model can look normal to Copyleaks' classifier, and vice versa. Detectors disagree most on borderline cases: edited AI output, polished human writing, and ESL writing.

Can AI detectors be wrong?

Yes. A peer-reviewed Stanford study (Liang et al., 2023) ran TOEFL essays from non-native English writers through GPTZero, Originality.ai, Crossplag, Sapling, ZeroGPT, Quill.org, and OpenAI's classifier, and the detectors flagged more than half of the essays as AI-generated. Detectors are biased toward flagging formal, lower-perplexity prose, which non-native writers produce more often.

What's the difference between detection and plagiarism checking?

Plagiarism checkers compare your text against a database of existing sources and flag matches. AI detectors don't compare against a database — they classify whether the writing pattern looks machine-generated. Most modern integrity tools (Turnitin, Copyleaks, Originality.ai) bundle both.

Do AI detectors work on Claude and Gemini too, or just ChatGPT?

Modern detectors are trained on output from multiple models. GPTZero, Winston, Copyleaks, and Originality.ai all publicly claim coverage of ChatGPT (GPT-3.5/4/4o), Claude, Gemini, DeepSeek and Llama. Accuracy varies by model — newer or less common LLMs tend to evade detection more often than older GPT versions.

Can I check my own work before submitting it?

Yes. StealthZero's detector gives 600 free scans per month with no signup beyond an email, and exports a PDF Proof Report on every plan from Starter ($9.99/mo) up. The Proof Report includes a Turnitin-parity score plus GPTZero, Winston, and Copyleaks scores in one PDF, so you can see what every detector would say before you submit.

Joseph Yaduvanshi

CTO and Co-Founder

Joseph is the CTO and technical co-founder of StealthZero. He leads engineering on the Cohera and Jarvis humanizer models, the multi-detector Proof Reports pipeline, and the Sentrio v2 detector.