How GPTZero Works: AI Detection Technology (2026)
How GPTZero detects AI content. Learn about perplexity, burstiness, and how to write content that passes.
GPTZero is one of the most-used AI detectors in education today. If you’ve submitted a paper anywhere in the last two years, there’s a reasonable chance it ran through GPTZero or a tool built on similar ideas. This post walks through how the detector actually works, what signals it looks for, and what that means for your writing.
Which StealthZero model to use against which detector
Detector choice drives model choice. F.R.I.D.A.Y is fine-tuned against the latest GPTZero model; Jarvis-Cohera and Jarvis-Max hit 100% Turnitin bypass in internal testing; Sentinel-Lite and Sentinel-Max are the SEO-targeted family.
| Detector / use case | Use this model |
|---|---|
| Latest GPTZero (fine-tuned) | F.R.I.D.A.Y |
| Turnitin (100% bypass, internal testing) | Jarvis-Cohera or Jarvis-Max |
| SEO content (blog, web copy) | Sentinel-Lite or Sentinel-Max |
| General AI detection (Free tier) | Origin (may need multiple passes for strict detectors) |
| Quality + tone control | Jarvis-Cohera |
Origin (Free) bypasses general AI detection, but for strict detectors like Turnitin or GPTZero, use F.R.I.D.A.Y or Jarvis (Cohera or Max).
Detector benchmarks and StealthZero coverage
StealthZero runs two in-house detectors (E.D.I.T.H and Sentrio v2) and bundles four third-party detectors into Proof Reports. Sentrio v2 ships four modes and enforces a 100-word minimum. Free tier covers 600 scans per month.
- E.D.I.T.H (Shield-Lite): calibrated to match real-world Turnitin scores, no minimum word count
- Sentrio v2: four modes (Standard, Aggressive, Multilingual, Scholar), 100-word minimum, claims 99%+ accuracy
- Proof Reports: Turnitin + GPTZero + Winston + CopyLeaks (4 detectors per report)
- Pricing: $2.80 single Proof Report, $12.60 5-pack (10% off), $22.40 10-pack (20% off)
- Free tier: 600 scans/month; Pro and Premium: unlimited (fair use)
- Liang et al. 2023 (arXiv:2304.02819) measured false-positive rates above 60% for ESL writers across multiple GPT detectors
Weber-Wulff et al. 2023 (Int J Educ Integr 19:26) benchmarked 14 detection tools and concluded that none was accurate or reliable enough for academic integrity workflows. The tools leaned toward classifying text as human-written, and machine paraphrasing or manual editing reduced detection further.
What is the science behind AI detection?
AI detection rests on a statistical tendency: transformer language models are trained to predict likely next tokens, so their output tends to be low in perplexity, low in burstiness, and stylistically uniform. Classifiers learn to separate those patterns from the higher-variance patterns typical of human writing.
Core Detection Metrics
GPTZero analyzes text using two primary metrics developed by its Princeton-based team:
1. Perplexity
What it measures: How “surprised” a language model would be when reading your text.
How it works:
- AI language models predict the next word based on previous words
- When text follows predictable patterns, the model isn’t “surprised” = low perplexity
- When text takes unexpected turns, the model is “surprised” = high perplexity
Human writing typically has higher perplexity. Humans make unexpected word choices, vary sentence structures, include personal quirks and style, and sometimes make errors or unconventional choices.
AI writing typically has low perplexity. Language models select statistically likely word combinations, follow predictable patterns, use consistent phrasing, and avoid risk by reaching for common expressions.
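To make the idea concrete, here is a minimal sketch of a perplexity calculation. It assumes a toy add-one-smoothed unigram model as a stand-in for a real language model; GPTZero's actual scoring model is not public, and the names `train_unigram` and `perplexity` are ours, chosen for illustration only.

```python
import math
from collections import Counter


def train_unigram(corpus_tokens):
    """Build an add-one-smoothed unigram model (a toy stand-in for a real LM)."""
    counts = Counter(corpus_tokens)
    vocab_size = len(counts) + 1  # +1 reserves probability mass for unseen words
    total = len(corpus_tokens)

    def prob(token):
        return (counts[token] + 1) / (total + vocab_size)

    return prob


def perplexity(tokens, prob):
    """Perplexity = exp of the average negative log-probability per token."""
    nll = -sum(math.log(prob(t)) for t in tokens)
    return math.exp(nll / len(tokens))


# Tiny illustrative corpus (an assumption, not real training data).
corpus = "the cat sat on the mat and the dog sat on the rug".split()
prob = train_unigram(corpus)

predictable = "the cat sat on the mat".split()
surprising = "the zebra juggled on the trampoline".split()

print("predictable:", perplexity(predictable, prob))
print("surprising: ", perplexity(surprising, prob))
```

The second sentence scores higher because it contains words the model has never seen. A production detector would use a neural language model that conditions on the preceding words, but the formula applied to its probabilities is the same.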
2. Burstiness
What it measures: The variation in sentence complexity throughout a text.
How it works:
- Humans naturally write with varied rhythm—some sentences long and complex, others short and punchy
- AI tends to maintain consistent complexity levels throughout
- GPTZero measures this variation as “burstiness”
High burstiness (human-like):
“The experiment failed. Completely, utterly failed. We had spent six months preparing, triple-checking every variable, consulting with experts across three continents, and still, when the moment came, the results defied everything we’d predicted.”
Low burstiness (AI-like):
“The experiment did not produce the expected results. The research team had invested significant time in preparation. They had verified all variables carefully. They had sought advice from multiple experts. Nevertheless, the outcomes were different from predictions.”
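One rough way to put a number on the difference between these two passages is the coefficient of variation of sentence length. This is a sketch under a loud assumption: it uses word count per sentence as a proxy for "complexity," which is our simplification and not GPTZero's published definition.

```python
import re
import statistics


def sentence_lengths(text):
    """Split on ., !, ? and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text):
    """Coefficient of variation of sentence length (stdev / mean)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


high = (
    "The experiment failed. Completely, utterly failed. We had spent six months "
    "preparing, triple-checking every variable, consulting with experts across "
    "three continents, and still, when the moment came, the results defied "
    "everything we'd predicted."
)
low = (
    "The experiment did not produce the expected results. The research team had "
    "invested significant time in preparation. They had verified all variables "
    "carefully. They had sought advice from multiple experts. Nevertheless, the "
    "outcomes were different from predictions."
)

print("high-burstiness passage:", sentence_lengths(high), burstiness(high))
print("low-burstiness passage: ", sentence_lengths(low), burstiness(low))
```

The first passage mixes two 3-word sentences with a 27-word one, while the second stays between 6 and 9 words per sentence, so its score is much lower.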
Secondary Analysis Factors
Beyond perplexity and burstiness, GPTZero examines:
Sentence Structure Patterns
- Repetitive grammatical constructions
- Overuse of certain transitional phrases
- Consistent paragraph structures
Vocabulary Distribution
- Unusual word frequency patterns
- Overuse of “hedging” language (may, might, could)
- Specific phrases AI models favor
Coherence Patterns
- How ideas connect across paragraphs
- Topic transitions
- Argument development flow
How does GPTZero’s detection process work?
GPTZero scores text in two passes: a perplexity model evaluates each word’s predictability, and a burstiness model evaluates sentence-level variation; both feed into a final AI probability. GPTZero claims 99%+ accuracy and offers a free tier of 10,000 words per month.
Step 1: Text Preprocessing
When you submit text, GPTZero:
- Removes formatting and special characters
- Normalizes whitespace
- Segments text into analyzable chunks
- Prepares the text for model analysis
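The preprocessing steps above can be sketched as follows. This is an illustrative approximation, not GPTZero's actual pipeline: the function name `preprocess`, the set of characters kept, and the 50-word chunk size are all assumptions.

```python
import re


def preprocess(text, chunk_size=50):
    """Strip markup-like debris, normalize whitespace, split into word chunks."""
    text = re.sub(r"<[^>]+>", " ", text)              # drop HTML-style tags
    text = re.sub(r"[^\w\s.,;:!?'()\-]", " ", text)   # keep words and basic punctuation
    text = re.sub(r"\s+", " ", text).strip()          # collapse runs of whitespace
    words = text.split(" ") if text else []
    # Group words into fixed-size chunks for downstream scoring.
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


chunks = preprocess("<p>Hello,   world!</p>\n\nSecond\tline #here.", chunk_size=3)
print(chunks)
```

Fixed-size word chunks are the simplest choice here; a real system might instead split on sentence or paragraph boundaries so that each chunk stays readable in the final report.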
Step 2: Feature Extraction
The system extracts numerous features:
- Per-sentence perplexity scores
- Burstiness measurements
- N-gram frequency analysis
- Syntactic pattern recognition
- Vocabulary richness metrics
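Two of these features, n-gram frequency and vocabulary richness, are simple enough to sketch directly. The version below is a hedged illustration: the feature names and the choice of type-token ratio as the richness metric are our assumptions, not a description of GPTZero's internals.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n=2):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def type_token_ratio(tokens):
    """Unique tokens divided by total tokens; lower means more repetition."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def extract_features(text):
    tokens = tokenize(text)
    bigrams = ngram_counts(tokens, 2)
    repeated = sum(c for c in bigrams.values() if c > 1)
    return {
        "type_token_ratio": type_token_ratio(tokens),
        "repeated_bigram_share": repeated / max(sum(bigrams.values()), 1),
        "top_bigrams": bigrams.most_common(3),
    }


features = extract_features("They had verified all variables. They had sought advice.")
print(features)
```

Type-token ratio is sensitive to text length, so comparing it across documents of very different sizes would be misleading; comparing equal-sized chunks avoids that problem.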
Step 3: Model Classification
GPTZero’s classifier (trained on millions of human and AI text samples) processes these features to generate:
- Overall probability score: Percentage likelihood of AI generation
- Sentence-level highlighting: Which specific sentences appear AI-generated
- Confidence level: How certain the model is about its classification
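As a rough picture of how features might turn into a probability, here is a toy logistic scorer. Everything numeric in it is hypothetical: the weights and bias are hand-picked for illustration, not fitted to data, and they are not GPTZero's. A real classifier would learn its parameters from labeled human and AI samples and would use many more features.

```python
import math


def ai_probability(perplexity, burstiness, weights=(-0.08, -3.0), bias=4.0):
    """Logistic score: lower perplexity and lower burstiness push toward 1.0.

    The default weights and bias are illustrative placeholders only.
    """
    z = bias + weights[0] * perplexity + weights[1] * burstiness
    return 1.0 / (1.0 + math.exp(-z))


def label(prob, threshold=0.5):
    """Turn a probability into a coarse verdict."""
    return "likely AI" if prob >= threshold else "likely human"


uniform_text = ai_probability(perplexity=20, burstiness=0.15)
varied_text = ai_probability(perplexity=60, burstiness=1.0)

print(label(uniform_text), round(uniform_text, 3))
print(label(varied_text), round(varied_text, 3))
```

The same scorer can be run per sentence to approximate sentence-level highlighting, with the distance of each probability from 0.5 serving as a crude confidence signal.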
Step 4: Report Generation
The final report includes:
- Overall AI probability percentage
- Highlighted suspicious sections
- Breakdown of human vs. AI-likely passages
- Confidence indicators
What makes text “detectable” by GPTZero?
Text becomes “detectable” when it sits inside the AI cluster on both perplexity (low) and burstiness (low) at once. Raw GPT output, lightly edited prose, and some legitimately uniform human writing all sit there. Detectability is statistical, not content-based.
AI Writing Fingerprints
GPTZero looks for these telltale AI patterns:
1. The “Perfect” Opening: AI often starts with overly smooth introductions. The classic example:
“In today’s rapidly evolving digital landscape, understanding [topic] has become more important than ever.”
2. Formulaic Transitions: Watch for repeated use of:
- “Furthermore”
- “Additionally”
- “It’s worth noting that”
- “In conclusion”
3. Hedging Language Overuse: AI frequently qualifies statements:
“This may potentially be attributed to various factors, including but not limited to…”
4. Balanced Paragraph Structure: AI tends to create symmetrical arguments:
“On one hand… On the other hand… Ultimately…”
5. Generic Examples: AI provides broad, universally applicable examples rather than specific, personal ones.
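Two of the fingerprints above, formulaic transitions and hedging, lend themselves to a simple count. The sketch below measures how often the listed phrases appear per 100 words. The phrase lists come from this article; the function name and the per-100-words normalization are our own assumptions, and no detector is known to use exactly this rule.

```python
import re

TRANSITIONS = ["furthermore", "additionally", "it's worth noting that", "in conclusion"]
HEDGES = ["may", "might", "could", "potentially"]


def rate_per_100_words(text, phrases):
    """Occurrences of the given phrases per 100 words of text."""
    normalized = text.lower().replace("\u2019", "'")  # treat curly apostrophes as straight
    words = re.findall(r"[a-z']+", normalized)
    if not words:
        return 0.0
    hits = sum(
        len(re.findall(r"\b" + re.escape(p) + r"\b", normalized)) for p in phrases
    )
    return 100.0 * hits / len(words)


sample = "Furthermore, this may potentially help. Additionally, it could work."
print("transitions per 100 words:", rate_per_100_words(sample, TRANSITIONS))
print("hedges per 100 words:     ", rate_per_100_words(sample, HEDGES))
```

A plain word match has obvious blind spots: “may” also matches the month, for instance. Treat the output as a prompt for a closer read, not a verdict.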
Why Human Writing Differs
Authentic human writing contains:
Natural Imperfections
- Occasional grammar variations
- Colloquialisms and slang
- Sentence fragments for emphasis
- Run-on sentences when excited
Personal Voice
- Unique metaphors and analogies
- Specific lived experiences
- Emotional reactions
- Opinions and biases
Structural Variance
- Paragraphs of very different lengths
- Unexpected topic shifts
- Non-linear arguments
- Tangents and asides
What are GPTZero’s limitations?
GPTZero’s documented limitations include false positives on ESL writing (Liang et al., Stanford 2023, arXiv:2304.02819), unreliable scores on texts under roughly 250 words, and a lag behind the latest LLM releases. Treat GPTZero scores as one signal among several.
Understanding what GPTZero gets wrong helps calibrate expectations:
False Positives
GPTZero sometimes flags legitimate human writing, especially:
Non-native English speakers: Those who learned formal, textbook English often write with patterns similar to AI output.
Technical/scientific writing: Academic conventions can mirror AI’s preference for clarity and consistency.
Heavily edited content: Professional editing that smooths out natural variation can trigger detection.
Template-based writing: Cover letters, business emails, and formulaic documents often appear AI-like.
False Negatives
GPTZero may miss AI content in these cases:
Humanized text: Tools like StealthZero rewrite AI patterns into more varied, human-like prose.
Extensively revised text: Human editing adds variance and personal touches.
Specifically prompted text: Some prompting techniques produce more human-like output.
Mixed content: Documents that blend AI and human sections are harder to classify.
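False positives and false negatives are easy to confuse, so here is a small sketch that computes both rates from labeled results. The labels in the example are hypothetical and exist only to show the arithmetic; they are not measurements of GPTZero or any other detector.

```python
def error_rates(true_labels, predicted_labels):
    """Return (false_positive_rate, false_negative_rate); 'ai' is the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    fp = sum(t == "human" and p == "ai" for t, p in pairs)   # human text flagged as AI
    fn = sum(t == "ai" and p == "human" for t, p in pairs)   # AI text passed as human
    humans = sum(t == "human" for t in true_labels)
    ais = sum(t == "ai" for t in true_labels)
    fpr = fp / humans if humans else 0.0
    fnr = fn / ais if ais else 0.0
    return fpr, fnr


# Hypothetical ground truth and detector output for eight documents.
truth = ["human"] * 4 + ["ai"] * 4
predicted = ["human", "ai", "human", "human", "ai", "ai", "human", "ai"]

fpr, fnr = error_rates(truth, predicted)
print(f"false-positive rate: {fpr:.2f}, false-negative rate: {fnr:.2f}")
```

The false-positive rate is measured only over human-written documents, which is why a detector can report high overall accuracy and still flag a large share of one group of human writers.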
How do you write human content that passes GPTZero?
Write human content that passes GPTZero by varying sentence length deliberately, using lower-probability word choices, and switching register between paragraphs. If using AI assistance, run the draft through a detector-targeted humanizer (StealthZero Cohera reaches 100% bypass in internal testing).
Strategy 1: Embrace Your Voice
Don’t try to write “perfectly.” Let your natural style through:
- Use contractions (don’t, can’t, won’t)
- Include opinions (“I think,” “In my experience”)
- Vary your sentence length dramatically
- Allow some conversational tangents
Strategy 2: Add Specificity
Replace generic statements with specific ones:
AI-like: “Many people find this challenging.”
Human-like: “My neighbor Frank spent three weekends trying to figure this out before giving up and calling a professional.”
Strategy 3: Break Patterns
Consciously vary your writing:
- Start some paragraphs with questions
- Use a one-word sentence occasionally
- Include an aside (like this one)
- Let some ideas remain partially developed
Strategy 4: Include Human Elements
Add content AI rarely produces convincingly:
- Personal anecdotes
- Specific dates and names
- Sensory descriptions
- Emotional reactions
- Humor (AI humor is notoriously flat)
How does StealthZero address GPTZero’s detection?
StealthZero addresses GPTZero detection by tuning rewrites on the same perplexity-burstiness signals GPTZero scores: the Cohera model targets exactly the statistical patterns GPTZero looks for. Cohera reaches 100% bypass in internal testing; verify with the four-detector Proof Report.
The StealthZero humanizer rewrites text along the same dimensions GPTZero measures:
- Perplexity: introduces less-predictable word choices and varies phrasing so the text doesn’t sit in the statistical sweet spot AI models produce
- Burstiness: mixes short and long sentences and varies sentence complexity across paragraphs
- Pattern breaking: replaces AI-typical transitions and structural tics with more varied alternatives
The Cohera model (a Jarvis sub-model on StealthZero) achieves 100% bypass against GPTZero in internal testing. The base humanizer flow targets a 99% pass rate. Both numbers are based on internal testing against current detector versions; detector behavior changes over time, so we recommend verifying each draft before submission.
What’s the future of AI detection?
The future of AI detection is a moving target: detectors retrain on new LLMs, LLMs improve at producing human-like text, and humanizer models track both. StealthZero re-verifies bypass rates monthly to track this churn.
Detection and the tools that work around it both keep improving:
- Detectors get more training data from newer models, refine their handling of edge cases, and tune false-positive rates down (slowly)
- Humanizers get better at modeling the variation real human writing has, and AI models themselves get better at producing more varied output
The arms race will keep moving. The most reliable habit is to verify before you submit, rather than trust either side to be stable across releases.
Wrapping Up
GPTZero works by measuring perplexity, burstiness, and linguistic patterns to decide whether text is AI-generated. It’s effective on raw AI output and weaker on content that has been rewritten with attention to those same signals. It also produces false positives on certain styles of human writing.
The right approach depends on the situation:
- For original work: write in your own voice, don’t sand off your stylistic quirks
- For AI-assisted content: use a humanizer that targets perplexity and burstiness, like the StealthZero humanizer
- For verification: check your content before submission to see the same signals the detector will see
For more detail on the underlying ideas, see our explainers on perplexity in AI detection and burstiness.
Technical information in this article is based on published research and StealthZero’s internal testing. Detection technology evolves rapidly. Last updated 2026-05-28.
Sadasivan et al. 2023 (arXiv:2303.11156) showed that paraphrasing attacks can sharply degrade a range of AI text detectors, and argued that as model output grows closer to human text, even the best possible detector performs only marginally better than random chance. That suggests a theoretical ceiling on reliable detection of high-quality AI text.
References
- Liang, W., Yuksekgonul, M., Mao, Y., Wu, E., & Zou, J. (2023). “GPT detectors are biased against non-native English writers.” arXiv:2304.02819. https://arxiv.org/abs/2304.02819
- Sadasivan, V. S., Kumar, A., Balasubramanian, S., Wang, W., & Feizi, S. (2023). “Can AI-Generated Text Be Reliably Detected?” arXiv:2303.11156. https://arxiv.org/abs/2303.11156
- Weber-Wulff, D., Anohina-Naumeca, A., Bjelobaba, S., et al. (2023). “Testing of detection tools for AI-generated text.” International Journal for Educational Integrity, 19(1), 26. https://doi.org/10.1007/s40979-023-00146-z
Frequently Asked Questions
How accurate is GPTZero?
GPTZero claims very high accuracy on AI-generated text from models like ChatGPT and Claude, but in practice accuracy varies depending on the model the text came from and how it was edited. False positives are most common on non-native English writing, technical and scientific text, and heavily edited content that follows formal academic conventions.
Can GPTZero detect paraphrased AI content?
GPTZero can often detect lightly paraphrased AI content because its analysis focuses on underlying patterns, not just word choice. However, thoroughly rewritten content that changes sentence structures, adds human variance, and includes original thought typically evades detection.
What is perplexity in AI detection?
Perplexity measures how “surprised” a language model would be by a piece of text. AI-generated text has low perplexity (predictable patterns), while human writing has higher perplexity (unexpected word choices and structures). GPTZero uses perplexity as a key detection metric.
Does GPTZero work on all AI models?
GPTZero is trained to detect content from major AI models including ChatGPT (GPT-3.5/4), Claude, Gemini, LLaMA, and others. Detection accuracy may vary between models, with newer or less common AI systems sometimes evading detection.



