GPTZero's AI Detection Technology

As pioneers in AI detection, GPTZero incorporates the latest research on detecting text from ChatGPT, GPT-4, Google Gemini, Llama, and newer AI models, and on investigating its sources.

How AI Detection at GPTZero works

GPTZero’s technology uses deep learning to keep pace with AI advancements and deliver precise, reliable results that help you understand and interpret the origin of a piece of text.

Input Text

GPTZero accepts copied-and-pasted text as well as docx, pdf, and image files, and can analyze up to 50 files at a time.

Deep Learning

We employ an end-to-end deep learning approach, trained on text datasets drawn from the web, from education, and from AI-generated output across a range of LLMs.

Sentence Classifier

A sentence-by-sentence classification model determines the probability and confidence that a text was created by AI.

Paraphraser Shield

We defend against tools looking to exploit AI detectors. Our model shields against common methods to bypass AI detection, such as paraphrasing and homoglyph attacks.
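As a rough illustration of what a homoglyph attack looks like, the sketch below folds a few look-alike characters back to their Latin equivalents before any classification would run. This is not GPTZero's implementation: the `HOMOGLYPHS` table and both function names are hypothetical, and the table is deliberately tiny.

```python
import unicodedata

# Hypothetical, deliberately small table of look-alike characters.
# A production defense would need far broader coverage.
HOMOGLYPHS = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
    "\u0456": "i",  # Cyrillic small Byelorussian-Ukrainian i
    "\u03bf": "o",  # Greek small omicron
}


def count_homoglyphs(text: str) -> int:
    """Count characters that appear in the look-alike table."""
    return sum(ch in HOMOGLYPHS for ch in text)


def normalize_homoglyphs(text: str) -> str:
    """Fold compatibility forms with NFKC, then map known look-alikes to Latin."""
    folded = unicodedata.normalize("NFKC", text)
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in folded)


# "paraphrase" with each Latin "a" swapped for Cyrillic U+0430
disguised = "p\u0430r\u0430phr\u0430se"
cleaned = normalize_homoglyphs(disguised)
```

A high `count_homoglyphs` value in otherwise English text can itself be a useful signal that someone tried to disguise the input.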

Output Result

You can view easy-to-interpret results in our dashboard, with premium features to detect AI vocabulary, plagiarism, and citeable sources.

Leading the way in AI Detection Research

It is becoming increasingly critical to develop robust tools to detect AI-generated texts and limit the adverse effects of LLMs. GPTZero’s mission is to ensure that human-authored and LLM-generated text remains distinguishable. We achieve this goal by offering a commercially available AI detector that is highly accurate, scalable, and – most importantly – capable of delivering explainable predictions that allow users to responsibly interpret the results.

Our wider research contributions include:

We frame the LLM-generated text detection as a trinary classification problem, separating prediction confidence from the proportion of LLM text.

We developed the first sentence-highlighting model, which uses Hidden Markov Models (HMMs) to highlight areas of text; it was featured on Anderson Cooper 360.

We developed a novel output mapping mechanism which improves model calibration and biases the detector to prefer making less-harmful false-negative errors over false-positive errors.

We continuously demonstrate superior AI detection performance against both commercial and open-source alternatives across multiple genres and languages.

We outlined an industrial-scale framework for collecting and cleaning data and for training and using supervised models, along with considerations for how users interact with the models.
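To give a feel for how an HMM can turn noisy per-sentence scores into contiguous highlighted areas, here is a minimal two-state Viterbi sketch. It is an assumption-laden toy, not GPTZero's model: the function name `viterbi_smooth`, the `stay` transition probability, and the shortcut of treating each sentence's AI probability as an emission likelihood are all choices made for this example.

```python
import math


def viterbi_smooth(ai_probs, stay=0.9):
    """Return the most likely label per sentence (0 = human, 1 = AI).

    ai_probs: per-sentence AI probabilities from some upstream classifier.
    stay: assumed probability of keeping the same label between neighbouring
          sentences (must be strictly between 0 and 1).
    """
    if not ai_probs:
        return []
    eps = 1e-9
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)

    def emit(p):
        # Simplification: use (1 - p, p) as emission scores for (human, AI).
        p = min(max(p, eps), 1.0 - eps)
        return (math.log(1.0 - p), math.log(p))

    # A uniform prior over the first state is a constant and can be omitted.
    score = list(emit(ai_probs[0]))
    backpointers = []
    for p in ai_probs[1:]:
        e = emit(p)
        new_score, ptr = [], []
        for state in (0, 1):
            candidates = [
                score[prev] + (log_stay if prev == state else log_switch)
                for prev in (0, 1)
            ]
            best_prev = 0 if candidates[0] >= candidates[1] else 1
            new_score.append(candidates[best_prev] + e[state])
            ptr.append(best_prev)
        score = new_score
        backpointers.append(ptr)

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# The weak third score (0.4) sits between strong AI scores.
labels = viterbi_smooth([0.9, 0.8, 0.4, 0.9, 0.85])
```

Because switching labels is penalized, an isolated weak score inside a strongly AI-like run is absorbed into the run rather than producing a one-sentence gap in the highlighting.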

Cyclical Development Process of our Deep Learning Model

GPTZero's TOEFL Classification pie chart

De-biasing Detection for Education

Our team is dedicated to de-biasing our AI classification models for educational use cases.

For example, our efforts in reducing ESL bias in classification since April 2022 have reduced AI detection’s false positive rate on TOEFL texts to 1.1%.

We achieved our successful de-biasing via several methods, including model parameter tagging that incorporated an “education” tag in model training, text preclassification at the model output step, and representative dataset insertions. Through training a classification model, we can predict beforehand whether a text is likely from an ESL writer, to ensure the AI identification model has this information when making a classification.

Confidence Scores

We were the first detector to provide confidence categories for our classifications: “uncertain,” “moderately confident,” and “highly confident.” These categories are tuned so that the average error rate is less than 1% for the “high” confidence predictions, based on a diverse evaluation dataset used internally that was never before seen by the model.

Average error rate is emphasized because the number of possible documents is vast, varying substantially in tone, content, length, grammatical correctness, logical coherence, and structure.
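One way such a category could be tuned is sketched below: on a held-out evaluation set, pick the lowest score threshold at which the predictions at or above it have an average error rate under the target. This is only an illustration of the idea, not a description of GPTZero's calibration procedure; the function name, the score scale, and the toy data are assumptions.

```python
def tune_high_threshold(scores, correct, max_error=0.01):
    """Lowest threshold whose at-or-above bucket has error rate < max_error.

    scores:  model confidence for each held-out prediction.
    correct: whether each prediction matched the true label.
    Returns None if no threshold meets the target.
    """
    pairs = sorted(zip(scores, correct), reverse=True)  # highest score first
    best = None
    errors = 0
    for i, (score, ok) in enumerate(pairs, start=1):
        errors += not ok
        # Evaluate only at the end of a run of tied scores, so the bucket
        # "score >= threshold" is well defined.
        if i < len(pairs) and pairs[i][0] == score:
            continue
        if errors / i < max_error:
            best = score
    return best


# Toy held-out set: 150 very confident predictions, all correct,
# plus 50 less confident ones with 5 mistakes.
toy_scores = [0.99] * 150 + [0.70] * 50
toy_correct = [True] * 150 + [True] * 45 + [False] * 5
high_threshold = tune_high_threshold(toy_scores, toy_correct)
```

In the toy data, including the 0.70 bucket would push the error rate to 5/200 = 2.5%, so only the 0.99 bucket qualifies as "high" confidence.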

Mixed Classification

GPTZero was the first detector to include a classification of “mixed” human and AI content. Our model outputs 3 possible classifications instead of the normal binary (human vs. AI):

written entirely by a human
written entirely by an AI
written by a mix of human and AI

This allows for a more nuanced AI detection result.

Advanced Scan

Our state-of-the-art advanced AI detection model offers an unprecedented level of analysis to identify which sections of writing contribute most to our AI detection, helping you understand why a document is classified as AI.

Our Approach to Benchmarking

We are strongly supportive of the work of independent and academic reviewers in evaluating the progress of AI models.

We provide free API access to our model upon request for academic researchers. We’ve been evaluated by researchers from MIT, Harvard, Stanford, and several other universities.

From internal and external benchmarking, we find GPTZero is much better than our competitors at detecting mixed documents, where both AI and human writing are involved, with a 96.5% accuracy rate.

False Positives

A false positive in AI detection is when an AI detector incorrectly classifies a human’s writing as AI. If, for instance, you are an educator or an institution that relies on AI detection tools to help inform your disciplinary policy around students’ AI usage, you will want to make sure the false positive rate is as low as possible to avoid false claims of cheating. We keep GPTZero’s false positive rate at no more than 1% when evaluating AI versus human text.
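For readers who want to check a false positive rate on their own labeled samples, the definition above reduces to a short calculation: of all human-written texts, what fraction was flagged as AI? The sketch below uses hypothetical label strings ("human" and "ai") and made-up counts purely for illustration.

```python
def false_positive_rate(y_true, y_pred):
    """Share of human-written texts that were classified as AI.

    y_true / y_pred: equal-length sequences of "human" or "ai" labels.
    """
    human_preds = [pred for truth, pred in zip(y_true, y_pred) if truth == "human"]
    if not human_preds:
        raise ValueError("need at least one human-written example")
    return sum(pred == "ai" for pred in human_preds) / len(human_preds)


# Made-up evaluation: 200 human texts (2 wrongly flagged) and 100 AI texts.
truth = ["human"] * 200 + ["ai"] * 100
preds = ["ai"] * 2 + ["human"] * 198 + ["ai"] * 100
fpr = false_positive_rate(truth, preds)  # 2 / 200
```

Note that AI-written texts never enter the denominator, so a detector's false positive rate says nothing about how many AI texts it misses.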

Researchers using GPTZero for AI detection

Join your fellow researchers using GPTZero for their papers, publications, and investigations.

Get scholarly access

The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates

Giuseppe Russo Latona, Manoel Horta Ribeiro, Tim R. Davidson, Veniamin Veselovsky, Robert West

"We estimate that 15.8% of ICLR reviews in 2024 were crafted with the assistance of an LLM, or 4,428 of the 28,028 reviews submitted that year; 49.4% of all submissions received at least one review classified as AI-assisted by GPTZero."

Analysing the impact of ChatGPT in research | Applied Intelligence

Pablo Picazo-Sanchez & Lara Ortiz-Martin

"In other words, no matter which editorial the analysed text comes from, the detector with the highest accuracy is GPTZero."

Characterizing the Increase in Artificial Intelligence Content Detection in Oncology Scientific Abstracts From 2021 to 2023 | JCO Clinical Cancer Informatics

Frederick M. Howard, Anran Li, Mark F. Riffon, Elizabeth Garrett-Mayer, and Alexander T. Pearson

"GPTZero had the best discrimination of the pure AI-generated abstracts at an optimal threshold selected with Youden’s index, identifying 99.5% of AI-written abstracts with no false positives among human-written text."

The Rise of AI-Generated Content in Wikipedia

Creston Brooks, Samuel Eggert, Denis Peskoff

"Using two tools, GPTZero and Binoculars, we detect that as many as 5% of 2,909 English Wikipedia articles created in August 2024 contain significant AI-generated content."

Other papers using GPTZero

Generative AI in Financial Reporting by Elizabeth Blankespoor, Ed deHaan, Qianqian Li :: SSRN

Identification of ChatGPT-Generated Abstracts Within Shoulder and Elbow Surgery Poses a Challenge for Reviewers - ScienceDirect

Detecting AI-Generated Writing Using GPTZero

Artificial Intelligence–Generated Writing in the ERAS Personal Statement: An Emerging Quandary for Post-graduate Medical Education | Academic Psychiatry

Quality and correctness of AI-generated versus human-written abstracts in psychiatric research papers

Exploring AI Generated Text in Public Company Earnings Calls: A Comparative Analysis by John Garvey, Fergal O’Brien, Daire Campbell :: SSRN

Human vs machine: identifying ChatGPT-generated abstracts in Gynecology and Urogynecology - ScienceDirect

Detecting Generative AI Usage in Application Essays

Frontiers | Students are using large language models and AI detectors can often detect their use

Breaking News: Case Studies of Generative AI’s Use in Journalism

Ghostbuster: Detecting Text Ghostwritten by Large Language Models

PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection

Authorship Obfuscation in Multilingual Machine-Generated Text Detection

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

General FAQs about our AI Detector

Everything you need to know about GPTZero and our ChatGPT detector. Can’t find an answer? You can talk to our customer service team.

What is GPTZero?

GPTZero is the leading AI detector for checking whether a document was written by a large language model such as ChatGPT. GPTZero detects AI at the sentence, paragraph, and document levels. Our model was trained on a large, diverse corpus of human-written and AI-generated text, with support for English, Spanish, French, German, and other languages. To date, GPTZero has served over 10 million users around the world, and works with over 100 organizations in education, hiring, publishing, legal, and more.

How do I use GPTZero?

Simply paste in the text you want to check, or upload your file, and we'll return an overall detection for your document, as well as sentence-by-sentence highlighting where we've detected AI. Unlike other detectors, we help you interpret the results with a description of the result, instead of just returning a number.

To get the power of our AI detector for larger texts, or a batch of files, sign up for a free account on our Dashboard.

If you want to run the AI detector as you browse, you can download our Chrome Extension, Origin, which allows you to scan the entire page in one click.

When should I use GPTZero?

Our users have seen the use of AI-generated text proliferate into education, certification, hiring and recruitment, social writing platforms, disinformation, and beyond. We've created GPTZero as a tool to highlight the possible use of AI in writing text. In particular, we focus on classifying AI use in prose.

Overall, our classifier is intended to be used to flag situations in which a conversation can be started (for example, between educators and students) to drive further inquiry and spread awareness of the risks of using AI in written work.

Does GPTZero only detect ChatGPT outputs?

No, GPTZero works robustly across a range of AI language models, including but not limited to ChatGPT, GPT-5, GPT-4, GPT-3, Gemini, Claude, and AI services based on those models.

Why GPTZero over other detection models?

GPTZero is the most accurate AI detector across use-cases, verified by multiple independent sources, including TechCrunch, which called us the best and most reliable AI detector after testing seven others.
GPTZero builds and constantly improves our own technology. In our competitor analysis, we found that not only does GPTZero perform better, but some competitor services are actually just forwarding the outputs of free, open-source models without additional training.
In contrast to many other models, GPTZero is fine-tuned for student writing and academic prose. By doing so, we've seen large improvements in accuracy for this use case.

Lastly, many of our users - especially educators - have told us they trust GPTZero because we have only one mission: provide every human with the tools to detect and safely adopt AI technologies. Unlike many providers who recently released detectors as a side product, this mission will always be our number one priority.

What are the limitations of AI Detection?

The nature of AI-generated content is changing constantly. As such, these results should not be used to punish students. We recommend that educators use our behind-the-scenes Writing Reports as part of a holistic assessment of student work. There will always be edge cases, both where AI text is classified as human and where human text is classified as AI. Rather than relying on detection alone, we recommend that educators give students the opportunity to demonstrate their understanding in a controlled environment and craft assignments that cannot be solved with AI.

The accuracy of our model increases as more text is submitted to the model. As such, the accuracy of the model on the document-level classification will be greater than the accuracy on the paragraph-level, which is greater than the accuracy on the sentence level.

The accuracy of our model also increases for text similar in nature to our dataset. While we train on a highly diverse set of human and AI-generated text, the majority of our dataset is in English prose, written by adults.

Our classifier is not trained to identify AI-generated text that has been heavily modified after generation (although we estimate that this currently accounts for a minority of AI-generation use).

Currently, our classifier can sometimes flag other machine-generated or highly procedural text as AI-generated, and as such, should be used on more descriptive portions of text.

What data did you train your model on?

Our model is trained on millions of documents spanning various domains of writing, including creative writing, scientific writing, blogs, news articles, and more. We test our models on a never-before-seen set of human and AI articles from a section of our large-scale dataset, in addition to a smaller set of challenging articles that are outside its training distribution.

How do I use and interpret the results from your API?

To see the full schema and try examples yourself, check out our API documentation.

Our API returns a document_classification field, which indicates the most likely classification of the document. The possible values are HUMAN_ONLY, MIXED, and AI_ONLY. We also provide a probability for each classification in the class_probabilities field, whose keys are human, ai, and mixed. To look up the probability of the most likely classification, use the predicted_class field. The class probability corresponding to the predicted class can be interpreted as the chance that the detector is correct in its classification. For example, 90% means that our detector's prediction is correct 90% of the time on similar documents. Lastly, each prediction comes with a confidence_category field, which can be high, medium, or low. Confidence categories are tuned such that when the confidence_category field is high, 99.1% of human articles are classified as human and 98.4% of AI articles are classified as AI.
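The fields described above might be read like this. The sketch works on a hand-written sample dictionary rather than a live response, so it makes no claim about the request format or any surrounding envelope; the sample values, the `summarize` helper, and the assumption that predicted_class holds one of the class_probabilities keys are illustrative only. Check the API documentation for the authoritative schema.

```python
# Hand-written sample using only the field names described above.
# The values are invented for illustration.
sample_document = {
    "document_classification": "MIXED",
    "predicted_class": "mixed",  # assumed to match a class_probabilities key
    "class_probabilities": {"human": 0.07, "ai": 0.11, "mixed": 0.82},
    "confidence_category": "high",
}


def summarize(doc: dict) -> str:
    """Build a one-line, human-readable summary of a document result."""
    label = doc["document_classification"]
    predicted = doc["predicted_class"]
    probability = doc["class_probabilities"][predicted]
    confidence = doc["confidence_category"]
    return f"{label} (p={probability:.0%}, confidence={confidence})"


summary = summarize(sample_document)
```

Reporting the probability together with the confidence category, rather than the label alone, keeps the result interpretable for whoever reads it.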

Additionally, we highlight sentences that have been detected to be written by AI. API users can access this highlighting through the highlight_sentence_for_ai field. The sentence-level classification should not be solely used to indicate that an essay contains AI (such as ChatGPT plagiarism). Rather, when a document gets a MIXED or AI_ONLY classification, the highlighted sentences will indicate where in the document we believe this occurred.
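The guidance above, to consult sentence highlights only when the document itself is classified MIXED or AI_ONLY, could be applied as in the sketch below. Only document_classification and highlight_sentence_for_ai come from the description; the `sentences` list, the `sentence` key, and the helper name are assumptions about the response shape made for this example.

```python
def highlighted_sentences(doc: dict) -> list:
    """Return highlighted sentences, but only for MIXED or AI_ONLY documents."""
    if doc.get("document_classification") not in {"MIXED", "AI_ONLY"}:
        # Sentence-level flags alone should not be read as evidence of AI use.
        return []
    return [
        item["sentence"]
        for item in doc.get("sentences", [])  # assumed container name
        if item.get("highlight_sentence_for_ai")
    ]


# Invented sample data in the assumed shape.
mixed_doc = {
    "document_classification": "MIXED",
    "sentences": [
        {"sentence": "I wrote this part myself.", "highlight_sentence_for_ai": False},
        {"sentence": "This part was generated.", "highlight_sentence_for_ai": True},
    ],
}
human_doc = {**mixed_doc, "document_classification": "HUMAN_ONLY"}

flagged = highlighted_sentences(mixed_doc)
ignored = highlighted_sentences(human_doc)
```

Gating on the document-level result first means a stray sentence flag in an otherwise human document is not surfaced on its own.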

Are you storing data from API calls?

No. We do not store or collect the documents passed into any calls to our API. We wanted to err on the side of caution when it comes to storing data from organizations using our API.

However, we do store inputs from calls made from our dashboard. This data is only used in aggregate by GPTZero to further improve the service for our users. You can refer to our privacy policy for more details.