GPTZero’s Massive AI Detector Update for Summer 2025
GPTZero’s new Model 3.7b is drastically better at detecting text from the latest language models from the top providers, just in time for the new 2025/2026 school year.

The machine learning team at GPTZero spent the summer building our best AI detector ever. This release comes just in time for the school year, when students and teachers will be able to rely on GPTZero to encourage responsible AI use in the classroom. Our latest AI detection model is drastically better at detecting text from the latest language models from the top providers, and it generalizes to GPT5 without explicit training.
This release includes a complete overhaul of our training data. Our goal was to significantly improve our performance on the language models that our users care about. We focused on the top models for academic API use, and models that are widely available with free and paid accounts from the top providers.
Training on the latest language models
In our latest model release, Model 3.7b, we’ve updated the majority of our training data, including more AI documents from:
- OpenAI – GPT4.1, GPT4.1-mini, o3, o3-mini
- Gemini – 2.5 Pro, 2.5 Flash, 2.5 Flash-Lite
- Claude – Sonnet 4
- And a few other models
These models have made big leaps in many areas including reasoning, creative writing, and context capabilities, which result in more complex and sometimes more human-like text.
Our performance on these models is shown in Table 1. On one of the reasoning models in particular, the improvement in recall at 1% false positive rate was over 40% with our latest AI detector release.
Table 1: AI detector performance on the popular LLMs with our latest AI detector. Recall is the percentage of AI-generated documents in the benchmark that our detector correctly identified as AI. We report recall at 1% false positive rate.
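To make the metric concrete, here is a minimal sketch of how recall at a fixed false positive rate can be computed from detector scores. It is an illustration only, not GPTZero's evaluation code: the function name `recall_at_fpr`, the convention that a higher score means "more likely AI", and the toy scores are all assumptions made for this example.

```python
def recall_at_fpr(human_scores, ai_scores, target_fpr=0.01):
    """Return (recall, threshold) at the loosest threshold whose false
    positive rate on human documents does not exceed target_fpr.

    Assumption: a document is flagged as AI when its score is strictly
    greater than the threshold.
    """
    if not human_scores or not ai_scores:
        raise ValueError("both score lists must be non-empty")

    humans = sorted(human_scores, reverse=True)
    # Number of human documents we can afford to flag by mistake.
    # The small epsilon guards against floating point round-off.
    allowed = int(target_fpr * len(humans) + 1e-9)
    # Flagging only scores above humans[allowed] flags at most `allowed` humans.
    threshold = humans[allowed] if allowed < len(humans) else float("-inf")

    detected = sum(1 for s in ai_scores if s > threshold)
    return detected / len(ai_scores), threshold


# Toy data (made up): 100 human documents, one of which scores very high.
human_scores = [i / 1000 for i in range(99)] + [0.9]
# Four AI documents; the last one looks very human to the detector.
ai_scores = [0.95, 0.8, 0.5, 0.05]

recall, threshold = recall_at_fpr(human_scores, ai_scores, target_fpr=0.01)
print(f"threshold={threshold:.3f} recall={recall:.2f}")
```

In this toy example exactly one of the 100 human documents may be flagged, so the threshold sits at the second-highest human score, and three of the four AI documents score above it.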
Harder datasets and prompts
Some AI-generated text is simple to spot while other text is created deliberately to bypass detectors. To deal with this, we expanded the scope of our training data, adding new text domains and prompts to our database.
In this release, we trained on:
- Complex, web-sourced, information-dense AI data like OpenAI’s deep research
- Human text with edits from common grammar correction apps
At our most recent offsite, our machine learning engineers Edwin and Nazar went even further by training generative models to find new prompts that bypass our AI detection model. They used reinforcement learning algorithms to identify which prompting techniques generate text that looks the most human-written to our detector. We generated and trained on new AI-written documents created with the language models listed above and these new prompts.
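The post does not say which reinforcement learning algorithm was used, so the following is only a toy sketch of the general idea: treat each prompting technique as an arm of a bandit, reward it by how human its output looks to the detector, and let an epsilon-greedy policy concentrate on the hardest technique. Everything here is hypothetical, including the technique names, the `STUB_AI_SCORE` table that stands in for "generate text, then score it with the detector", and the choice of epsilon-greedy itself.

```python
import random

# Hypothetical stand-in for "generate a document with this prompting
# technique, then run the detector on it". Values are made-up AI scores.
STUB_AI_SCORE = {"plain": 0.98, "persona": 0.90, "imitate_sample": 0.70}


def reward(technique):
    """Higher reward means the text looked more human to the detector."""
    return 1.0 - STUB_AI_SCORE[technique]


def epsilon_greedy(arms, reward_fn, rounds=200, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: returns (mean reward, pull count) per arm."""
    rng = random.Random(seed)
    counts = {arm: 0 for arm in arms}
    means = {arm: 0.0 for arm in arms}
    for t in range(rounds):
        if t < len(arms):
            arm = arms[t]                    # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.choice(arms)           # explore a random arm
        else:
            arm = max(arms, key=means.get)   # exploit the best arm so far
        r = reward_fn(arm)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # running mean
    return means, counts


means, counts = epsilon_greedy(list(STUB_AI_SCORE), reward)
hardest = max(means, key=means.get)
print("technique that best evades the stub detector:", hardest)
```

In a real setup the reward would be noisy and would come from an actual generator and detector; documents produced with the techniques the search favors would then be added to the training data, as described above.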

Our baseline performance on GPT5 models
With the release of GPT5, we wanted to see how well the GPTZero AI detector can generalize to a newer, advanced language model. We found that with these updates, our detector performs strongly even without training on GPT5 data, reaching 95% recall on our new benchmark. Our benchmarks on GPT5 models are shown in Table 2.
Please note, our detection model has NOT been trained to detect GPT5 text. We expect these results to only improve in the coming weeks.
Table 2: AI detector performance on GPT5 models with our latest AI detector not trained on any GPT5 data
With this update, you’ll see improved performance on all of the best LLMs available. Our AI detection capabilities closely follow AI model releases as the field of AI detection continues to grow. Our generalization to GPT5 models is a result of our continued efforts to build a robust model that supports you in academic settings as well as in everyday life, no matter what you use GPTZero for.