GPTZero Improves with Diverse Data, Surpasses Competitor AI Detectors

At GPTZero, our machine learning team has been working to improve the accuracy of our AI detection. We understand the importance of minimizing false positives in the classroom to create a comfortable learning environment for students. We have recently made considerable progress: we’ve reduced inaccuracies so that AI transparency between teachers and students is as seamless as possible.

Our Improvements

On August 22nd, we deployed a new deep learning model that improved our benchmark results. With this model, our accuracy convincingly surpasses that of all notable competitor AI detection services. Check out the figure below to see how we stack up against our competitors.

AI Detection Accuracy Comparisons Evaluated on Unseen Testing Data

How We Improved Our Results: Diversifying Datasets

How did we achieve these improvements? We fine-tuned a new deep learning model using our own diverse validation dataset, and we are publicly releasing the benchmark dataset so that independent third parties can evaluate it.

The dataset is more diverse than previous datasets, drawing on educational text, Q&A, newspaper articles, and social media posts to train the model. The figure below shows how often our AI model correctly predicted human-generated test data as human-generated and AI-generated test data as AI-generated.
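For readers who want to run this kind of evaluation on their own labeled samples, here is a minimal sketch of how the counts behind such a figure could be tallied. It is not GPTZero's evaluation code: the function names, the "human" and "ai" labels, and the sample data are all made up for illustration.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count (true_label, predicted_label) pairs for a two-class detector."""
    return Counter(zip(y_true, y_pred))

def summarize(y_true, y_pred):
    """Compute accuracy and error rates, treating "ai" as the positive class."""
    counts = confusion_counts(y_true, y_pred)
    tp = counts[("ai", "ai")]        # AI text correctly flagged as AI
    tn = counts[("human", "human")]  # human text correctly left alone
    fp = counts[("human", "ai")]     # human text wrongly flagged as AI
    fn = counts[("ai", "human")]     # AI text that was missed
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "true_positive_rate": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical labels, for illustration only (not real benchmark data).
y_true = ["human", "human", "human", "human", "ai", "ai", "ai", "ai"]
y_pred = ["human", "human", "human", "ai", "ai", "ai", "ai", "human"]
result = summarize(y_true, y_pred)
print(result)
```

The false positive rate is reported separately from overall accuracy because a human-written piece flagged as AI is the costliest error in a classroom setting.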

Our likelihood predictions are also more “confident” than before: when we predict that a piece is AI-generated, the predicted likelihood now tends to fall in a higher percentage range. This lets you be more confident if you choose to have a conversation with a student about AI misuse.
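One way to act on a likelihood score is to follow up only when it clears a cutoff you have chosen. The sketch below is an illustration only: the function name and the 0.9 threshold are assumptions, not GPTZero recommendations or part of any GPTZero API.

```python
def should_review(ai_probability, threshold=0.9):
    """Return True when the predicted AI likelihood is at or above the threshold.

    Both the function and the default threshold are hypothetical.
    """
    if not 0.0 <= ai_probability <= 1.0:
        raise ValueError("ai_probability must be between 0 and 1")
    return ai_probability >= threshold

# Hypothetical scores, for illustration only.
print(should_review(0.95))
print(should_review(0.60))
```

A higher threshold reduces the chance of raising the topic with a student whose work was human-written, at the cost of missing some AI-generated text.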