Deep Learning Model Updates
ML Updates:
In June, we developed a model that performed significantly better than any of our previously released models when evaluated on GPT4 and human data. The updated model combined our previous architecture and statistical discriminator approaches with a traditional deep learning approach, which was critical in reducing false negatives for GPT4. After validating the performance of this model on challenging real-world examples, we deployed it to production.
The new model was evaluated both on general data and on a GPT4-specific dataset we built. Between May and June, we also added educational, K12, college, and other writing data to make our evaluation dataset more robust and more difficult for detection.
As a general principle, our model evolves rapidly and is updated bi-weekly. Today, our model combines a mixture of standard and novel detection techniques to produce a more robust and accurate detector.
The most important development for GPTZero in June was incorporating a ‘deep learning’ approach. Here we build an end-to-end ML pipeline, trained on massive text corpora from the web, education datasets from our partners, and synthetic AI datasets generated from a range of language models, including most recently Llama (developed by Meta) and GPT4. Our deep learning approach is a long-term investment that differentiates us from most other AI detectors: it builds a more robust model that can improve as AI models improve.
Figure 1: AI detection accuracies of GPTZero's updated '2023-06-30' model, evaluated on GPT4 data in June 2023
The graphs above display the prediction accuracies of the upgraded GPTZero detection model, the "2023-06-30" model, evaluated on GPT4-generated data and paired human data. The upgraded model achieves a 0.95 AUC, compared to the previous 0.85 AUC, on a challenging GPT4 dataset, a task more difficult than previous GPT2, Llama, and GPT3 datasets. It was able to detect more than 60% of GPT4 text in the wild while maintaining 99% accuracy in classifying human data as human. Maintaining our low false positive rate is a key differentiator from competitors that are more likely to label a text as AI in general.
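For readers who want to see how the two kinds of numbers above relate, here is a minimal sketch in plain Python. It is not GPTZero's evaluation code: the function names, the toy labels, and the toy scores are all made up for illustration. It computes AUC as the chance that a randomly chosen AI text gets a higher score than a randomly chosen human text, and it computes the share of AI text caught when the false positive rate is capped (99% accuracy on human text corresponds to a cap of 1%).

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC: probability that a random AI text outscores a random human text.

    labels use 1 for AI-written and 0 for human-written (an assumed convention).
    Ties between an AI score and a human score count as half a win.
    """
    ai = [s for y, s in zip(labels, scores) if y == 1]
    human = [s for y, s in zip(labels, scores) if y == 0]
    if not ai or not human:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for a in ai:
        for h in human:
            if a > h:
                wins += 1.0
            elif a == h:
                wins += 0.5
    return wins / (len(ai) * len(human))


def tpr_at_max_fpr(labels: Sequence[int], scores: Sequence[float],
                   max_fpr: float = 0.01) -> float:
    """Best share of AI text flagged, over thresholds with FPR <= max_fpr."""
    ai = [s for y, s in zip(labels, scores) if y == 1]
    human = [s for y, s in zip(labels, scores) if y == 0]
    if not ai or not human:
        raise ValueError("need at least one example of each class")
    best = 0.0  # flagging nothing always satisfies the cap
    for threshold in sorted(set(scores)):
        # A text is flagged as AI when its score is at or above the threshold.
        fpr = sum(h >= threshold for h in human) / len(human)
        if fpr <= max_fpr:
            tpr = sum(a >= threshold for a in ai) / len(ai)
            best = max(best, tpr)
    return best


# Toy data, invented for this example: five human texts, then five AI texts.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.60, 0.25, 0.55, 0.70, 0.80, 0.90]

print(f"AUC: {roc_auc(labels, scores):.2f}")                        # AUC: 0.88
print(f"TPR at <=1% FPR: {tpr_at_max_fpr(labels, scores):.2f}")     # TPR at <=1% FPR: 0.60
```

In this toy example, one human text scores 0.60, so the threshold has to sit above it to avoid any false positives, and only three of the five AI texts clear that bar. This is why a detector can have a high AUC and still catch a more modest share of AI text once the false positive rate is held very low.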