GPTZero

What is the best AI detector for multi-language detection?

We’re excited to share the release of our new multilingual model, expanding AI detection coverage to nine of the world’s most prevalent languages.

Edwin Thomas

Oct 31, 2025 · 4 min read


As AI-generated content spreads across the world's languages, detection must evolve just as quickly. At GPTZero, we treat internationalization as more than just language coverage: our commitment is to ensure our AI detection meets the same high standards in every language as it does in English.

Today, we’re proud to announce the release of our new multilingual model, expanding AI detection coverage to nine of the world’s most prevalent languages and setting the stage for state-of-the-art detection in low-resource languages.

Our latest model achieves industry-leading performance in identifying multilingual AI-synthesized text, with near-perfect recall of ~99% and a false positive rate on human writing below 0.5%.

What’s new in our latest model?

This is the most accurate multilingual AI detector we’ve ever built. We have upgraded to a larger model architecture that identifies language-specific linguistic and stylistic nuances more effectively in both human and AI writing. This allows us to achieve exceptionally low false positive rates of <0.5% while improving our AI recall to ~99%.

We’ve expanded our training to add support for Arabic, Korean, Japanese, Chinese and Italian, while improving our existing support for French, Spanish, German and Portuguese. The data was responsibly mined from diverse text sources across the web, with improved coverage of more than five of the most widely used LLM families. By employing language-aware model pipelines, our detector adapts to different language structures, accounting for the fact that the concepts of words, spacing and sentence boundaries vary across languages and language families.
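To illustrate why words, spacing and sentence boundaries need language-aware handling, here is a minimal sketch in Python. It is not GPTZero's pipeline: the function names, the punctuation list and the script heuristic are all assumptions made for this example, and a production system would use proper per-language segmenters.

```python
import re
import unicodedata

# Sentence-ending punctuation for a few scripts (illustrative, not exhaustive).
SENTENCE_END = "。！？.!?؟"
_END_CLASS = re.escape(SENTENCE_END)
_SENTENCE_RE = re.compile(rf"[^{_END_CLASS}]+[{_END_CLASS}]*")


def uses_spaces(text: str) -> bool:
    """Heuristic (assumption): text containing CJK ideographs or kana
    is treated as a script that does not separate words with spaces."""
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith(("CJK UNIFIED", "HIRAGANA", "KATAKANA")):
            return False
    return True


def split_sentences(text: str) -> list[str]:
    """Split on sentence terminators, keeping each terminator attached."""
    return [s.strip() for s in _SENTENCE_RE.findall(text) if s.strip()]


def split_tokens(sentence: str) -> list[str]:
    """Whitespace tokens for spaced scripts, single characters otherwise."""
    if uses_spaces(sentence):
        return sentence.split()
    return [ch for ch in sentence if not ch.isspace()]


if __name__ == "__main__":
    for sample in ["你好。今天天气很好！", "Hola. ¿Qué tal?"]:
        for sentence in split_sentences(sample):
            print(sentence, split_tokens(sentence))
```

The same splitter that works for Spanish would return one giant "word" per Chinese sentence, which is why the sketch switches strategy by script. Known limits of this toy version: it would also split on the period in a decimal number, and character-level splitting is only a rough stand-in for real Chinese or Japanese word segmentation.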

Below is a snapshot of our performance across 20+ non-English languages. GPTZero achieves state-of-the-art detection accuracy across European, Semitic, and Asian languages:

Language	False Positive Rate (%)	AI Recall (%)
Arabic	0.08	99.98
Bahasa	0.3	99.78
Bulgarian	0.08	97.97
Catalan	0.01	99.60
Chinese	0.6	100
Croatian	0.2	99.41
Czech	0.1	96.4
Dutch	1	99.3
French	0.1	99.73
German	0.3	98.29
Greek	0.4	97.9
Korean	0.1	99.79
Polish	0.1	98.56
Portuguese	0	98.89
Romanian	0	98.8
Russian	0.1	98.8
Slovak	0.08	96.79
Slovenian	0.1	98.46
Spanish	0	98.88
Ukrainian	0	96
Vietnamese	0.1	99.88

Table 1: FPR and AI recall of GPTZero Detector split by language.
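For readers who want to reproduce this kind of per-language breakdown on their own data, the two metrics can be computed as sketched below. False positive rate is the share of human-written texts flagged as AI, and AI recall is the share of AI-generated texts correctly flagged. The function names and the record layout are assumptions for illustration, not GPTZero's evaluation code.

```python
from collections import defaultdict


def fpr_and_recall(labels, preds):
    """Return (FPR %, AI recall %) for parallel sequences of booleans,
    where True means "AI-generated"."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    # A metric is undefined (NaN) when its class has no samples.
    fpr = 100.0 * fp / (fp + tn) if fp + tn else float("nan")
    recall = 100.0 * tp / (tp + fn) if tp + fn else float("nan")
    return fpr, recall


def metrics_by_language(records):
    """records: iterable of (language, is_ai, predicted_ai) tuples.
    Returns {language: (FPR %, AI recall %)}."""
    grouped = defaultdict(lambda: ([], []))
    for language, is_ai, predicted_ai in records:
        grouped[language][0].append(is_ai)
        grouped[language][1].append(predicted_ai)
    return {lang: fpr_and_recall(y, p) for lang, (y, p) in grouped.items()}


if __name__ == "__main__":
    # Toy records, invented for this example only.
    toy = [
        ("French", True, True),
        ("French", False, False),
        ("French", False, True),
        ("Korean", True, False),
        ("Korean", False, False),
    ]
    for lang, (fpr, recall) in metrics_by_language(toy).items():
        print(f"{lang}: FPR={fpr:.2f}%  recall={recall:.2f}%")
```

Computing the metrics per language rather than over the pooled set matters because a strong aggregate number can hide a weak result in a language with few samples.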

How do we compare with our competitors?

We evaluated our AI detector on a stratified multilingual benchmarking dataset, sub-sampled from our larger dataset and covering more than 20 languages (Table 2). Of the detectors tested, GPTZero is the only one that pairs 99% accuracy with a 0.00% false positive rate. While other detectors trade off AI accuracy to avoid misclassifications, we achieve the strongest balance between AI recall and false positive rate.

Detector	False Positive Rate (%)	True Positive Rate (%)	Accuracy (%)
Copyleaks	2.11	60.78	78.3
Originality	8.19	97.96	91.5
Pangram	0.00	93.79	96.95
GPTZero	0.00	98.57	99

Table 2: GPTZero benchmarked against major competitors Copyleaks, Originality and Pangram on our multilingual benchmarking test set.
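The three columns in Table 2 are linked: true positive rate is the same quantity as AI recall, and overall accuracy is a weighted mix of the true positive rate and the true negative rate (100 minus the false positive rate). The sketch below shows that relationship. The share of AI-generated samples in the benchmark is not stated in this post, so `ai_share` is an assumed input, and the numbers in the example are made up.

```python
def accuracy_pct(fpr_pct: float, tpr_pct: float, ai_share: float) -> float:
    """Overall accuracy (%) from FPR (%), TPR (%) and the fraction of
    samples that are AI-generated (ai_share, between 0 and 1).

    accuracy = ai_share * TPR + (1 - ai_share) * (100 - FPR)
    """
    if not 0.0 <= ai_share <= 1.0:
        raise ValueError("ai_share must be between 0 and 1")
    return ai_share * tpr_pct + (1.0 - ai_share) * (100.0 - fpr_pct)


if __name__ == "__main__":
    # Hypothetical detector: 1% FPR, 90% TPR, at two different class mixes.
    for share in (0.5, 0.2):
        print(f"ai_share={share}: {accuracy_pct(1.0, 90.0, share):.2f}%")
```

Because accuracy shifts with the class mix, comparing detectors on false positive rate and true positive rate side by side, as the table does, is more informative than comparing accuracy alone.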

Breaking down the results by language (Table 3), we see that GPTZero ranks first in almost all languages, with the exception of Croatian and Czech, where we came in a close second. We will be refining our model on these and other low-resource languages in upcoming releases.

Table 3: Benchmark results of GPTZero with competitors split by language

We also carried out a qualitative review of representative samples across languages, capturing classification differences directly from each detector’s dashboard. The examples below (Fig. 1-3) show typical misclassification errors observed in competing models (false positives for Originality, false negatives for Pangram and Copyleaks) and how GPTZero classified the same texts correctly.

Fig 1. An example Chinese human-written text misclassified as AI by Originality.ai (top) but correctly identified by GPTZero (bottom).

Fig 2. An example Arabic AI-generated text misclassified as human-written by Pangram (top) but correctly identified by GPTZero (bottom).

Fig 3. An example Spanish AI-generated text misclassified as human-written by Copyleaks (top) but correctly identified by GPTZero (bottom).

Together, these results reinforce GPTZero’s position as the most reliable multilingual detector on the market today and lay the groundwork for what's ahead.

What’s next?

We believe trust in content should not depend on what language you speak. To this end, our goal remains the same: making AI detection universal, reliable and fair. We look forward to further improving our AI detection capabilities for low-resource languages while deepening our understanding of linguistic diversity.
