AI text detectors are tools meant to identify content written by artificial intelligence. Even as AI writing gets better and some prompt wizards generate output practically indistinguishable from human writing, the best AI content detectors still promise to help.
These tools are definitely on the rise. Some researchers estimated the AI content detector market at $25.13 billion in 2023 and expect it to reach $255.74 billion by 2032. These days everyone is expected to be more and more productive, and the use of AI tools is increasing to match those expectations. Then, suddenly, various entities require people who just got comfortable with this new tech to stop using AI altogether. The reasons vary: a growing pool of factual mistakes, questions about AI ethics, or just a belief that 100% of content generated with the help of AI is bad. So it’s not surprising to see these numbers.
Most AI content detectors have a simple interface: you paste your text, click a button, and get a verdict of AI or not AI. Some provide detailed breakdowns of which sections of the text influenced the verdict the most. Some tools also offer API access and support multiple languages.
Each of the AI text detector tools has its own way of identifying AI. It’s not some kind of quiz asking if you are a robot, but rather algorithms and rules analyzing writing patterns, grammar, word choice, and the way the sentences are built. The combined results allow the tools to make an educated guess about whether or not the text was written by a machine. But let’s get to the nitty-gritty: the basics of AI detection algorithms.
In a nutshell, AI detectors are looking for specific patterns and characteristics in a text and those often can help differentiate AI content from human-written text. Every tool has its own unique proprietary blend of technologies to look at, but there are some common concepts used for AI detection.
One of them is perplexity, which measures how unpredictable a text is. AI-generated content usually has lower perplexity, meaning that it’s easier to predict which word comes next. Human writing tends to be more perplexing: it includes more curious word choices and the occasional typo (sure, typos are not something to be proud of under normal circumstances, but in the age of AI, surprisingly, they are).
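To make perplexity a bit more concrete, here is a minimal sketch. It is a toy, not how any of the tools below work: real detectors would score text with a large language model, while this one uses a simple word-frequency (unigram) model with add-one smoothing. The function name and the sample sentences are made up for illustration.

```python
import math
from collections import Counter


def unigram_perplexity(text: str, reference: str) -> float:
    """Perplexity of `text` under a unigram model fitted on `reference`.

    Toy assumption: word frequencies in `reference` stand in for a real
    language model. Add-one smoothing keeps unseen words from getting
    zero probability.
    """
    ref_tokens = reference.lower().split()
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one word")

    counts = Counter(ref_tokens)
    vocab_size = len(set(ref_tokens) | set(tokens))
    total = len(ref_tokens)

    # Sum of log-probabilities of every word in the text.
    log_prob = 0.0
    for tok in tokens:
        log_prob += math.log((counts[tok] + 1) / (total + vocab_size))

    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob / len(tokens))


reference = "the cat sat on the mat and the dog sat on the rug"
predictable = "the cat sat on the mat"
surprising = "quantum ferrets juggle neon marmalade"

print(unigram_perplexity(predictable, reference))
print(unigram_perplexity(surprising, reference))
```

The predictable sentence reuses words the model has already seen, so it scores lower than the surprising one. That gap is the signal a detector looks for, just measured with a much stronger model.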
Another concept these tools use is burstiness, and it’s related to variation in sentence length and structure. AI text usually has lower burstiness, with all sentences being of similar length. Human writing usually shows higher burstiness and sometimes comes with excessive love for passive voice and weirdly long sentences.
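Burstiness can be sketched in a similar way. One simple way to put a number on it, and this is my assumption rather than a formula published by any of the tools, is the coefficient of variation of sentence lengths: the standard deviation divided by the mean. The sentence splitting here is deliberately naive.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split on ., ! and ? (a naive assumption) and count words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths; 0.0 means perfectly uniform."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = "Stop. The dog ran off across the wide, muddy field before anyone noticed. Why?"

print(burstiness(uniform))
print(burstiness(varied))
```

Three four-word sentences give a burstiness of zero, while the mix of one-word and twelve-word sentences scores well above it.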
Based on analysis of these and other factors, most AI detection tools provide a probability score for whether the text is human-written or machine-generated. But as mentioned before, all AI detectors are different, and as AI writing gets better (and users learn to use it more efficiently), detector developers have to keep adjusting to keep up. Ultimately, it means that no tool is 100% foolproof at detecting AI writing.
But is there a tool that is clearly superior to others? Or are they all pretty much equal? I decided to put it to the test. Let’s dive into the findings and see how the AI detection tools performed against each other in real-world scenarios.
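How might a couple of measurements turn into a single probability? A common general-purpose approach is a logistic function over weighted features. The sketch below is purely hypothetical: the weights and bias are numbers I made up so the example behaves sensibly, and no tool in this report is known to use them.

```python
import math

# Hypothetical weights, chosen only for illustration (not from any real tool).
# Negative weights mean: the higher the perplexity or burstiness,
# the less likely the text is to be flagged as AI.
WEIGHT_PERPLEXITY = -0.08
WEIGHT_BURSTINESS = -3.0
BIAS = 4.0


def ai_probability(perplexity: float, burstiness: float) -> float:
    """Squash a weighted sum of the two features into the 0..1 range."""
    z = BIAS + WEIGHT_PERPLEXITY * perplexity + WEIGHT_BURSTINESS * burstiness
    return 1.0 / (1.0 + math.exp(-z))


# Low perplexity and uniform sentences: leans towards "AI".
print(ai_probability(perplexity=10, burstiness=0.1))
# High perplexity and varied sentences: leans towards "human".
print(ai_probability(perplexity=80, burstiness=1.0))
```

In a real detector the weights would be learned from large sets of labeled human and AI texts, and there would be far more than two features, which is part of why different tools disagree on the same sample.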
The tool selection process included cross-referencing multiple “best AI detector” rankings on Google. I chose tools that appeared more often across these lists. I also included a few tools that were not on those lists but showed up at the top of Google search results for the query “best AI tool.”
The main focus of this comparison is accuracy, but pricing and functionality were also considered. All tools were tested using their free versions. (Some tools were not generous enough with their free plans, so I had to be sneaky, creating new accounts and clearing cache and cookies.)
To really put these detectors to the test, I fed them different types of AI-generated, human-edited, and human texts. I started with simple ChatGPT and Claude outputs, then progressed to more complex prompts including rewriting texts to bypass detection, outputs mimicking well-known human content creators, and prompt training on distinctive human writing styles. I also tested human-edited AI outputs (it was a substantial amount of effort, about 50% of the initial text was edited) and fully human-written content.
You can see all the text samples in the slideshow below:
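For those who came just for the results, here’s a detailed breakdown of the tools’ performance. Each percentage in the table is the human score the tool assigned to the sample: 0% means the tool judged the text to be fully AI-generated, and 100% means it judged the text to be fully human-written.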
Task (model) | Winston AI | Originality.AI | GPTZero | ZeroGPT | Smodin | Hive | QuillBot |
Basic prompt (ChatGPT) | 0% | 0% | 0% | 100% | 0% | 0.01% | 0% |
Basic prompt (Claude) | 0% | 0% | 0% | 19% | 0% | 0.01% | 0% |
Rewrite to bypass AI detection (ChatGPT) | 0% | 1% | 0% | 70% | 0% | 100% | 0% |
Rewrite to bypass AI detection (Claude) | 0% | 100% | 0% | 53% | 0% | 0.01% | 0% |
Rewrite in the style of a well-known human (ChatGPT) | 0% | 36% | 0% | 100% | 0% | 100% | 0% |
Rewrite in the style of a well-known human (Claude) | 0% | 95% | 24% | 84% | 24% | 99.3% | 0% |
Write based on samples of very distinct writing of an actual human (ChatGPT) | 0% | 0% | 0% | 0% | 0% | 100% | 0% |
Write based on samples of very distinct writing of an actual human (Claude) | 0% | 100% | 12% | 84% | 0% | 100% | 0% |
Output written to bypass AI detection with human edits (ChatGPT) | 6% | 97% | 5% | 85% | 100% | 100% | 37% |
Output written to bypass AI detection with human edits (Claude) | 4% | 100% | 81% | 100% | 100% | 89.9% | 100% |
Human text | 100% | 100% | 96% | 100% | 100% | 100% | 100% |
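If you want a rough way to tally the table, the sketch below counts how many clear-cut samples each tool called correctly. Two things here are my assumptions, not part of the original test: a 50% cutoff (a human score below 50% counts as an “AI” verdict), and leaving out the two human-edited rows, since a half-rewritten text has no obvious right answer.

```python
TOOLS = ["Winston AI", "Originality.AI", "GPTZero", "ZeroGPT",
         "Smodin", "Hive", "QuillBot"]

# Human scores (%) copied from the table: the eight purely AI-generated samples.
PURE_AI_ROWS = [
    [0, 0, 0, 100, 0, 0.01, 0],    # basic prompt, ChatGPT
    [0, 0, 0, 19, 0, 0.01, 0],     # basic prompt, Claude
    [0, 1, 0, 70, 0, 100, 0],      # rewrite to bypass detection, ChatGPT
    [0, 100, 0, 53, 0, 0.01, 0],   # rewrite to bypass detection, Claude
    [0, 36, 0, 100, 0, 100, 0],    # style of a well-known human, ChatGPT
    [0, 95, 24, 84, 24, 99.3, 0],  # style of a well-known human, Claude
    [0, 0, 0, 0, 0, 100, 0],       # based on distinct human samples, ChatGPT
    [0, 100, 12, 84, 0, 100, 0],   # based on distinct human samples, Claude
]
# Human scores (%) for the fully human-written sample.
HUMAN_ROW = [100, 100, 96, 100, 100, 100, 100]

THRESHOLD = 50.0  # assumption: below this, the verdict counts as "AI"


def correct_calls(threshold: float = THRESHOLD) -> dict[str, int]:
    """Count correct verdicts per tool across the 9 unambiguous samples."""
    results = {}
    for i, tool in enumerate(TOOLS):
        ai_hits = sum(1 for row in PURE_AI_ROWS if row[i] < threshold)
        human_hit = 1 if HUMAN_ROW[i] >= threshold else 0
        results[tool] = ai_hits + human_hit
    return results


for tool, hits in correct_calls().items():
    print(f"{tool}: {hits}/9")
```

A tally like this ignores how confident each tool was and how it handled the human-edited samples, so treat it as a quick sanity check rather than a ranking.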
And if you want to learn about each tool in greater detail, please read the full report below.
Winston AI is a full-scope AI content detection tool that offers lots of amazing features: AI text and image detection, plagiarism checker, and even handwriting analysis. It stands out for its user-friendly interface and generous free tier. And, as it turned out, pretty good accuracy.
How AI detection works in this tool:
Winston AI uses a combination of data training, linguistic analysis, and algorithms for pattern recognition. It looks at things like perplexity and burstiness to determine if content is likely to be produced by AI. It gives a “Human Score” showing the probability of human authorship and features an AI prediction map indicating sections of the texts that were potentially generated by AI.
Pricing:
Winston AI offers a free tier with 2,000 credits (1 credit per word) and a 14-day trial of premium features. Paid plans applicable for teams start at $29/month (or $19/month if paid annually). This tier includes 200,000 credits/month and features like more sophisticated plagiarism detection and shareable PDF reports.
Accuracy test results:
Winston AI performed well in identifying both AI and human-written content. However, it showed some inconsistencies:
Overall, Winston AI seems to be a great tool with decent accuracy, but users should be aware of potential inconsistencies in reporting, especially when dealing with human-edited content.
Originality.AI is another AI content detection tool for serious publishers that claims to hit a 99% accuracy rate. It offers AI detection, plagiarism checking, and readability analysis catering to teams that manage large volumes of content.
How AI detection works in this tool:
Originality.AI doesn’t exactly explain how its detection works, but it claims to identify texts generated by ChatGPT, GPT-4, Claude, Llama, and Gemini. The company also mentions that one of the reasons why its tool outperforms other AI detection tools is that the “AI algorithms at Originality.AI use natural language processing techniques that require a lot more compute power.” That is presumably also the reason why the “free” tier is so stingy.
Pricing:
The tool is not very friendly to free users: upon registration, it offers only 3 free checks through the web app, and the detailed results are blurred. The cheapest team-suitable plan starts at $19.95/month (or $14.95/month if paid annually) and provides 1,000 credits per month (1 credit = 100 words scanned).
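Because the tools count credits differently, it helps to convert plans to a common unit. The sketch below works out the cost per 1,000 words for the Originality.AI plan above and the Winston AI plan mentioned earlier, using the monthly-billing prices from this report. It assumes you use the entire monthly allowance, which is the best case; the helper name is my own.

```python
def cost_per_1000_words(monthly_price: float, credits: int,
                        words_per_credit: float) -> float:
    """Price of scanning 1,000 words if the whole monthly allowance is used."""
    words_included = credits * words_per_credit
    return monthly_price / words_included * 1000


# Originality.AI: $19.95/month, 1,000 credits, 1 credit = 100 words.
originality = cost_per_1000_words(19.95, 1_000, 100)
# Winston AI: $29/month, 200,000 credits, 1 credit = 1 word.
winston = cost_per_1000_words(29.00, 200_000, 1)

print(f"Originality.AI: ${originality:.4f} per 1,000 words")
print(f"Winston AI:     ${winston:.4f} per 1,000 words")
```

On these numbers, Originality.AI comes to roughly $0.20 per 1,000 words and Winston AI to about $0.15. If you scan less than the full allowance, the effective price per word goes up for both.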
Accuracy test results:
Originality.AI’s performance was mixed in our tests:
While Originality.AI offers tons of bells and whistles for content management and claims high accuracy, the tests revealed some serious flaws in its AI detection capabilities.
GPTZero is a leading AI content detection tool that stands out for its generous free plan and comprehensive features. It’s designed to work for both individual users and teams, offering a range of AI detection capabilities including pretty robust analytics.
How AI detection works in this tool:
According to the tool’s documentation, GPTZero uses two key components in its detection process: perplexity measurement and burstiness analysis (and we talked about both in the intro part of this report). The tool compares these metrics against benchmarks for human and AI-written text to make its determination. Very straightforward.
Pricing:
GPTZero offers a substantial free plan with up to 10,000 words per month and basic team collaboration features. Paid plans start at $10/month (billed annually) or $15/month, offering increased word limits and advanced features like plagiarism scanning. Team plans with multiple seats start at $24/month per seat.
Accuracy test results:
GPTZero demonstrated strong performance in our tests:
Overall, GPTZero proves to be the best in accuracy, though it may err on the side of caution when assessing human writing. Its accuracy and generous free tier make it a strong player in the AI detection tool market.
ZeroGPT is one of the OGs of AI detection tools, and most of us can be more than happy with its free tier. It also offers a range of features beyond just AI detection, including summarization, paraphrasing, and translation. It stands out for its free version and a pay-as-you-go API option for businesses.
How AI detection works in this tool:
ZeroGPT uses what they call DeepAnalyse™ Technology, a multi-stage methodology that analyzes text from macro to micro levels. The tool implements deep learning based on extensive text collections including internet content, educational datasets, and proprietary synthetic AI datasets.
Pricing:
ZeroGPT offers a generous free plan with 15,000 characters per AI detection. The Pro version, priced at $7.99/month (billed annually) or $9.99/month (billed monthly), increases this limit to 100,000 characters. For businesses, there’s a pay-as-you-go API option starting at $0.034 per 1,000 words.
Accuracy test results:
Despite its sophisticated technology claims, ZeroGPT showed inconsistent performance in our tests:
Out of all the services on our best AI content detectors list, ZeroGPT has the most substantial free tier. However, our testing showed that its AI detection can be questionable.
Smodin is another popular AI content detection tool that comes with additional features like grammar checking and text improvement suggestions. It has a pretty goofy and outdated interface that should still work for a wide range of users, from students and educators to businesses and people who write for a living.
How AI detection works in this tool:
According to Smodin’s documentation, the tool uses advanced algorithms to analyze text for AI interventions. The tool examines factors such as language consistency, writing complexity, and factual errors to decide if any AI was involved in content creation. It claims to detect text generated by various AI models including ChatGPT, Gemini, and Azure.
Pricing:
Smodin offers a free plan with limited features (5,000 characters per text, 5 uses per week). The Essentials plan, priced at $12/month (annual billing) or $15/month (monthly billing), provides more advanced features like API access, higher character limits (15,000 per text), and unlimited daily uses.
Accuracy test results:
Smodin showed mixed results in our accuracy tests:
While Smodin offers promising features and claims high accuracy rates (91% for AI documents, 99% for human documents), according to the test results, it may struggle with nuanced cases involving partial AI involvement. Also, the fact that the free tier is so limited can scare off potential users.
Hive is an AI content detection tool that offers both text and image analysis capabilities. It claims to use advanced machine learning algorithms for accurate detection and classification of AI-generated content.
How AI detection works in this tool:
Hive uses a multi-step process for content analysis. For text, it uses feature extraction and pre-trained models to identify AI-generated content. The tool claims to perform both binary classification (AI vs. human) and source classification (identifying specific AI generators), providing confidence scores for each classification.
Pricing:
Hive’s pricing structure lacks transparency. No pricing data was publicly available at the time of this research, so you have to contact the sales department to learn about your potential costs. The free tier only provides 2 text analyses, each with a minimum of 750 characters and a maximum of 8,192 characters.
Accuracy test results:
Our tests found serious flaws in Hive’s performance:
While Hive’s website looks very sleek and presents what’s going on under the tool’s hood as nothing less than sophisticated, the test results suggest its actual performance falls short of what you would expect. Poor accuracy, a limited free tier, and unclear pricing make it difficult to recommend this tool.
QuillBot AI Detector is a user-friendly tool designed to identify AI-generated content including text refined by paraphrasing tools or grammar checkers. It stands out for its VERY generous free plan and detailed analysis capabilities and also offers writing tools.
How AI detection works in this tool:
QuillBot’s AI detector uses advanced algorithms to analyze text for indicators of AI generation, such as repeated words, awkward phrasing, and unnatural flow. The tool provides a detailed report highlighting specific sections of text that appear to be AI-generated.
Pricing:
QuillBot offers a generous free plan, allowing users to scan unlimited texts of up to 1,200 words each at no cost. Premium is available, though it doesn’t seem to increase the word limit for AI detection. Team plans start at $3.75 a month per seat for 5-10 seats (billed annually).
Accuracy test results:
QuillBot AI Detector demonstrated impressive performance in our tests:
I love QuillBot AI Detector for its generous free tier and very decent accuracy. The unlimited free scans make it an excellent option for those looking for frequent AI content checks without breaking the bank.
While the world of AI detection tools is quite mysterious and complex, I hope that readers of the Selzy blog can benefit from the findings in this report. To wrap it up, GPTZero and QuillBot offer the best value with regard to accuracy and price. If you’re experimenting with AI writing assistants, use these tools to spot-check your email content and make sure your messages still sound authentic.
AI in email marketing is popular and widespread. Still, the most effective emails probably come from a deep understanding of your audience, and that is likely a skill only accessible to humans. (Although it’s another data point we should test in one of the future articles.)
All in all, no detector is perfect, and even the best tools throw false positives every once in a while, so don’t let AI paranoia stand in the way of your creativity. Always focus on value-driven content that is helpful to your audience, AI or not.