OpenAI Launches Its AI Content Detector — The End of AI Copy/Paste Plagiarism Practices?

Hamza
Marketer’s Guide to the AI Galaxy
5 min read · Jun 15, 2023

Ever since the launch of ChatGPT, there has been growing concern over AI-plagiarized content, especially in academia. In fact, ChatGPT made headlines for being able to pass the US Medical Licensing Exam with roughly 60% accuracy.

To catch AI-generated content, many services have launched AI detectors over the past few months. OpenAI joined the party late with its own “AI Text Classifier” for detecting AI-written text.

As a professional content writer, I am curious how effective OpenAI’s detector really is, especially since OpenAI is the company behind ChatGPT itself. In this blog, we will test the OpenAI AI Classifier in different scenarios and see how useful it is compared to other options.

What OpenAI Says About Its AI Classifier

Before we get to the experiments, it is important to understand the features and limitations of the OpenAI AI Classifier.

According to OpenAI, the classifier is trained to distinguish human-written text from text produced by various AI tools, not just ChatGPT. The strange thing is that OpenAI itself admits the classifier is not fully reliable, and it backs that up with its own evaluation: the classifier correctly identified only 26% of AI-written text (true positives), while incorrectly labeling human-written text as AI-written 9% of the time (false positives).
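To put those two numbers in context, they are a true-positive rate and a false-positive rate, not an overall accuracy score. Here is a minimal sketch, using made-up sample counts (not OpenAI’s actual test data), of how such rates are calculated:

```python
# Hypothetical evaluation counts, NOT OpenAI's actual test data.
ai_samples = 1000      # texts known to be AI-written
human_samples = 1000   # texts known to be human-written

flagged_ai = 260       # AI-written texts the classifier flagged as AI
flagged_human = 90     # human-written texts wrongly flagged as AI

true_positive_rate = flagged_ai / ai_samples          # "detects 26% of AI text"
false_positive_rate = flagged_human / human_samples   # "mislabels humans 9% of the time"

print(f"True-positive rate:  {true_positive_rate:.0%}")   # 26%
print(f"False-positive rate: {false_positive_rate:.0%}")  # 9%
```

In other words, by OpenAI’s own numbers, the tool misses roughly three out of four AI-written texts.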

At first glance, OpenAI is not setting high expectations for its own tool. So let’s test the classifier ourselves and see whether OpenAI’s claims hold up, or whether the reality is even worse.

Experiment #1: Testing the OpenAI AI Classifier with Fully ChatGPT-Written Content

I started the experiment by testing the classifier with a short blog post generated entirely by ChatGPT. I asked ChatGPT to write a short post on “5 Best Marketing Strategies for Business in 2023”.

ChatGPT Prompt #1
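For transparency, here is a minimal sketch of how the same step could be reproduced with the official openai Python package instead of the ChatGPT web interface. The model name is my assumption; for the actual experiment I used the regular ChatGPT web app.

```python
# Minimal sketch: generate the test blog post via the OpenAI API.
# Requires `pip install openai` and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # assumed model; the web ChatGPT I used may differ
    messages=[
        {
            "role": "user",
            "content": "Write a short blog post on '5 Best Marketing Strategies "
                       "for Business in 2023'.",
        }
    ],
)

blog_post = response.choices[0].message.content
print(blog_post)  # this is the kind of text you would paste into the detector
```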

Afterward, I copied and pasted the whole text into the OpenAI AI Classifier. Once I clicked the “Submit” button, the AI Classifier responded, “The classifier considers the text to be unclear if it is AI-generated”.

OpenAI AI Classifier Prompt #1

So, that’s a complete failure. The classifier should have labeled the text as “likely”, or at least “possibly”, AI-generated.

Experiment #2: Testing the OpenAI AI Classifier with Fully Human-Written Content

In the second test, I fed the classifier fully human-written content. I recently wrote a 1,300-word blog post titled “A Comprehensive Guide on Ransomware”, which had zero input from any AI tool.

When I ran this content through the OpenAI AI Classifier, it responded, “The classifier considers the text to be unlikely AI-generated”.

OpenAI AI Classifier Prompt #2

Well, that’s an accurate result. But it made me wonder whether the classifier could do the same for an article with a lower word count. So I asked it to scan my 500-word blog post on “Mad Sad Glad Retrospective Exercise — Explained Easy”, and it responded, “The classifier considers the text to be unlikely AI-generated”.

OpenAI AI Classifier Prompt #3

Surprisingly, it gave the correct response once again. So, at least in these tests, the classifier handled human-written text correctly regardless of word count.

Experiment #3: Testing the OpenAI AI Classifier with a Mix of AI-Written and Human-Written Content

Next, I tested how the OpenAI AI Classifier responds to content that mixes AI-written and human-written text. I wrote a blog post on “How to Protect Your Business from High Inflation”, using approximately 50% AI-written and 50% human-written content.

When I ran the blog post through the AI Classifier, it responded, “The classifier considers the text to be very unlikely AI-generated”.

OpenAI AI Classifier Prompt #4

A big FAILURE! It not only failed to detect the AI-written half, it even rated this mixed text as less likely to be AI-generated (“very unlikely”) than the purely human-written content from Experiment #2 (“unlikely”).

How Does the OpenAI AI Classifier Compare to Other AI Detectors on the Market?

There is a perception that AI content is inherently hard to detect, and that this explains the incorrect results from the OpenAI AI Classifier. So let’s put that claim to the test with another available AI detector and see what results we get.

I used the Copyleaks AI Content Detector and got the following results for the same pieces of content from the experiments above:

  • Experiment 1 (fully ChatGPT-written post): 92.8% AI & 7.2% Human
Copyleaks Prompt #1
  • Experiment 2a (1,300-word human-written post): 27% AI & 73% Human
Copyleaks Prompt #2
  • Experiment 2b (500-word human-written post): 99.8% Human
Copyleaks Prompt #3
  • Experiment 3 (50/50 AI and human mix): 44.6% AI & 55.4% Human
Copyleaks Prompt #4

Looking at the Copyleaks results above, one thing is evident: it provides noticeably more accurate results than the OpenAI AI Classifier. That also undercuts the claim that AI content is simply too hard to detect.

Overall, after running these experiments on the OpenAI AI Classifier and comparing it with the Copyleaks detector, it is clear that OpenAI’s detector is not a game-changer. It still needs a lot of improvement before its results can be trusted.

Ratings:

  • Accuracy: 2/5 — It fails to detect AI content even when that content comes from OpenAI’s own ChatGPT.
  • Ease of Use: 4/5 — Easy, click-based online platform.
  • Customization: 1/5 — Not customizable. It only provides a basic web interface to input text, scan, and get results.
  • Scalability: 4/5 — Can handle lengthy text without any specific word count limit.
