The Struggle with AI Content Detectors

Hadi Fadlallah
Published in Tech Blog
4 min read · Jun 3, 2023

“We have to accept the fact that we have to spend much time convincing these detectors that we are humans. Instead of boosting our productivity, these technologies become frustrating.”

Photo by Siddhesh Mangela on Unsplash

These words, taken from one of my email conversations about AI content detectors, perfectly encapsulate the mounting frustrations experienced by many who rely on these AI-driven systems. As the influence of AI content detectors continues to grow, numerous academic publishers and online communities have begun requiring authors to subject their content to testing through these systems.

While these advanced language models have promised increased efficiency and streamlined processes, the prevalence of false positives and the resulting need to prove our humanity have become significant obstacles. Instead of simplifying our work, these technologies have turned into sources of frustration, hampering productivity rather than enhancing it.

These technologies are UNRELIABLE!

According to a recent research paper [1], current AI-text detectors are not reliable in practical scenarios. The paper shows that a wide range of detectors, both watermarking-based and non-watermarking-based, can be broken by simple practical attacks such as paraphrasing. A detector's false positive rate can also be significant, and if it is not low enough, humans could be wrongly accused of plagiarism. The paper further provides an impossibility result indicating that, for a sufficiently advanced language model, even the best-possible detector can only perform marginally better than a random classifier. In practice, this makes applications of AI-text detectors unreliable.
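To get a feel for the impossibility result, the bound from [1] can be evaluated numerically. The sketch below assumes the form given in the paper, AUROC ≤ 1/2 + TV − TV²/2, where TV is the total variation distance between the distributions of AI-generated and human-written text. The function name and the sample TV values are my own illustrative choices, not figures from the paper.

```python
def auroc_upper_bound(tv: float) -> float:
    """Upper bound on any detector's AUROC for a given total variation distance."""
    if not 0.0 <= tv <= 1.0:
        raise ValueError("total variation distance must be in [0, 1]")
    return 0.5 + tv - tv ** 2 / 2


# Illustrative TV values only (assumption); an AUROC of 0.5 is a random classifier.
for tv in (0.0, 0.1, 0.5, 1.0):
    print(f"TV = {tv:.1f} -> AUROC <= {auroc_upper_bound(tv):.3f}")
```

The closer AI-generated text gets to human text (TV approaching 0), the closer the bound gets to 0.5, which is the "marginally better than random" regime the paper describes.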

While introducing their classifier, OpenAI stated that:

Our classifier is not fully reliable. In our evaluations on a “challenge set” of English texts, our classifier correctly identifies 26% of AI-written text (true positives) as “likely AI-written,” while incorrectly labeling human-written text as AI-written 9% of the time (false positives).
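To see what these two rates mean for someone whose text gets flagged, here is a small sketch using Bayes' rule. The 26% and 9% figures come from the OpenAI statement above. The share of AI-written text among all submissions (`prevalence`) is not reported anywhere in this article, so the values tried below are purely assumptions.

```python
def prob_ai_given_flag(tpr: float, fpr: float, prevalence: float) -> float:
    """Probability that a flagged text is really AI-written (Bayes' rule)."""
    flagged_ai = tpr * prevalence              # AI texts correctly flagged
    flagged_human = fpr * (1.0 - prevalence)   # human texts wrongly flagged
    total_flagged = flagged_ai + flagged_human
    if total_flagged == 0:
        raise ValueError("no text would be flagged with these rates")
    return flagged_ai / total_flagged


TPR, FPR = 0.26, 0.09  # rates quoted by OpenAI for its classifier

# Hypothetical shares of AI-written text among all submissions (assumption).
for prevalence in (0.1, 0.5):
    p = prob_ai_given_flag(TPR, FPR, prevalence)
    print(f"prevalence = {prevalence:.0%}: P(AI | flagged) = {p:.1%}")
```

Under the 10% assumption, roughly three out of four flagged texts would actually be human-written, which is exactly the "prove you are human" situation described at the start of this article.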

Benchmarking popular AI detectors

In this article, we will examine the effectiveness of several popular AI detectors by subjecting them to a human-written paragraph and an AI-generated text. Subsequently, we will analyze and discuss the outcomes obtained from these tests.

Input data

I requested ChatGPT to assist me in crafting an introduction for an article centered around data quality. Specifically, I asked for the use of the first-person pronoun “I” to indicate that within the article, I will be sharing some personal experiences. The resulting introduction provided by ChatGPT is as follows:

“… As a data enthusiast, I have witnessed firsthand the transformative power of accurate, reliable data, as well as the devastating consequences that poor data quality can bring. I have navigated through vast data landscapes, encountering both hidden treasures and treacherous pitfalls. Throughout my journey, I have come to appreciate the vital role that data quality plays in shaping the destiny of organizations, be they small startups or multinational corporations…”

On the other hand, I wrote the following paragraph and proofread it using Grammarly:

“Relying on insufficient or irrelevant facts leads to bad judgment, missed opportunities, and negative legal and financial repercussions. This is why businesses spend money on high-quality data. Accurate data helps businesses make wiser decisions and run their businesses more efficiently.

“This article is the first in a series, where we will explain the main concepts of data quality and how we can ensure data quality using Microsoft SQL Server.”

Tested detectors

We tested both paragraphs on eight popular AI content detectors listed below:

Results

The experiment results are illustrated below:

Experiment results

These results show that four detectors categorized my text as “Likely human content,” while the remaining four identified it as AI-generated. Interestingly, none of the detectors classified the AI-generated text even as “Mixed”; they categorized it as human text.
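One way to put a number on this disagreement is to count how often two detectors agree on the same text. The sketch below uses only the four-versus-four split reported for my paragraph. The detector names are placeholders, and the mapping of verdicts to specific detectors is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations

# Verdicts for the human-written paragraph: four "human", four "ai", as reported.
# The detector names are placeholders (assumption), not the tools actually tested.
verdicts = {f"detector_{i}": "human" for i in range(1, 5)}
verdicts.update({f"detector_{i}": "ai" for i in range(5, 9)})

# How many detectors returned each verdict.
counts = Counter(verdicts.values())

# Fraction of detector pairs that gave the same verdict for this text.
pairs = list(combinations(verdicts.values(), 2))
pairwise_agreement = sum(a == b for a, b in pairs) / len(pairs)

print(counts)
print(f"pairwise agreement: {pairwise_agreement:.2f}")
```

With an even split, fewer than half of the detector pairs agree with each other, so picking a detector at random tells you very little about what the next one will say.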

Photo by Mohamed Nohassi on Unsplash

Conclusion

In conclusion, the findings from this small experiment underscore the unreliability of current AI detectors.

The fact that different detectors reached contradictory classifications for the same pieces of text is concerning. Four detectors identified the human-written content as “Likely human” while the remaining four labeled it as AI-generated, and none of the detectors recognized the AI-generated text as “Mixed,” so these detectors clearly lack consistent accuracy. Such inconsistencies highlight the limitations of current AI detection systems and show that they can produce both false positives and false negatives. They also emphasize the pressing need for further advancements in this field: more reliable and trustworthy AI detectors that can effectively differentiate between human-written and AI-generated content.

Fun fact 😎

During the writing process of this article, I employed ChatGPT to enhance its readability. Interestingly, none of the classifiers mentioned above detected any AI-generated content within this piece.

Keep calm and navigate the waves of technological progress wisely!

References

  • [1] Sadasivan, Vinu Sankar, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, and Soheil Feizi. “Can AI-Generated Text be Reliably Detected?” arXiv preprint arXiv:2303.11156 (2023).

I write about data engineering, data management, SQL, and anything related to data. https://thedataengineer.blog