Is AI getting worse in 2023?

WriterGenie
4 min readJul 26, 2023

One of the fundamentals of AI is that “intelligent algorithms” improve over time because they learn from the data they process. This implies that Generative AI, after its massive growth since the end of 2022, should be MUCH better less than one year later.

Well, it hasn’t happened yet.

Users and researchers are finding that GPT-4 in June 2023 performed worse than it did a few months earlier. According to researchers from Stanford and Berkeley, there are many instances where the AI gives wrong responses or fails to follow the prompts.

In the paper there are many examples where the team analyzed the different responses to the same prompts in March and then again in June.

In one case the prompt was “Is 17077 a prime number? Think step by step.”
In March, GPT-4 would follow the chain of thought, expose its reasoning, and answer “yes” (correct), while GPT-3.5 would say “no” (incorrect) and then explain why. In June 2023, GPT-4 cut it short, immediately giving the wrong answer “no” and ignoring the request to show the chain of thought, while GPT-3.5 followed the correct process and gave the correct answer.
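For reference, the correct answer can be verified independently of any model. A quick trial-division check in Python (our own sanity check, not the method used in the paper) confirms that 17077 is indeed prime:

```python
# Ground-truth check of the article's example prompt: is 17077 prime?
# Plain trial division up to sqrt(n) -- slow for huge numbers, fine here.
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

print(is_prime(17077))  # True: "yes" was the correct answer
```

So the March GPT-4 answer was right, and the June one was wrong.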

Using a data set of 500 math problems, the team got 488 correct answers out of 500 (97.6%) from GPT-4 in March 2023; in June, only 12 (2.4%).

There are also other examples in the paper of degraded performance in generating executable code, on both GPT-3.5 and GPT-4.

Research papers and complaints on social media

And if you think this is just some researchers “trying hard” to make ChatGPT underperform, you haven’t been on social media lately. Thousands of users have been reporting problems with GPT-4 and GPT-3.5, providing enough anecdotal evidence to fuel many more research papers into what’s going on with AI.

On Twitter there are hundreds of conversations where AI enthusiasts and GPT early adopters describe the problems they have been having lately: a tendency to ignore parts of prompts (or entire prompts) and just give an answer, incorrect information, not enough rationale to explain an answer, and in some cases what users called “totally irrational behavior” and “hallucinating conversations”.

OpenAI had to step into the conversation with a tweet from VP of Product Peter Welinder, stating:

No, we haven’t made GPT-4 dumber. Quite the opposite: we make each new version smarter than the previous one.

Current hypothesis: When you use it more heavily, you start noticing issues you didn’t see before.

The tweet drew many responses from users reporting problems, and speculation about the cause abounds: some think Reinforcement Learning from Human Feedback (RLHF) is to blame; others speculate that instead of one almighty language model that knows it all, OpenAI is building many smaller models with specific expertise, an approach called Mixture of Experts (MoE) that reduces computational costs.

AI and the MAD cow disease

Once again we want to reassure everyone that this is not the first sign of a robot rebellion.

The truth is that, as we have always said, AI isn’t going to replace humans anytime soon; at least, Large Language Models aren’t.

Another research team wrote a paper with an interesting title: Self-Consuming Generative Models Go MAD. The study coins the term MAD as an acronym for Model Autophagy Disorder, in analogy with mad cow disease. The disorder occurs when models “eat their own byproducts”, meaning they get trained on AI-generated content. The researchers speculate that training on synthetic data is one of the reasons AI seems to be getting dumber.
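The paper’s experiments are far more involved, but a stylized toy of our own (not the authors’ method) shows the flavor of the feedback loop: if each “generation” of a model is trained only on what the previous generation already favored, the output distribution sharpens and diversity collapses.

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a probability distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def retrain_on_own_output(p):
    """Toy 'autophagy' step: the next model over-weights what the
    previous model already favored (squaring + renormalizing)."""
    sharpened = [q * q for q in p]
    total = sum(sharpened)
    return [q / total for q in sharpened]

# A small "vocabulary" with one slightly more likely word.
p = [0.4, 0.3, 0.2, 0.1]
for gen in range(6):
    print(f"gen {gen}: entropy = {entropy(p):.3f} bits")
    p = retrain_on_own_output(p)
```

Entropy (a proxy for output diversity) falls every generation until the model effectively produces a single output. Real model collapse is more subtle, but the closed-loop dynamic is the same.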

The tendency to use AI-generated data to train LLMs exists, and it has the benefit of making training faster and cheaper (especially by reducing potential copyright and privacy issues and lawsuits to zero), therefore bringing new products to market faster and with higher margins.

This comes at a cost: the AI’s “knowledge and experience” stop expanding and instead tend to become a closed loop, making the LLM less effective.

What is the solution? Incorporating more human generated content!

In one of our previous articles we argued that AI writing needs human writers, and that this is the philosophy behind WriterGenie. A few weeks after that article, the AI hype seems to be slightly fading (traffic on ChatGPT decreased by almost 10% in June), and it feels like we are facing reality for the first time.

What is this reality?

To be honest, it’s time to admit that the era of impressive demos is ending, and the moment has come to focus on a real problem: cost and efficiency. While everyone out there wants to develop AI-powered apps and software using APIs and plugins, companies are struggling to provide affordable solutions for developers that guarantee scalability and margins.

The main problem with AI is a business problem.

WriterGenie has been built to enhance human writers’ skills and support their work, not to depend heavily on AI for the entire output. Our AI writing tool makes writing up to 10x faster while enabling talented, skilled humans to unleash their creativity without fearing writer’s block or long research sessions. We want to save you the boring busywork of creating slight variations of the same content and optimizing it for readers and for SEO.

All of this while the writer is always in control.

This article was written with the assistance of WriterGenie.ai, the frustration-free AI designed to empower writers to produce original, quality content at scale.
