Two things ChatGPT can get wrong: math and facts. Why?

How is one of the biggest AI tools in the world making mistakes that even an ordinary person can catch?

Sunny K. Tuladhar
4 min read · Feb 3, 2024
DALL·E 3, prompt: "ChatGPT is a cute confused robot" (Image by Author)

ChatGPT 3.5, and Large Language Models (LLMs) in general, are unlike anything we have seen before. So why do they make these seemingly silly mistakes? Here we look at two things ChatGPT gets wrong: math and facts.

Math

ChatGPT can get basic arithmetic wrong when the sequence of operations is long enough. This is unprecedented. A conventional program, given a valid expression, does not get arithmetic wrong, no matter how long that expression is. Our everyday calculator, the most basic form of a computer, never gets math wrong. So what is going on here?

The actual answer is 14, but ChatGPT says 18. Try this yourself with a different long sequence of numbers. (Image by Author)
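For contrast, here is what "never getting math wrong" looks like in a conventional interpreter. The expression below is a made-up stand-in, since the exact sequence from the screenshot is not reproduced here:

```python
# A conventional interpreter evaluates arithmetic exactly, no matter how
# long the chain. This expression is a hypothetical stand-in for the kind
# of long sequence shown in the screenshot above.
expression = "3 + 7 - 2 + 9 - 5 + 4 - 1 + 6 - 8 + 2"

# Python parses the expression and computes it deterministically.
result = eval(expression)  # fine for a literal like this; never eval untrusted input
print(result)  # 15, every single time
```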

ChatGPT is not our everyday AI tool. It was designed to tackle one of the most difficult problems in AI: language. Language has long eluded AI because of its irregular nature. The past tense of walk is walked, but the past tense of go is went? ChatGPT was trained on so much language data that it overcame this linguistic madness and gives us coherent answers in proper English. ChatGPT was designed for language, period. When you bring a language model to math, problems arise.

There is no inherent "math logic" embedded in ChatGPT's underlying model (although plugins can help). The model does not actually add numbers together when we write the symbols above. It has an idea of what math looks like: from all the text it has read, it "speculates" the answer. It does not perform arithmetic operations behind the scenes when we type the symbols '+' or '-'.
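Here is a minimal sketch of that "speculation", using the Hugging Face transformers library with the small open GPT-2 model as a stand-in (ChatGPT's own weights are not public). The model just continues the text with tokens it finds likely; nothing in it adds the numbers:

```python
# pip install transformers torch
from transformers import pipeline

# GPT-2 stands in for ChatGPT here. Like ChatGPT, it only predicts
# plausible next tokens; it has no arithmetic unit.
generator = pipeline("text-generation", model="gpt2")

prompt = "8514 + 2931 ="
out = generator(prompt, max_new_tokens=8, do_sample=False)
print(out[0]["generated_text"])
# The continuation is whatever text the model finds likely, which is
# usually not 11445 -- no addition is ever performed.
```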

So the next time you are dealing with numbers and numerical values in ChatGPT, don't be surprised if it sometimes gets them wrong. Consider other tools for math, or delegate the arithmetic to actual code, as sketched below. Also, read this if you want an in-depth explanation of this ChatGPT math mistake.
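One common workaround is to let the model handle the language and let ordinary code handle the arithmetic. A minimal sketch of that idea, assuming the openai Python package (v1+) and an API key; the model name and the sample question are placeholders of my own, not from the article:

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

question = "What is 8514 + 2931 - 1203?"

# Ask the model only to extract the expression, not to compute it.
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "Reply with only the arithmetic expression in the "
                    "user's question, as valid Python. No prose."},
        {"role": "user", "content": question},
    ],
)
expression = resp.choices[0].message.content.strip()

# The actual arithmetic happens here, in ordinary code. In practice you
# would validate the expression before evaluating it.
print(eval(expression))  # 10242
```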

All this points to a bigger underlying issue with ChatGPT and how it is used today. ChatGPT works by predicting a plausible next word in a sentence; it is not bound by the traditional notion of a computer executing explicit logic. A computer giving wrong output despite being given the right input is a genuinely new situation.

Facts

ChatGPT is notorious for making up facts while answering your questions. This has been termed hallucination, and it has long been a common problem when applying AI to natural language. As OpenAI's CTO Mira Murati put it in an interview with Time magazine, "ChatGPT generates its responses by predicting the logical next word in a sentence, but what's logical to the bot may not always be accurate." To put it crudely, ChatGPT is a very powerful autocomplete (it is much more than that, but the analogy helps).

ChatGPT does not care about facts. Unlike Google, it does not search the internet for the information you asked for and return the answer. It gives back an answer that "seems" sensible, based on its training data and the context it has seen. This is most visible when it cites academic research papers that do not exist.

Tweet by David Smerdon on a fake paper produced by ChatGPT when asked "What is the most cited economics paper of all time?" The paper ChatGPT mentions does not exist.

This is not limited to ChatGPT. Google's Bard made a factual error on its launch day: it claimed that the James Webb Space Telescope took the very first image of a planet outside our solar system, which is incorrect. The mistake reportedly wiped about $100 billion off Google's market value.

So what can be done about it? First, avoid relying on ChatGPT for fact-based information; a conventional search engine is generally more reliable here. Second, you can try a handy prompt I picked up from Vanderbilt University's course on Prompt Engineering for ChatGPT, called the Fact-List Pattern. An example is shown in the image below.

Fact-List Pattern: Whenever you output text, generate a set of facts that are contained in the output. The set of facts should be inserted at the end of the output. The set of facts should be the fundamental facts that could undermine the veracity of the output if any of them are incorrect.

Sample of the fact-list pattern. The pattern does not change the answer; it only makes ChatGPT list the facts it used in the answer, which we can then check for credibility. (Image by Author)
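For completeness, here is a minimal sketch of wiring the fact-list pattern into the API rather than the chat window, again assuming the openai Python package (v1+); the model name and the sample question are placeholders:

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The Fact-List Pattern from the article, sent as a system instruction.
FACT_LIST_PATTERN = (
    "Whenever you output text, generate a set of facts that are contained "
    "in the output. The set of facts should be inserted at the end of the "
    "output. The set of facts should be the fundamental facts that could "
    "undermine the veracity of the output if any of them are incorrect."
)

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": FACT_LIST_PATTERN},
        {"role": "user", "content": "Summarize how vaccines work."},
    ],
)
print(resp.choices[0].message.content)
# The answer now ends with a checkable list of the facts it relies on.
```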

You can also try Perplexity.ai, an AI-powered answer engine (as opposed to a search engine) backed by notable investors including Nvidia and Jeff Bezos. It is designed to reduce hallucination and provides references for all of its answers.

Conclusion

Despite these limitations, ChatGPT remains an extremely powerful tool. Used well, it produces very useful output and saves a lot of our time. A general recommendation: use ChatGPT for closed-world problems, where you know all the facts and can verify the results. Two of my personal use cases are writing and coding; in both, the output can be verified by the user with little effort. We will discuss in a future article why it is best suited to these two use cases. Until then, use ChatGPT responsibly.

