Series: Creator Tools for Writers

How Does ChatGPT Pick Its Answers?

A user’s understanding of the Large Language Model (LLM) behind OpenAI’s ChatGPT

Salman J
ILLUMINATION


ChatGPT is a game-changer for many professionals. And, despite some criticism about the quality of its older versions, the AI tool is raging on, reaching 100 million active users in January and launching the new and improved GPT-4 in March.

I’ve had good results with the content writing features of the free versions, with only some inaccuracies. But this article was prompted by one of those inaccuracies.

Intrigued by the error, I looked into the following:

  1. ChatGPT’s Decision-Making Process
  2. Possible Reason(s) for The Incorrect Answer
  3. ChatGPT’s Ability to Correct Itself (without my intervention)

Note: I used the free, less-powerful version of ChatGPT released on March 14, not GPT-4.

Imagining LLM Decisions (Photo by DeepMind)

The Prompt

I prompted ChatGPT to decide on the greatest cricket “off-spinner” of all time. (In cricket, an off-spinner is one of several types of bowlers; bowlers are roughly the cricketing equivalent of baseball pitchers.)

I had a fair idea of the usual suspects.

But the AI threw a curveball by nominating a different type of bowler, the “leg-spinner” Shane Warne, as the best off-spinner of all time.

ChatGPT Q&A (Shane Warne) (Photo by Salman J)

ChatGPT’s Decision-Making Process

To understand why I got this wrong answer, let’s first look at how ChatGPT decides on an answer.

According to Marco Ramponi, AI Scientist and Educator at AssemblyAI, ChatGPT’s underlying algorithm decides its response by calculating the probability that particular words and sentences are the best choices. In other words, it’s not focused on what you expect but on what “seems right” according to that probability analysis.
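
To make that concrete, here’s a tiny sketch of “pick the most probable next word” in Python. It’s my own toy illustration, not OpenAI’s code; the vocabulary and the scores are invented.

    import math

    def softmax(logits):
        # Turn raw scores into probabilities that sum to 1.
        exps = [math.exp(x) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    # Hypothetical scores the model might assign to candidate next words
    # after the prompt "The greatest off-spinner of all time is".
    vocabulary = ["Muralitharan", "Warne", "Lyon", "banana"]
    logits = [4.1, 3.8, 2.9, -5.0]

    for word, p in sorted(zip(vocabulary, softmax(logits)), key=lambda t: -t[1]):
        print(f"{word}: {p:.3f}")

    # The model then samples from (or simply takes the top of) this
    # distribution; "what seems right" is just what scores the highest.

Nothing in this sketch knows any cricket facts; it only knows which word is statistically most likely to come next.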

Teaching the AI Tool

There are 2 techniques for teaching the AI tool, explains Ramponi (a rough sketch of both follows the list):

  1. The model is given a sequence and asked to predict the next word in that sequence.
  2. Some words in the sequence are masked, and the AI is asked to predict those words.
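
Here’s that rough sketch of the two techniques. The sentence, the masked positions, and the targets are all invented for illustration; they aren’t actual training data.

    # 1. Next-word prediction: the model sees a prefix and must predict
    #    the word that follows it.
    sentence = "Shane Warne was a great leg spinner".split()
    next_word_examples = [(sentence[:i], sentence[i]) for i in range(1, len(sentence))]

    # 2. Masked-word prediction: some words are hidden and the model must
    #    fill them back in.
    masked_input = ["Shane", "Warne", "was", "a", "[MASK]", "leg", "[MASK]"]
    masked_targets = {4: "great", 6: "spinner"}

    for prefix, target in next_word_examples[:3]:
        print("given", prefix, "predict ->", target)
    print("fill in", masked_input, "targets:", masked_targets)

Both objectives teach the model statistics about which words go together, not whether a finished sentence is factually true.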

The Problem With This Method

These techniques create an “ALIGNMENT PROBLEM” (which is corrected by human intervention, explained later in the article).

The problem with a model designed to predict the next word is that it may not be able to learn at a higher level, says Ramponi.

For example, it does not know whether it’s making a “big error” or a “small error” when it gets Shane Warne’s bowling style wrong (calling a leg-spinner an off-spinner, say).

Factors Affecting the Quality of the ChatGPT Answer

With that understanding of ChatGPT, let’s see what factors could have produced the inaccurate answer.

General Factors: There are 4 general factors affecting the quality of ChatGPT’s output, says Ramponi:

  1. The model was trained on biased/toxic data and it reproduced that, even if you didn’t ask it to do so
  2. It’s hard to understand how the model made its decision
  3. The model “hallucinates” (makes up facts)
  4. The language model does not follow the instructions you give

The 3-Step “Human Touch” for Tackling The Alignment Problem

To resolve this issue, ChatGPT uses human feedback in 3 steps, explains Ramponi:

Step #1: Humans write the expected response for a list of prompts.

For ChatGPT, some prompts come from users and some from labelers or developers. This model is called the Supervised Fine-Tuning (SFT) model.
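
To picture what Step #1’s data looks like, here is an invented prompt/ideal-response pair of the kind a labeler might write. The real SFT dataset isn’t public, so treat this purely as an illustration.

    # Invented examples of labeler-written (prompt, ideal response) pairs.
    sft_examples = [
        {
            "prompt": "Who is the greatest off-spinner of all time?",
            "ideal_response": "Many experts rank Muttiah Muralitharan as the "
                              "greatest off-spinner, citing his 800 Test wickets.",
        },
        {
            "prompt": "Explain off-spin bowling to a baseball fan.",
            "ideal_response": "An off-spinner is a type of bowler, cricket's "
                              "rough equivalent of a pitcher, who spins the ball "
                              "in toward a right-handed batter.",
        },
    ]

    # The base model is then fine-tuned with ordinary supervised learning to
    # reproduce the labeler-written responses for these prompts.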

Step #2: The SFT model is asked to generate multiple outputs for a large selection of prompts.

Labelers then rank those outputs from best to worst. A new model, the Reward Model (RM), is trained on this ranking data.
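
Here’s a toy sketch of how a labeler’s ranking can be turned into a training signal for the RM. The outputs, the stand-in scoring function, and the numbers are my own simplification; the real RM is a neural network trained on many such rankings.

    import math

    # One prompt, three SFT outputs, ranked best to worst by a labeler.
    ranked_outputs = [
        "Muralitharan is widely regarded as the greatest off-spinner.",
        "Nathan Lyon is one of the best modern off-spinners.",
        "Shane Warne is the greatest off-spinner of all time.",  # factually wrong
    ]

    def reward_model(text):
        # Stand-in for the learned scoring network: better rank, higher score.
        return -0.5 * ranked_outputs.index(text)

    # Pairwise objective: for every (better, worse) pair, the RM should give
    # the better output a higher score; the loss shrinks as it learns to.
    loss = 0.0
    for i, better in enumerate(ranked_outputs):
        for worse in ranked_outputs[i + 1:]:
            margin = reward_model(better) - reward_model(worse)
            loss += -math.log(1 / (1 + math.exp(-margin)))
    print(f"pairwise ranking loss: {loss:.3f}")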

Step #3: The reward model is used to further refine the output of the SFT model.

This step fine-tunes the SFT model with Proximal Policy Optimization (PPO). The Reward Model scores the responses the fine-tuned model generates, and PPO uses those scores to compare the quality and usefulness of each new response against the responses the model gave before. PPO learns from its mistakes and updates the policy directly.
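
And here is a deliberately over-simplified sketch of the PPO idea in Step #3: push the model toward responses the RM rewards, but clip each update so the model never drifts too far from its previous behaviour in a single step. The function and the numbers are hypothetical, not OpenAI’s implementation.

    def ppo_objective(old_prob, new_prob, reward, clip=0.2):
        # How much more (or less) likely the updated model is to give this response.
        ratio = new_prob / old_prob
        # Clip the change so one update cannot move the model too far at once.
        clipped = max(min(ratio, 1 + clip), 1 - clip)
        # PPO optimises the smaller (more conservative) of the two terms.
        return min(ratio * reward, clipped * reward)

    # Toy numbers: the model has become more likely to give a well-rewarded answer.
    print(ppo_objective(old_prob=0.20, new_prob=0.35, reward=1.5))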

Possible Reason(s) for The Inaccurate Answer to My Question

Human Factors

Based on how ChatGPT works, I assume human training played a role in the answer I got.

But I don’t know the details of the SFT, RM, and PPO inputs/outputs, how much data related to the type of questions I asked was used in the training, or why the AI tool picked a less likely answer.

So, unfortunately, I have to leave these human-touch factors out of my explanation for the inaccurate answer.

The 4 General Factors

This leaves us with the 4 general factors. Let’s look at these one by one:

Factor #1: Toxic/Biased Data
I thought toxic or biased data may be one of the main factors. Here’s why:

Before I asked ChatGPT the Shane Warne question, I asked it another question in the same chat.

I asked ChatGPT to “act as leg-spinner Muttiah Muralitharan bowling an over to Brian Lara.” (Muralitharan is not a leg-spinner; he’s one of the top off-spinners of all time.)

ChatGPT knew Muralitharan as the Sri Lankan spin-bowling great, but it didn’t point out that he’s an off-spinner; it simply “bowled” him as a leg-spinner, fulfilling my request.

ChatGPT Cricket Q&A (Muralitharan) (Photo by Salman J)

I thought this question might have led ChatGPT to equate off-spin with leg-spin (at least in my chat) and affected its response to the next (Shane Warne) question I asked.

But, to be more sure about this inference, I decided to Google Shane Warne.

I found on his Wikipedia page that he actually did bowl a “mix of off-spin and leg-spin,” as the AI tool mentions. But this was well before he represented Australia as a leg-spinner.

So, ChatGPT may have classified Warne as an off-and-leg spinner based on this info.

PROBABILITY: I’d give toxic data a low score of 2/10 for corrupting ChatGPT’s response because, although the language model is supposed to learn from my inputs, the info about his off-spin bowling was already available on Wikipedia.

Factor #2: Difficult to Understand ChatGPT’s Reasoning
In the incorrect answer to the Shane Warne question, ChatGPT not only mentions him as an off-spinner but also says that he has taken “129 wickets” as an off-spinner, “the most by any bowler in the world.”

This very specific fact led me to Google the info.

ChatGPT Cricket Q&A (129 Wickets) (Photo by Salman J)

Turns out 129 is an important number in connection to Warne:

  1. Warne took 129 wickets in the 22 matches he played in England
  2. Warne was instrumental in bowling England out for 129 in a memorable game between England and Australia
  3. Warne got Andrew Strauss out on 129 runs in a spell where he took 6 wickets for 122 runs

BUT none of these stats explain why ChatGPT said Warne is an off-spinner with 129 wickets to his credit.

So, what could be the reason for this inaccuracy?

The second explanation is that ChatGPT either incorrectly interpreted facts given online or interpreted them in a way humans can’t explain.

How? Let’s see:

I found some web pages that mention all the key info ChatGPT provided: Shane Warne, 129 wickets, the most wickets in the world, and an off-spinner.

These webpages mention Nathan Lyon (off-spinner) breaking Shane Warne’s record of most wickets in Asia by a non-Asian bowler. The articles mention that Lyon took 129 wickets and Warne 127.

So, did ChatGPT simply misinterpret the information given in these articles, thinking that it was Shane Warne who took the 129 wickets as an off-spinner?

It seems likely.

But, based on my other experiences of using the AI tool, I wouldn’t expect it to make such a simple mistake.

So, assuming there isn’t a website explicitly making the nonsensical claim that off-spinner Shane Warne leads the world with 129 wickets, the answer may simply be that we don’t understand how ChatGPT came up with this response.

PROBABILITY: A 7/10 because the method for calculating the answer is unclear.

Factor #3: Hallucination
When it comes to large language models (LLMs), a hallucination means mistakes that are “semantically or syntactically plausible but are in fact incorrect or nonsensical,” reports Craig Smith of the Eye on A.I. podcast. In other words, the AI tool makes up things that may pass as the truth.

In our example, calling Warne an off-spinner is not a plausible claim at all. But we know that, without the steps of SFT, RM, and PPO, the language model can have trouble differentiating between small and large errors.

So, it may be possible that ChatGPT just made this up. But, based on the other accurate details it gave in the Shane Warne answer, the specific record of 129 wickets as an off-spinner, and other responses to my cricketing questions (mentioned below), I find it hard to believe that hallucination was at play here.

Of course, as mentioned, another important factor here would be the user data and inputs by the labelers and developers to counter errors. Since we don’t know if the input contained these types of cricketing questions, it’s difficult to say if the 129 wickets were made-up facts or misunderstandings.

PROBABILITY: 3/10 because the information is too specific and would be hard to pass off as truth.

Factor #4: ChatGPT Didn’t Follow Instructions
This one seems the least likely of the 4 reasons because ChatGPT:

  1. Gave an answer that was on the topic of the best off-spinner
  2. Identified one bowler as the best spinner
  3. Gave correct (and incorrect) data to back its selection

PROBABILITY: 2/10 because the answer contained info related to the topic of the question.

ChatGPT’s Ability to Correct Itself

ChatGPT can make corrections mid-answer!

This one really caught me by surprise. I asked about the greatest fast bowler of all time, and this time I didn’t even use any toxic data beforehand.

Again, the answer started with a BIG mistake. It selected Viv Richards, one of the best batsmen, as the best bowler of all time, but then realized its mistake and corrected it in the next sentence!

ChatGPT Cricket Q&A (Fast Bowler Correction) (Photo by Salman J)

How did the AI tool make this correction?

We know that:

  1. The AI model is based on calculating the probability of what should be written
  2. In the optimization step, the model calculates the quality of new responses by comparing them to old responses

I would imagine that, while generating the response, ChatGPT realized the inaccuracy in what it had just written and corrected it. This was very good, but it also solidified my thinking that factor #2 (difficulty understanding ChatGPT’s reasoning) was responsible for the inaccuracy in the Shane Warne answer.

Intrigued by the language model’s ability to make corrections so quickly, I asked the same spin-bowler questions in a different way.

I asked the trick question: “Is Shane Warne the greatest off-spinner of all time?”

This time, ChatGPT made a correction. It said that Shane Warne is considered the greatest leg-spinner and not a great off-spinner.

ChatGPT Cricket Q&A (Shane Warne Correction) (Photo by Salman J)

Before this question, I had also asked for a list of the best off-spinners. This time, it not only didn’t place Shane Warne at the top of the pile but excluded him from the list of the 7 best off-spinners altogether.

(In case you’re wondering, here’s ChatGPT’s top 7: Muralitharan, Saqlain, Laker, Harbhajan, Swann, Ashwin, and Lyon).

At no point in this chat did I ask ChatGPT to correct itself. The language model made all the corrections on its own.

So, what happened here?

Did the AI tool find out it was wrong? Or, does it consider both the new and old answers to be equally correct? Was it the way I phrased the questions?

What appears to me from this little experiment is that ChatGPT improved the more it chatted about the topic.

(This could be because it revisited its previous responses in the PPO optimization step and/or used a new source when asked a new question).

I would love to hear your take on ChatGPT’s powers and limitations. Share your experiences with me in the comments.
