Transformers (the big “T” in ChatGPT)

And how to explain them to a 9-year-old

Nuwan I. Senaratna
On Technology
4 min read · Mar 31, 2023

👶 This article is not exclusively meant for 9-year-olds. Anyone who wants to learn more about Artificial Intelligence, but whose knowledge of AI is equivalent to that of a 9-year-old (or thereabouts), might find this useful.

👴 This article makes some simplifications. For a more “adult” introduction to Transformers, refer to the Wikipedia article on GPT.

You might have heard of ChatGPT, a chatbot that is very good at answering questions. The big “T” in GPT stands for “Transformer”, and Transformers are why GPTs are so good at what they do.

A Transformer is a computer program that answers questions. It is especially good at this because it has certain superpowers.

Let’s see what these are…

Paying Attention

When you hear a question like, “What is the capital of India?”, you probably pay more attention to “India”, because it is important. If you change “India” to “Pakistan”, the answer to the question changes (from “New Delhi” to “Islamabad”).

Like you, Transformers are also able to focus on or pay attention to the most important parts of questions.
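For curious readers, here is a small sketch of what “paying attention” might look like in code. It is a toy, not how a real Transformer is built: the importance scores are made up for this example (a real Transformer learns them), and the softmax function simply turns those scores into weights that add up to 1.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that add up to 1."""
    biggest = max(scores)  # subtracting the max keeps the numbers small
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

words = ["What", "is", "the", "capital", "of", "India"]
# Hypothetical importance scores, invented for this example.
scores = [0.5, 0.1, 0.1, 2.0, 0.1, 3.0]

weights = softmax(scores)
most_important = words[weights.index(max(weights))]

for word, weight in zip(words, weights):
    print(f"{word:8s} {weight:.2f}")
print("Most attention goes to:", most_important)
```

The word with the biggest weight is the one the program “listens to” the most. With these made-up scores, that word is “India”.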

Image Credit: DALL.E 2

Reading Multiple Words

When you first learnt to read, you read one word at a time, like “what”, “is”, “the”, “capital”, etc.

Older computer programs were a bit like this. They read one word at a time. And so their reading was quite slow. And sometimes they had to re-read what they had already read, to make sure that they understood correctly.

However, as you get better at reading, you can read multiple words in one go. For example, you might read “What is the” and then “capital of India”.

Transformers are also like this. They can read multiple words at the same time. And so, they are much faster than older computer programs.

Image Credit: DALL.E 2

Many Friends

Suppose I ask you, “What is the capital of the country which has a flag with a white cross on a red background?”

Now you know all the capitals of countries, but you don’t know a lot about flags. But no problem, because you have a friend who is an expert on flags. So, you ask her, and she says, “It’s Switzerland. Switzerland’s flag has a white cross on a red background.” You know the capital of Switzerland, and so you answer “Bern”.

Now, you knew about capitals, and your friend knew about flags. And together you were able to answer my question.

Transformers are also a bit like that. They have many “friend” programs (sometimes called “layers”) that learn to be good at doing different things. Together they can answer quite complex questions.
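The teamwork in the flag example can be sketched in code. This is only an analogy: the two “friends” below are hypothetical helper functions with tiny hand-written tables, whereas the layers of a real Transformer learn what they know from practice and do not store neat lists like these.

```python
# Tiny hand-written tables, invented for this example.
FLAGS = {
    "white cross on a red background": "Switzerland",
    "red circle on a white background": "Japan",
}
CAPITALS = {
    "Switzerland": "Bern",
    "Japan": "Tokyo",
    "India": "New Delhi",
    "Pakistan": "Islamabad",
}

def flag_friend(description):
    """The friend who knows flags: flag description -> country."""
    return FLAGS[description]

def capital_friend(country):
    """The friend who knows capitals: country -> capital."""
    return CAPITALS[country]

def answer(description):
    """Ask the flag friend first, then pass her answer to the capital friend."""
    return capital_friend(flag_friend(description))

print(answer("white cross on a red background"))
```

Neither friend could answer the question alone, but passing one friend's answer to the other gets the job done.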

Image Credit: DALL.E 2

Lots of Practice

With anything, practice makes you better, whether it is a game or sport, or a subject at school, or learning to play a musical instrument.

And Transformers practice a lot. They learn to answer questions by studying billions of questions and their correct answers. Often they practice for weeks, or even months, non-stop.

Image Credit: DALL.E 2

Good at Guessing

Suppose I ask you, “What is the capital of Ja..”, and before I can finish you say “Tokyo”.

That’s because you are good at guessing. I might have wanted to say “Jamaica”, but there is quite a good chance that I wanted to say “Japan”.

Transformers are also very good at guessing. Hence, they can often give good answers even when they are not 100% sure.
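Here is a small sketch of the “Ja..” guess in code. The counts are made up for this example: they pretend that, during practice, questions about Japan came up four times as often as questions about Jamaica. A real Transformer works with learned probabilities over pieces of text rather than a simple table like this.

```python
CAPITALS = {"Japan": "Tokyo", "Jamaica": "Kingston", "India": "New Delhi"}

# Hypothetical counts of how often each country came up during practice.
SEEN = {"Japan": 80, "Jamaica": 20, "India": 50}

def guess(prefix):
    """Return the most likely country starting with prefix, and its chance."""
    candidates = {c: n for c, n in SEEN.items() if c.startswith(prefix)}
    total = sum(candidates.values())
    best = max(candidates, key=candidates.get)
    return best, candidates[best] / total

country, chance = guess("Ja")
print(f"Best guess: {country} ({chance:.0%} sure), so the answer is {CAPITALS[country]}")
```

Notice that the program is only fairly sure, not completely sure. If the question was really about Jamaica, “Tokyo” would be the wrong answer.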

Usually, guessing is not a bad thing, but sometimes Transformers can guess completely wrong answers. So, we should be careful when we are using computer programs or apps that use Transformers. They are not always right.

Image Credit: DALL.E 2

Nuwan I. Senaratna
On Technology

I am a Computer Scientist and Musician by training. A writer with interests in Philosophy, Economics, Technology, Politics, Business, the Arts and Fiction.