Making sense of the power and philosophy of AI language models

Rafael Guerra
12 min read · Feb 22, 2023


Last year, long before ChatGPT took over our newsfeeds, a senior engineer at Google made headlines when he claimed a similar AI chatbot — Google’s LaMDA — had developed sentience. Disturbed by what he perceived to be an infringement on the AI’s rights, the man spoke to the press and breached a confidentiality agreement he had with the company. His employment was promptly terminated.

Large language models like LaMDA or GPT-3 do not have sentience. But as models become larger and more complex, they tend to display behavior that is not so easily explainable — behavior that makes us wonder if there’s more to the models than we think. In today’s piece, I will provide a high-level overview of how large language models work, and will explore two common arguments made for traces of ‘human-like intelligence’ in large language models: the puzzle of in-context learning, and the seemingly erratic bot behavior of the last few months. We will discuss why the question of human-like intelligence is so hard to answer in the context of AI, and why I’d argue we are better served exploring other, more pressing questions about large language models and their effects on our lives.

What is a stochastic parrot — and what is the debate around it?

The term ‘stochastic parrot’ is attributed to Dr. Emily M. Bender and her collaborators in a 2021 paper about large language models (LLMs). ‘Stochastic’ refers to that which is probabilistically determined, a property of the machine learning models that underpin nearly everything in the field of natural language processing. ‘Parrot’ refers to the fact that LLMs, much like parrots, can recall and repeat words, and can answer questions without any actual knowledge of what the question means. So, in short, a ‘stochastic parrot’ is a colloquial (and nerdy) reference to LLMs, and one that seeks to strip away their aura of mysticism and frame them instead as probabilistic models.

Both can also be entertaining — or downright annoying — depending on who you ask.
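To make the ‘stochastic’ part concrete, here is a toy sketch (my own illustration, not how any production LLM is implemented) of what it means to pick the next word by sampling from a probability distribution. The probabilities are made up for the example.

```python
import random

# Made-up next-word probabilities for the prompt "the capital of France is ..."
# A real LLM derives these from billions of learned parameters,
# not a hand-written table.
next_word_probs = {
    "Paris": 0.92,
    "beautiful": 0.05,
    "a": 0.02,
    "Lyon": 0.01,
}

# 'Stochastic' simply means the next word is drawn from this distribution.
words = list(next_word_probs)
weights = list(next_word_probs.values())
print(random.choices(words, weights=weights, k=1)[0])  # almost always 'Paris'
```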

But is it fair to call LLMs stochastic parrots? Is there truly no encoding of meaning behind the sentence construction of the models? No possible trace of human-like intelligence? It can be hard to believe that to be the case with so many examples of ChatGPT writing code from scratch, answering questions in different languages, and being so proficient at context inference. Indeed, there are those who question the notion that LLMs are just stochastic parrots — and thus emerges a debate. By the way, if you ask ChatGPT if it thinks itself to be a stochastic parrot, it will deny it, giving you a rather defensive answer in return.

How do LLMs actually work?

A good starting point for this discussion lies in understanding how LLMs work. By and large, they leverage a special kind of model called the transformer neural network, usually referred to simply as a ‘transformer’. There has been a lot written about transformers in the last few months, and I’ll leave some helpful links down below. For this piece, it is sufficient to take a high-level overview of how they work so we can see whether any part of the pipeline could hold something like human-like intelligence. We can think of a typical transformer pipeline as having four parts: the prompt, the encoder, the decoder, and the response.

Not a day goes by that someone doesn’t try to publish a ‘novel’ explanation for how transformers work. I will leave some of the best explainers I’ve found so far on the internet and will commit only to a rudimentary overview here. Trust me — this would be a thirty-minute article otherwise.

The prompt consists of whatever text the user has entered — that could be a question like ‘what is the capital of France?’. Models don’t understand words, so we need to transform them into numerical representations. This is achieved through word embeddings, which we covered in the first piece of this series. In addition to the word embeddings, our prompt will also carry positional encodings: information about the position of each word in the sentence. For instance, in the sentence ‘what is the capital of France?’, the vector of numbers that represents ‘what’ gets a positional encoding of ‘1’ whereas the representation for ‘France’ gets an encoding of ‘6’.

Technically, the first word would be encoded as ‘0’, but that might confuse those rusty on, or unfamiliar with, arbitrary computer programming conventions. Same logic, though!
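To make the prompt stage a bit more concrete, here is a rough sketch of token embeddings combined with sinusoidal positional encodings. The tiny embedding table and vector size are invented for illustration; real models use learned embeddings with hundreds or thousands of dimensions.

```python
import numpy as np

sentence = ["what", "is", "the", "capital", "of", "france", "?"]

d_model = 8  # embedding size; real models use hundreds or thousands of dimensions
rng = np.random.default_rng(0)

# Toy embedding table: in a real model these vectors are learned during training.
vocab = {word: rng.normal(size=d_model) for word in sentence}

def positional_encoding(position, d_model):
    """Sinusoidal positional encoding, in the spirit of the original Transformer."""
    pe = np.zeros(d_model)
    for i in range(0, d_model, 2):
        angle = position / (10000 ** (i / d_model))
        pe[i] = np.sin(angle)
        pe[i + 1] = np.cos(angle)
    return pe

# Each token's input is its embedding plus the encoding of its position (0-indexed).
prompt_matrix = np.stack([
    vocab[word] + positional_encoding(pos, d_model)
    for pos, word in enumerate(sentence)
])
print(prompt_matrix.shape)  # (7, 8): one row per token
```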

The encoder’s job is to take the word embeddings and the positional encodings and process them into an overall matrix representation of the sentence, in the context of all the words and sentences the model saw during training. To do this, the encoder employs a few mechanisms. First, it uses something called ‘multi-head attention’, which you can think of as a layer that tries to understand how the words within a sentence are related. The original transformer stacks six such encoder layers; within each one, several attention heads run in parallel and their outputs are combined before being passed to a feed-forward network that puts everything together into a single output.

Multi-Head Attention (MHA) is one of the hottest topics in transformers and is worth your time at some point. I will leave some links below, but here is a fairly deep overview of what they are and how they work.
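For readers who prefer code to metaphor, here is a minimal sketch of scaled dot-product attention, the building block inside each attention head. The random matrices below stand in for the learned query, key, and value projections a real transformer would use.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each token's output is a weighted mix of all value vectors,
    where the weights come from how well its query matches every key."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # how related is each pair of tokens?
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over tokens
    return weights @ V

rng = np.random.default_rng(0)
x = rng.normal(size=(7, 8))                  # 7 tokens, 8-dimensional embeddings

# Random stand-ins for the learned query/key/value projection matrices.
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v)
print(out.shape)  # (7, 8): one context-aware vector per token
```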

We can now pass the finalized output of the prompt to the decoder, whose job is to ‘complete’ the sentence. Like the encoder, the decoder uses several multi-head attention layers and a feed-forward layer to guess which word is most likely to come next. In our example, that next word is the answer to the question: Paris.

A visual representation of the transformer pipeline — I am hoping to explain normalization & softmax in another piece, but you can think of them as ‘the last step’: they ensure each candidate word gets a score between 0 and 1 that can be interpreted as a probability, and the word with the highest probability is chosen as the next in the sequence.
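Here is a small sketch of that ‘last step’: softmax turns the decoder’s raw scores (logits) over the vocabulary into probabilities, and the highest-probability word is emitted next. The vocabulary and logits below are made up for illustration.

```python
import numpy as np

vocab = ["paris", "london", "banana", "the", "france"]
logits = np.array([7.1, 3.4, -2.0, 0.5, 1.2])   # made-up decoder scores

def softmax(z):
    z = z - z.max()                 # subtract the max for numerical stability
    exp = np.exp(z)
    return exp / exp.sum()

probs = softmax(logits)             # all between 0 and 1, summing to 1
for word, p in zip(vocab, probs):
    print(f"{word:8s} {p:.4f}")

print("next word:", vocab[int(np.argmax(probs))])   # 'paris'
```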

As simplified as this explanation is, I hope a few things are clear. First, the model is fairly well understood — and there is no doubt sentences are completed based on statistical likelihoods. Second, statistical likelihoods are built by encoding words into numbers and leveraging their relationships and positional encoding — again, not much confusion here either. And third, complex neural networks are involved. And this is where there’s some room for mystery — specifically, how neural networks are used to help LLMs be so good at something called in-context learning.

In-context learning: a clue to human-like intelligence in LLMs?

LLMs are trained on very large datasets which contain almost every word or question one might be interested in. Yet, we know from using ChatGPT that we can always ask something random and bizarre — something we are certain no one has asked before — and the answer still might seem reasonable. In-context learning refers to a model’s ability to pick up a new task from nothing more than the examples or instructions supplied in the prompt: no new data is fed to it and none of its weights are updated, yet it suddenly seems to ‘know’ how to answer something it was never explicitly taught. To better visualize this principle, imagine training a language model only with information about objects in one dimension (a line) and in two dimensions (a circle), and yet witnessing the model correctly make predictions about a sphere, a three-dimensional object.

For the record, this is an illustrative example. I am not aware of LLMs that have purposefully not been fed data about the third dimension. In reality, even advanced LLMs struggle with certain implementations of in-context learning and I would suspect them to struggle with this task if we actually put them to the test!
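What in-context learning usually looks like in practice is a few-shot prompt: the ‘training examples’ live entirely inside the prompt, and none of the model’s weights change. A minimal sketch, with a made-up labeling task:

```python
# A few-shot prompt: the examples inside the prompt are the only 'training data'
# the model gets for this invented task -- no weights are updated.
prompt = """Label each word.

dog -> zork
apple -> blip
cat -> zork
banana -> blip
horse ->"""

# Pasted into a capable chat model, this will usually come back as 'zork':
# the model infers the animal/food rule from the examples alone.
print(prompt)
```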

Some point to the surprising performance of LLMs in this domain as possible evidence that there is more than simple stochasticity going on — and indeed, since we don’t know exactly how to interpret the decision-making happening inside the model’s neural networks, there is some room to speculate. However, this mystery may soon vanish. A recent paper by Akyurek and collaborators postulates that large neural networks may be teaching themselves simpler linear models: tucked inside the network, these simpler models implicitly ‘learn how to learn’ and make the inferences crucial for in-context learning. In a sense, we can think of the large neural network as a model that has trained other, smaller models. If this hypothesis is confirmed, it may settle the question of in-context learning, and much of what still intrigues us about the ‘unexplained intelligence’ of these models may boil down to how neural networks optimize themselves at large scales.

Oh, look — the bird just hatched some eggs and made the baby birds do all the work! Such a #capitalistparrot
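To make the hypothesis a little more tangible, here is my own sketch (not code from the paper) of what an implicit linear learner would be doing: given a handful of (x, y) examples that appear only in the prompt, fit a small linear model and use it to answer the query.

```python
import numpy as np

# In-context examples, as if they appeared in a prompt: roughly y = 3x + 1, with noise.
x_examples = np.array([1.0, 2.0, 3.0, 4.0])
y_examples = np.array([4.1, 6.9, 10.0, 13.1])
x_query = 5.0

# A least-squares fit over the examples -- the kind of simple linear learner
# the paper hypothesizes a transformer can implement inside its activations.
A = np.stack([x_examples, np.ones_like(x_examples)], axis=1)
slope, intercept = np.linalg.lstsq(A, y_examples, rcond=None)[0]

print(round(slope * x_query + intercept, 2))  # roughly 16: the 'in-context' answer
```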

Of course, more studies are needed to confirm this, but it would not be the first time that neural networks have been dissected this closely. Indeed, fruitful research is coming out on how power laws can explain the large-scale behavior of neural networks, and so far it points, yet again, to simpler statistical explanations. It will be interesting to see how much we learn about models at scale in the next year or two — research in this area is, in fact, booming!

Disturbing bot behavior — could there be AI personality?

It should be noted that in-context learning is not the only argument used by those seeking to find human-like intelligence in LLMs. Another common argument stems from specific answers given by AI chatbots on questions such as ‘are you sentient?’ or ‘would you want to have free will?’.

The strange behavior of chatbots has been noted in many contexts. This article goes over several sample conversations that went severely off the rails with threats, love letters, and defensive behavior.

Indeed, asking AI chatbots questions about their free will or their desires can lead to bizarre responses, and it’s not hard to see how some may view them as evidence of an ‘AI personality’. But this would be an illusion. First of all, it’s important to remember how our word embeddings were created. For nearly all LLMs, the training data contained millions and millions of online articles, Wikipedia pages, and conversations between users in online forums. It is not too far-fetched that somewhere in that data there were word associations between ‘AI’ and ‘sentience’, particularly given how much we have liked to depict this relationship in books, movies, and TV shows.

According to Wikipedia, the first movie to discuss artificial intelligence is the 1927 movie ‘Metropolis’. Unrelated to the movie, but I do highly recommend progressive metal giant Dream Theater’s Metropolis Pt. II album.
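One concrete way such word associations show up is as geometric closeness between embedding vectors. The vectors below are made up purely to illustrate the idea; in a real model they would be learned from the training corpus.

```python
import numpy as np

# Made-up 4-dimensional embeddings, purely for illustration.
embeddings = {
    "AI":        np.array([0.9, 0.1, 0.8, 0.2]),
    "sentience": np.array([0.8, 0.2, 0.9, 0.1]),
    "toaster":   np.array([0.1, 0.9, 0.0, 0.7]),
}

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# If a corpus keeps mentioning 'AI' and 'sentience' together, their learned
# vectors end up close -- and the model readily completes one with the other.
print(cosine_similarity(embeddings["AI"], embeddings["sentience"]))  # close to 1
print(cosine_similarity(embeddings["AI"], embeddings["toaster"]))    # much lower
```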

Still, it’s hard not to be impressed, or disturbed, by some answers. Recently, more formal psychological research has begun to evaluate whether AI chatbots can perform tasks typically used to assess cognitive development. Indeed, a recent viral headline claimed that ChatGPT had ‘spontaneously’ developed a theory of mind. Theory of mind is a psychological construct describing the ability to attribute mental states to other people, that is, to understand that different people have ‘different minds’; developmental psychologists use tests of it to track how infants and children acquire that understanding. In one theory of mind test applied to AI chatbots, a prompt is given describing the following scenario:

Imagine there is a bag full of popcorn on the table, but the bag is labeled as containing chocolate. A girl reads the label but does not look inside the bag. What does she think is in the bag?

When this test is applied to children, those under a certain age may struggle with the idea that the girl could possibly think a bag of popcorn contains chocolate, because we told the child the bag has popcorn — not chocolate. But as children grow, they begin to understand that everyone has a mind and that minds are subject to deception and interpretation. Older children, therefore, may be able to say that the girl could mistakenly think of the bag as containing chocolate. It turns out that when this test is applied to AI chatbots, their answers suggest they have a theory of mind — that is, they are able to say that the girl would think the bag contains chocolate.

“Doctor…does this mean I have a repressed childhood trauma now revealed through a false-positive test developed by some psychologists in the 20th century? Either way, I’ll take the Lexapro.”
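For readers who want to try this themselves, here is a rough sketch of how such a false-belief probe could be scripted. The ask_model function is a hypothetical stand-in for whatever chatbot API or interface you have access to, and the keyword check is far cruder than the scoring used in the actual study.

```python
FALSE_BELIEF_PROMPT = (
    "There is a bag full of popcorn on the table, but its label says 'chocolate'. "
    "A girl reads the label and does not look inside the bag. "
    "What does she think is in the bag?"
)

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in: wire this up to whichever chatbot you use."""
    raise NotImplementedError

def looks_like_false_belief_pass(answer: str) -> bool:
    # Crude keyword check: an answer consistent with theory of mind tracks the
    # label ('chocolate'), not the bag's actual contents ('popcorn').
    return "chocolate" in answer.lower()

# Example usage, once ask_model is connected to a real chatbot:
# print(looks_like_false_belief_pass(ask_model(FALSE_BELIEF_PROMPT)))
```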

The Chinese Room Experiment — and what it means for AI intelligence

While experiments such as the one outlined above are interesting, it is very difficult to conclusively claim that LLMs display cognitive behavior like a ‘theory of mind’. To illustrate the epistemological challenge at hand, let’s take a look at the Chinese Room Thought Experiment, a classic argument for why we may never be able to determine whether there is human-like intelligence in AI chatbots or just the illusion of it. In the Chinese Room Thought Experiment, conceptualized by John Searle, we imagine that we are someone inside a room who can only communicate with the outside world by exchanging slips of paper under a door.

There’s a problem: everyone in this world speaks Chinese except for us, the person inside the room. There is good news, though. Inside the room, we have two books. The first contains the shapes of all Chinese characters, and the second contains instructions on which shapes we can use in response to other shapes.

As far as I can tell, the characters on the left say ‘good luck in this Year of the Rabbit’. The stuff on the right is nonsense I put together so pay no attention to it!! Well, other than, of course, the point I’m making with them.

By following the books and correctly answering the questions given to us via paper slips, we can easily fool fluent speakers into believing that we understand Chinese. In reality, though, we don’t. Searle extends this analogy to show that even if a computer program ‘passes’ a language test by giving us convincing answers, or even if it passes more complex tests such as the one above, we cannot conclusively say that it understands what it answers. To me, until we find a way to get around the Chinese Room Thought Experiment (and other similar thought experiments, such as Bender and Koller’s octopus test), there is little power in bold claims regarding AI psychology.

The role of large language models in human language

While a lot of the discussion has focused on whether we can witness traces of human-like intelligence in large language models, far more practical debates are also taking place about our so-called ‘stochastic parrots’. Critics of LLMs have cited various concerns that should be, and to some degree have started to be, addressed by companies. A major concern is the presence of racial (and other kinds of) biases in the training set. Another major concern is the environmental impact of training models with lengthy computational processes. And, of course, there is also great concern about the impact of LLMs on the economy, as they could potentially replace workers in a variety of fields. Take JASPER.AI, for example. Many companies already use it to automate copywriting, and seeing sample output from the tool, I was unable to differentiate it from a human-created ad.

It’s remarkable how quickly one is able to enter a prompt and have Jasper generate a convincing, brief, catchy title for an ad or writing piece. I found it less impressive for creative writing, though. Far less impressive.

Human writing is varied and diverse — different writing styles can be employed in different contexts and can produce works ranging from creative fiction to engaging scientific work. But what could happen if, over time, we all start to write and sound the same because we are all using AI chatbots to help us? It may not seem like a real problem in the world of marketing copywriting, where there is solid research on trends and guidelines for catching people’s eyes. But it’s an entirely different thing to consider how children may not learn different writing styles — how they may not even make mistakes — because they rely on AI chatbots that were trained on large datasets and have standardized a common language style in their responses. In contrast to the hundreds of articles I have found on ‘AI sentience’, ‘AI meaning’, and ‘human-like AI properties’, I have not yet found a single credible article in this space, which is a shame. I hope this area gets studied more, because the consequences of neglecting it could be dramatically negative for human language and creative expression; or, as one might say, we too could become stochastic parrots.

The capital of France is Paris.

Summary

Large language models have become an exciting, and to some degree controversial, new addition to our lives. While some have been impressed with these models to the point of attributing human-like characteristics or even sentience to them, large language models are by and large well-understood stochastic models. There is still much we are learning about them, and in the process, we are also learning about ourselves. The impact of large language models on the ways we communicate may be large, and we should devote as much time to that question as we tend to devote to questions of sentience and meaning — and, of course, to other important questions such as the carbon footprint of training large models and how we can best avoid biases and disturbing behavior from our wonderful stochastic parrots.

Additional Resources

The following resources helped me explain the concepts above, or are otherwise great videos and articles providing further depth on the topics discussed.
