Why are so many giants of AI getting GPTs so badly wrong?

Introduction

Fergal Reid

Some big names in AI (e.g. Yann LeCun, Rodney Brooks, Noam Chomsky) are seriously underestimating the capabilities of large language models.

As evidence, I present specific examples I’ve generated using GPT-4, increasing in sophistication.

The examples take time to read, and it's OK to skim them; but I think specific examples are the best way to counter some very sweeping claims.

The underestimation I’m seeing also worries me about the AI safety debate: if so many experts are getting it this wrong today, that isn’t encouraging.

Set-up

These are the comments I’m reacting to:

https://twitter.com/ylecun/status/1659642861166403590

The full quote from Brooks, writing in IEEE Spectrum, that LeCun is agreeing with:

I.e. LeCun and Brooks are saying LLMs don’t have any underlying world model (“it’s looking it up”); there’s no semantic understanding here.

Rich Sutton also seems skeptical that LLMs have anything to do with intelligence:

https://twitter.com/RichardSSutton/status/1654643464959819776

I.e. that it’s ridiculous to think intelligence could arise from text prediction.

The general theme is that these models don’t actually understand anything about the world; that they are just regurgitating data from their training set.

Noam Chomsky et al., writing in the NYTimes, also echo this (“The False Promise of ChatGPT”):

Their deepest flaw is the absence of the most critical capacity of any intelligence: to say not only what is the case, what was the case and what will be the case — that’s description and prediction — but also what is not the case and what could and could not be the case. Those are the ingredients of explanation, the mark of true intelligence.

Here’s an example. Suppose you are holding an apple in your hand. Now you let the apple go. You observe the result and say, “The apple falls.” That is a description. A prediction might have been the statement “The apple will fall if I open my hand.” Both are valuable, and both can be correct. But an explanation is something more: It includes not only descriptions and predictions but also counter-factual conjectures like “Any such object would fall,” plus the additional clause “because of the force of gravity” or “because of the curvature of space-time” or whatever. That is a causal explanation: “The apple would not have fallen but for the force of gravity.” That is thinking.

But GPT-4 clearly seems to have a world model

Let’s start with the Chomsky example about the apple, and build on it. I’m picking on Chomsky et al. as they were the most specific; but I think the same sentiment is widespread.

I aim to show you GPT-4:

  1. Making predictions and dealing with counter-factuals
  2. Demonstrating reasonable predictions about scenarios that it could not have encountered in its training data, thus providing evidence it’s not ‘just looking things up’ or regurgitating correlations. (Or at least that the line between regurgitating correlations and intelligence is unclear!)

The big problem with evaluating ML models is that they can appear very smart if you accidentally ask them something in their training data. We must avoid that mistake.

Let’s set up a complex scenario that is unlikely to have occurred in the training data, and try to focus on GPT-4’s explanatory power in that scenario.

The following is a first-try version of a scenario, made for this blog, not cherry-picked.

Let’s take the classic prank of a bucket of water balanced on a door, and modify it to feature apples. I have Googled, and can find no mention of anything like this online.

OK, so the system deals with a scenario it’s likely never encountered before, making a reasonable prediction about what might happen.

That isn’t easy. You’ve got to understand that glue is sticky and will cause the apples to stick and make things messy.

There’s arguably an error here — I’d have said the apples are probably too heavy to stick on contact. I’m not saying GPT-4 is a superintelligence.

But to even get that far, you’ve got to do a lot of reasoning.

Let’s push it a bit:

GPT-4 recognizes that, over time, the glue becomes less sticky, and works out the implications of that.

For my money, this is exactly the kind of explanation Chomsky says these models can’t produce, and it seems indicative of a world model.

I’m going to tell it that the apples didn’t fall out, and let it explain that:

And there we go.

This bit about superglue curing faster with water appears to be supported on the Internet. I didn’t know that. That’s not evidence it has a world model, but, as an aside, it shows its vast database of facts.

Let’s mix things up further, and really ensure we are outside the training dataset:

I find it hard to believe it has seen training data about buckets of apples on the moon. And I find it hard to see how correctly combining these scenarios can be dismissed as ‘token manipulation’ with no world model.

This seems to comfortably clear the bar set by Chomsky in the original article.

So:

Are we really supposed to accept that GPT-4 is just regurgitating data found in its training set, and that there’s no model of the world here?
It seems that, at a minimum, the burden of proof must shift to those making this claim.

I’m reminded of this video of Richard Feynman explaining how hard ‘general knowledge’ or ‘common sense’ can be. You’ve got to have a lot of subtle knowledge about the world to generate responses like the ones above.

Actually, because we take our common sense — our own world model — for granted, I’d say we’re in danger of underestimating quite how complex a world model GPT-4 must have to do the above.

To restate Brooks and LeCun’s position:

I’m really surprised to hear this said so confidently.

I’ll share another example below, to push things a bit further.

But first I want to share some thoughts.

What’s going on here?

Why are some experts so dismissive of the idea there’s a world model at play, given that GPT-4’s performance appears very hard to explain without reference to one?

One thing I’m seeing repeatedly in the discourse is arguments of the form:

But transformers are just functions?

“Transformers are just functions from input to output”, or “We know how transformers work, and they don’t understand anything.”

Given the complex behavior we are seeing, this argument feels analogous to saying “Computers are just functions of the input to the output” or “We know how NAND gates work, and they don’t understand anything”.

Those facts may be true in isolation, but the argument here is specious.

It’s possible to build systems that are much more powerful than their constituent parts. We know this from computer science.
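As a toy illustration (nothing here is specific to transformers, and the analogy only goes so far), a single trivial primitive like NAND, composed with itself, already gives you arithmetic:

```python
# Toy illustration (nothing transformer-specific): one trivial primitive,
# composed, yields arithmetic.

def nand(a: int, b: int) -> int:
    """The only primitive we allow ourselves."""
    return 0 if (a and b) else 1

# Everything below is built purely out of NAND.
def not_(a):    return nand(a, a)
def and_(a, b): return not_(nand(a, b))
def or_(a, b):  return nand(not_(a), not_(b))
def xor_(a, b): return and_(or_(a, b), nand(a, b))

def full_adder(a, b, carry_in):
    """Adds three bits, using only NAND-derived gates."""
    s = xor_(xor_(a, b), carry_in)
    carry_out = or_(and_(a, b), and_(carry_in, xor_(a, b)))
    return s, carry_out

print(full_adder(1, 1, 1))  # (1, 1), i.e. 1 + 1 + 1 = binary 11
```

Nobody would say a NAND gate “can add”; the adding lives in the composition. Arguments about what a stack of layers can or can’t do have to engage with the composition, not just the parts.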

It’s very easy to stumble upon a new computing system, sometimes quite a simple one, and only later learn it’s Turing complete, and thus discover it can perform a very broad class of computation.

Stacked transformers may not be Turing complete; but that doesn’t mean they aren’t a very powerful computing paradigm.

I think that, in the presence of such sophisticated behavior, e.g. apparent reasoning, the burden should shift onto those claiming, on purely architectural grounds, that a GPT cannot reason.

But the training objective is just token prediction?

“All GPTs set out to do is predict the next token. We just update their weights to optimize for this. Therefore they can’t think.”

This argument feels analogous to an alien robot looking at earth and saying: “Well, that planet is an evolutionary system. So the life on it isn’t designed like we were. All they do is set out to spread their genes, and whichever genes have the best fitness spread more. Therefore they can’t think.”

The point is that just because the training objective is simple, or well understood, doesn’t mean it can’t cause complex or powerful behavior to arise.

In fact, that’s the whole point.
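To make concrete just how literally simple the stated objective is, here’s a minimal sketch of the next-token loss; `model_prob` is a hypothetical stand-in for whatever the network actually computes:

```python
import math

# Minimal sketch of the next-token objective (illustrative, not a real trainer).
# `model_prob` is a hypothetical stand-in for the network: given the tokens so
# far, it returns a probability for each candidate next token.

def next_token_loss(model_prob, tokens):
    """Average negative log-likelihood of each token given its prefix."""
    total = 0.0
    for t in range(1, len(tokens)):
        prefix, actual_next = tokens[:t], tokens[t]
        p = model_prob(prefix)[actual_next]  # probability given to the true next token
        total += -math.log(p)
    return total / (len(tokens) - 1)

# Toy "model": a uniform guess over a four-word vocabulary.
vocab = ["the", "apple", "falls", "slowly"]
uniform = lambda prefix: {w: 1.0 / len(vocab) for w in vocab}

print(next_token_loss(uniform, ["the", "apple", "falls"]))  # log(4) ≈ 1.386
```

Training just nudges the weights to make this number smaller; whatever internal machinery the model builds to do that, world model or otherwise, is not specified by the objective.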

Also, there are lots of reasons to suspect that sequence prediction is a good measure of general intelligence; there are deep links between intelligence and compression (see the Hutter Prize, etc.).
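The prediction/compression link is easy to state concretely: if a model assigns probability p to the symbol that actually occurs, an arithmetic coder can encode that symbol in about -log2(p) bits, so better prediction is better compression. A toy illustration:

```python
import math

# Illustrative link between prediction and compression: a symbol that the model
# gives probability p can be encoded in about -log2(p) bits (e.g. via arithmetic
# coding). Better prediction means fewer bits.

def bits_to_encode(p_true_next: float) -> float:
    return -math.log2(p_true_next)

print(bits_to_encode(1 / 50000))  # clueless, near-uniform model over a 50k vocab: ~15.6 bits
print(bits_to_encode(0.9))        # confident, correct prediction: ~0.15 bits
```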

So it’s not even clear that the training objective is simple or low-powered in any sense; and even if it were, a simple training objective is no proof that the system won’t evolve complex behavior to optimize it.

A second example

This example deliberately gets into absurdity, to clearly get outside the training data, and also to let us ask more probing questions about world models. (Note: I did alter the prompts of this example interactively for clarity; but I wouldn’t describe it as cherry-picked.)

OK, that’s a reasonable response to our intro.

This is surely very challenging territory. Now we’re checking not just that it appears to have a world model, but whether it can reason about someone whose world model is compromised.

Going a little further:

I think GPT-4 has done a great job here at reasoning about a lot of uncertainty, including uncertainty about whether the character in the story is accurately perceiving reality.

Are we supposed to believe there are a lot of examples of motorcycling bears in the training dataset? That the model is just a ‘blurry JPEG’, compressing internet text, per Ted Chiang’s New Yorker article? Maybe that the apparent modelling here is luck?

I find it very hard to explain this behavior without reference to a world model (or without defining ‘world model’ so narrowly that whether humans have one is called into question!).

I can’t understand how some experts can confidently dismiss that possibility, given this empirical performance.

The Sparks of AGI paper, which describes Microsoft Research’s analysis of a pre-release GPT-4, does a great job exploring GPT-4’s capabilities, so what I’m saying here isn’t outside the discussion.

This only adds to my confusion about why so many respected researchers seem to be missing the forest for the bears.

Again, this doesn’t build confidence in our ability to make predictions concerning AI safety.

Appendix: A third example: Planning

A lot of ‘good old-fashioned AI’ was about planning. However, planning systems often failed in the real world, due to an inability to exclude irrelevant facts and the resulting combinatorial explosion.

I thought this was a neat example of agent planning using an LLM, which again seems to be evidence for a world model:
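The embedded example doesn’t reproduce well here, but the general pattern, an LLM proposing and revising plan steps in a loop, looks roughly like the following sketch. The `llm` callable is a hypothetical stand-in for a call to the model, not any particular API, and this is not the specific example embedded above:

```python
# Minimal sketch of the general pattern: an LLM proposing and revising a plan
# in a loop. `llm` is a hypothetical stand-in for a call to the model.

def plan_and_act(llm, goal: str, max_steps: int = 10) -> list[str]:
    steps_taken: list[str] = []
    for _ in range(max_steps):
        prompt = (
            f"Goal: {goal}\n"
            f"Steps taken so far: {steps_taken}\n"
            "What is the single best next step? Reply DONE if the goal is met."
        )
        step = llm(prompt).strip()
        if step == "DONE":
            break
        steps_taken.append(step)
        # A real agent would execute the step here and feed the observed
        # result back into the next prompt.
    return steps_taken

# Toy stand-in for the model, just to show the call pattern:
fake_llm = lambda p: "DONE" if "['open the door carefully']" in p else "open the door carefully"
print(plan_and_act(fake_llm, "remove the bucket of apples from the door"))
```

The interesting part isn’t the loop; it’s that the model’s proposals tend to respect constraints of the scenario it was never explicitly told about, which is the world-model question again.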

Fergal Reid

PhD machine learning. Principal ML Engineer.