ChatGPT: Did anyone remotely PREDICT the full capabilities and success of LLMs?

Paul Pallaghy, PhD
11 min read · Apr 24, 2023

I’ve asked this myself many times since 2018, when GPT-1 came out. Thanks to Paul T Lambert for speculating about this.

T5 (2019) was released just after GPT-1 (2018); it is locally installable and remains useful to this day, though vastly inferior to any GPT. CREDIT | Google

Did anyone predict that LLMs (large language models) like GPT would be able to compose stories, explain topics, follow instructions, do arithmetic, write stand-up comedy or slip into requested styles . . . all by ‘simply’ learning to predict the very next word?
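To make that ‘simply predict the next word’ idea concrete, here’s a minimal toy sketch: a bigram model that, like an LLM (at unimaginably larger scale and with a neural network instead of a lookup table), is trained on nothing but which word follows which. The tiny corpus and word choices are my own invented example, not anything from GPT itself.

```python
from collections import defaultdict, Counter

# Invented toy corpus; an LLM trains on trillions of words instead.
corpus = "the cat sat on the mat the cat sat on the rug".split()

# Training = counting which word follows each word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the continuation seen most often in training."""
    return following[word].most_common(1)[0][0]

# Generation = repeatedly feeding the prediction back in as input,
# exactly the loop an LLM runs, one token at a time.
word, generated = "the", ["the"]
for _ in range(4):
    word = predict_next(word)
    generated.append(word)

print(" ".join(generated))  # the cat sat on the
```

The surprise the article is about is that this same objective, scaled up with transformers and vast data, yields instruction-following and reasoning rather than just plausible word salad.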

Only a few public comments

Pre-2019, I certainly did not remotely expect ‘next word’ prediction to result in language understanding or instruction fulfilment. Not in a million years.

Few journalists or researchers are writing about this either.

In March 2023, however, Stephen Ornes quoted Stanford researcher Rishi Bommasani:

“That language models can do these sort of things was never discussed in any literature that I’m aware of,”
Rishi Bommasani
Stanford University

The Unpredictable Abilities Emerging From Large AI Models
Stephen Ornes
Quanta Magazine

This confirms my understanding.

Look at my excitement with GPT-2 back in 2020


Paul Pallaghy, PhD

PhD Physicist / AI engineer / Biophysicist / Futurist into global good, AI, startups, EVs, green tech, space, biomed | Founder Pretzel Technologies Melbourne AU