A 3 Minute Explainer on Why GPT-3 is Overhyped.

Ayush Sharma
3 min read · Jul 20, 2020


OpenAI has just released beta access to its latest large-scale language model. GPT-3 is trained on massive datasets, drawn from a broad crawl of the Internet, containing roughly 500B tokens. It also stands at an astounding 175 billion parameters, a more than 100x increase over its predecessor GPT-2 (1.5 billion parameters).

GPT-3 is a generative language model: it assigns probabilities to sequences of text and then generates the strings of text that maximize that likelihood given a prompt.
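To make "assigning probabilities and predicting the next word" concrete, here is a deliberately tiny sketch of the same idea at bigram scale: count how often each word follows another in a toy corpus, then repeatedly pick the most likely next word. This is an illustration of next-token prediction only, not GPT-3's actual architecture (GPT-3 uses a 175B-parameter Transformer, not bigram counts), and the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for "the Internet" (illustrative only).
corpus = "the cat sat on the mat and the cat ate the fish".split()

# Count bigram transitions: how often each word follows another.
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def next_word_probs(word):
    """Estimate P(next word | current word) from bigram counts."""
    counts = transitions[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def generate(prompt_word, length=4):
    """Greedy decoding: always append the single most likely next word."""
    out = [prompt_word]
    for _ in range(length):
        probs = next_word_probs(out[-1])
        if not probs:  # dead end: word never appeared mid-corpus
            break
        out.append(max(probs, key=probs.get))
    return " ".join(out)

print(next_word_probs("the"))  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(generate("the"))         # "the cat sat on the"
```

The model here has no idea what a cat or a mat *is*; it only knows which word tends to follow which. Scaling that principle up by many orders of magnitude is, in essence, what the rest of this piece argues GPT-3 does.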

“GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. We find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans.”

The GPT-3 paper, “Language Models are Few-Shot Learners” (Brown et al., 2020)

You’d be surprised at how many people in tech believe there’s a straight-line projection from GPT-3 → GPT-N → General Intelligence. That’s like early cavemen discovering fire and concluding they’ve solved nuclear fusion.

Artificial Intelligence is a field that attracts an unusual amount of hype and euphoria because of its publicly enticing nature. It is, in fact, littered with overhyped products that fell prey to their own promises.

Self-driving cars are always five years into the future, Artificial General Intelligence is just around the corner, and Superintelligence is about to take over the world at any moment.

I think we will all be well served to put GPT-3 in its proper context. So here’s my contrarian take on GPT-3 👇

GPT-3 has little semantic understanding, it is nowhere close to AGI, and it is basically glorified $10M+ auto-complete software.

GPT-3-like models, as impressive as they may be, are nowhere close to Artificial General Intelligence (AGI) because of the following caveats:

  • No semantic understanding
  • No causal reasoning
  • No intuitive physics
  • Poor generalization beyond the training set
  • No “human-agent”-like properties such as a Theory of Mind or agency

Here’s an illustrative example of GPT-3 failing a simple logical question:

GPT-3 fails to understand a simple but uncommon question

Any 5-year-old would understand and answer this question correctly, but GPT-3 doesn’t.

Here’s another excellent writeup demonstrating how GPT-3 fails:

“Remarkably, the easiest way to trip it up is to ask it somewhat nonsensical questions like ‘how many schnoozles fit in a wambgut?’ because statistically, most of the time the AI has seen a question like that on the internet, they are typically answered with a statement that is structured like ‘3 Xs fit in a Y’, so it answers with ‘3 schnoozles fit in a wambgut’, rather than a more appropriate answer which would be ‘those are made-up objects’ or ‘I don’t know’.”

Delian Asparouhov, in his latest Substack issue

To summarize, GPT-3 is impressive in its own right. And yes, it has the potential to transform all things writing. But, as with all generative language models, GPT-3 simply assigns probabilities to strings of tokens and predicts the next likely words given a prompt. It remains a glorified auto-complete, backed by an Internet-scale knowledge repository and some clever NLP.

As for AGI, it remains a distant, lofty, and challenging goal. It will eventually be reached through fundamental breakthroughs, especially in areas such as causal reasoning and inference, but not by throwing ever more data and compute at GPT-like models.

Finally, I’ll close with Sam Altman, the CEO of OpenAI himself, remarking on the current GPT-3 hype. Humility goes a long way in a leader.

If you found this useful, check out my Twitter where I regularly write at the intersection of Tech and Psychology 👉 Ayush Sharma’s Twitter
