Did GPT3 Write This Story? *

Mike Nicholls · Published in The Startup · Jul 31, 2020

Earlier this month, AI think tank OpenAI released a closed beta version of its language modeling platform, known as GPT3.

GPT3 is the world's largest language model by an order of magnitude. Essentially, it has been trained on the world's websites using CommonCrawl.org, Wikipedia, and other well-known text corpora. Its predecessor GPT2 was one of the world's most advanced language models, with 1.5 billion parameters; GPT3 is roughly two orders of magnitude larger, with 175 billion parameters, and was pre-trained on nearly half a trillion tokens of text on a supercomputer.

The supercomputer developed for OpenAI is a single system with more than 285,000 CPU cores, 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server.

Microsoft Blog

So what does a language model do? Essentially, given a sample sentence or question, a language model will give you the most likely sentence, paragraph, or even full story that follows on from your initial prompt. OpenAI describes the new API like this:

We’re releasing an API for accessing new AI models developed by OpenAI. Unlike most AI systems which are designed for one use-case, the API today provides a general-purpose “text in, text out” interface, allowing users to try it on virtually any English language task. You can now request access in order to integrate the API into your product, develop an entirely new application, or help us explore the strengths and limits of this technology.

Given any text prompt, the API will return a text completion, attempting to match the pattern you gave it. You can “program” it by showing it just a few examples of what you’d like it to do; its success generally varies depending on how complex the task is. The API also allows you to hone performance on specific tasks by training on a dataset (small or large) of examples you provide, or by learning from human feedback provided by users or labelers.
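To make the "text in, text out" idea concrete, here is a minimal sketch of a completion request using the beta-era OpenAI Python bindings. The engine name, prompt, and parameters are illustrative only, and you still need an approved beta API key:

```python
import openai

# Illustrative only: requires an approved key from the closed beta
openai.api_key = "YOUR_API_KEY"

response = openai.Completion.create(
    engine="davinci",    # the largest GPT3 engine available in the beta
    prompt="Once upon a time, in a data centre far away,",
    max_tokens=60,       # how much text to generate
    temperature=0.7,     # higher values give more varied completions
)

# The API returns the most likely continuation of the prompt
print(response["choices"][0]["text"])
```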

Size aside, what is special about this one? Previous models were trained and then fine-tuned for specific tasks. GPT3 has reduced the need for task-specific training data and fine-tuning: it can operate in what is known as few-shot or one-shot mode, needing only a sentence or a handful of examples to create meaningful output.
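To illustrate, a few-shot prompt is nothing more than a handful of worked examples followed by the new case, written out as plain text; the model is expected to continue the pattern. The task and wording below are my own illustrative choice rather than an OpenAI example:

```python
# A few-shot prompt: show the model a couple of examples, then the new case.
# Sent to the same completion endpoint as above, the expected output is
# simply the French translation of the final sentence.
few_shot_prompt = """English: The weather is lovely today.
French: Il fait beau aujourd'hui.

English: Where is the nearest train station?
French: Où est la gare la plus proche ?

English: I would like a cup of coffee, please.
French:"""
```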

By training the model on such a huge dataset, OpenAI has essentially created a machine that can write human-sounding sentences that often make just as much sense as some you read on various news sites, but the applications are much wider.

Because it's trained on the largest body of text in history, it has learned how to do really useful things. Early beta testers have managed to demonstrate the following:

  • Generating working software components and elementary websites in a number of programming languages
  • Building a working search engine that can answer questions
  • Performing basic math on three-digit numbers (see the sketch after this list)
  • Writing investment memos for VC firms
  • Diagnosing symptoms and answering complex medical questions
  • Pattern-matching from sample data in spreadsheets and filling in answers for the next input
  • Grading and correcting student papers
  • Writing fiction, songs, and interviews
  • Answering support questions
  • Translating between languages
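The arithmetic demos follow the same few-shot pattern: give the model a couple of worked examples and let it complete the next one. The prompt below is a hypothetical illustration of that pattern, not a reproduction of any particular beta tester's experiment:

```python
# Few-shot arithmetic: the model completes the final answer itself.
# Reliability reportedly drops off as the numbers get longer.
arithmetic_prompt = """Q: What is 248 plus 573?
A: 821

Q: What is 914 minus 362?
A: 552

Q: What is 687 plus 139?
A:"""
```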

Observations

For some 7 years or so I have been an occasional side-project experimenter with AI/ML; I like to run experiments to learn. I am a business guy who asks talented AI freelancers to build experiments to solve my problems. I'm certainly not capable of coding an AI model (although I have spent a lot of time experimenting with parameters), nor do I do much coding myself. I am technical from a high-level perspective: I can't write production code, but I can tell you what I need you to build to achieve what I want.

Some of my experiments have included a model that can detect Malaria pathogens in blood cell images, models that can spot defects on roads, a news aggregation and summarisation system, a mobile object detection app that can read out what is in front of you (for vision impaired) and an algo trading app that can run equity trading strategies and backtesting.

My initial observations of GPT3 are as seen through the lens of an investor and technical business guy.


  • Individual developers and startups, and even most large corporations, just can't compete with the moat that this sort of raw compute power provides. The days of developers running these models on their own GPU at home, or even on their own instances in the cloud, are gone; the model is too large (can you imagine paying the monthly AWS/Azure bill on the V100 GPU-enabled servers?).
  • The model is huge: it was trained on 45 TB of compressed data from CommonCrawl.org, which is housed in special compressed formats on Amazon Web Services and needs to be processed with specialised data-ingestion tools. It's a non-trivial technical task to do properly.
  • Interestingly, a lot of the code to run CommonCrawl was built by our friend and AI expert Smerity, originally from Sydney University.
  • IMO this raw compute power combined with massive AI models probably implies that OpenAI is on its way to being one of the next major tech giants. They have a number of AI-based platforms that can perform many of the AI tasks that Google, AWS, and Microsoft offer today, without having to involve developers and use those companies' APIs, and they probably have the resources and scale to build a viable AI competitor.
  • It's also possible that a variant of GPT3 could emerge as a search competitor to Google.
  • Because of the sheer scale, these platform tools will be owned and operated by OpenAI and exposed as an API for developers to use as a service (most other AI software tools can be installed and run as standalone services).
  • GPT3 won't work for all use cases, but the street will work out how it wants to use these capabilities: businesses and developers will experiment, discard the failures, and enhance the successful experiments.
  • It doesn't appear to update or learn in real time; it was reportedly trained on data up to Oct 2019, so while it might produce text that mentions COVID, in its current form it is not going to create real-time coverage with fresh facts.
  • As these tools become commonly available, it is going to become increasingly difficult to separate fact from opinion from fiction. GPT3 has the potential to automate the production of news; however, there is a real risk that it creates readable, interesting stories that are fundamentally fake news. (This is not really much different to humans writing fake news, just probably better written and in far greater volumes.)
  • It makes the role of editor and fact-checker far more important, as there is the potential for creating disinformation on a massive scale (given that so much news is copy-pasted from other news sites, news with interesting but false assertions spreads very quickly).
  • Who or what can you trust? I think it also highlights the need for journalists to become celebrities, known for the quality and authenticity of their work, so that their personal brand can act as a means of authentication for the story.
  • GPT3 doesn't have its own opinion or character, but it can probably mimic someone else's.
  • Models like GPT3 probably also drive the need for “Explainable AI” to validate the output from AI models.
  • If GPT3 can assemble the code to run basic neural networks, it can potentially improve itself in the future (especially if paired with a reinforcement learning algorithm).
  • While the first few paragraphs of GPT3-generated text may be very readable, some of the examples I have seen start to wander off topic in longer runs.

Summary

There are many news stories and plenty of hype claiming the release of GPT3 heralds the arrival of Artificial General Intelligence. I don't believe this is true. It's really not general AI: it has learned to ape humans very well for a lot of tasks, but it isn't reasoning, and it performs badly on many tasks that humans do easily (see the comments below from Sam Altman, one of the key OpenAI founders).

Is GPT3 going to replace humans?

Probably some, but not universally. It can probably be tuned to do tasks that don't require reasoning, design, or creativity: repetitive work where the worker adds very little value, and low-wage white-collar jobs like writing repetitive legal documents, collecting, collating, transcribing, researching, formatting, and assessing; tasks that take significant time but mostly follow rules or guidelines.

It is difficult to imagine it taking over truly creative work, or work where the author, their character, and the way they write or perform are part of the enjoyment.

Experiments & Examples

Screenshots from experiments found on Twitter and the web:

  • Generating a basic neural network with PyTorch (this is brilliant)
  • Generating graphs from data using plain-English commands
  • SQL queries from plain-English statements (this is brilliant, I hope that it's right; see the sketch below)
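As a rough illustration of what the SQL demos look like, the tester describes a table and asks a question in plain English, and the model writes the query. The schema, wording, and query below are hypothetical and assembled by me, not taken from the screenshots:

```python
# Hypothetical plain-English-to-SQL prompt in the same few-shot style.
sql_prompt = """Table: orders(order_id, customer_name, amount, created_at)

Question: Show the ten biggest orders placed in 2020, largest first.
SQL: SELECT order_id, customer_name, amount
     FROM orders
     WHERE created_at >= '2020-01-01' AND created_at < '2021-01-01'
     ORDER BY amount DESC
     LIMIT 10;

Question: How many orders did each customer place?
SQL:"""
```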

Interesting Posts on GPT3

  • The original OpenAI paper
  • GPT3 an AI that's eerily good at writing almost anything
  • https://medium.com/@julienlauret
  • Giving GPT3 a Turing test
  • How do you know a human wrote this?
  • 3-minute explainer on why GPT3 is overhyped
  • Airtable list of GPT3 Experiments

You can sign up for more deep tech news at Main Sequence Ventures

*Answer to my headline

No, GPT3 did not write this article. See Betteridge's Law of Headlines.

I managed to get access to AI Dungeon, one of the apps that has been given early access to GPT3. AI Dungeon is set up to generate stories like an early text-adventure game, so it wants to play a game to give context; it really isn't set up for article writing, and some of the commentary that comes back is a bit random.
