Isaac Asimov, Multivac and ChatGPT

2022 will certainly be remembered as a turning point in the world of AI.

Published in Datalytics · Jan 3, 2023

Read this story in Spanish.

In 2022, OpenAI launched Dall-E 2, and we all paused to take stock and try to understand how it worked before eventually just playing with it.

For those who may be out of the loop, Dall-E is a program that creates images from textual descriptions. It’s as “simple” as that. You write a description, and it creates an image.

How was it trained? Oversimplifying a bit, it was first fed images paired with descriptions and then learned to associate the two. So, if you ask for a panda skating on ice, it takes what it “knows” about pandas and what it “knows” about ice skating and creates a completely new image by combining them.
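
OpenAI also exposes this capability through an API. As a minimal, hypothetical sketch of generating an image from a text prompt, assuming the openai Python client in its v0.x form (the key and the prompt are placeholders):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

# Ask Dall-E for a brand-new image from a plain-text description.
response = openai.Image.create(
    prompt="a panda ice skating, digital art",  # the textual description
    n=1,                                        # number of images to generate
    size="1024x1024",                           # output resolution
)

# The API responds with a URL pointing to the generated image.
print(response["data"][0]["url"])
```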

As a result, it can also easily complete or extend existing images. For example, let’s take Van Gogh’s “Starry Night” and extend it:

It also lets us keep playing. For example, this is a photo from my vacation: with just a description and a click, it extended the image and added what I asked for (a menacing T-Rex):

Beyond the fun of all this, what I value most is that this is the first time an advance in AI has been showcased in a way that is easy and uncomplicated for anyone to experience.

In other words, useful or not, the way OpenAI rolled out Dall-E did more to put AI on everyone’s radar than hundreds of written articles (this one being yet another contribution to the cause 😊).

It also, almost imperceptibly, raises questions that will become increasingly important: who is the author of these images? Who is the artist? Can a model win an award?

We quickly become accustomed to the good things

Just when the topic was starting to feel exhausted and fade from view, on November 30, 2022, OpenAI launched ChatGPT, and once again redefined the standard of what is possible, making it visible and easy for everyone to try.

ChatGPT is based on GPT: Generative Pre-trained Transformer. Or in plain English: it is a family of AI models (four in particular) that understand and generate text, the most powerful and capable of which is Davinci. That is, it interprets what we tell it and responds accordingly.
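
As a minimal sketch of what talking to Davinci looks like programmatically, assuming the v0.x openai Python client and the text-davinci-003 completion model (key and prompt are placeholders):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

# Send a plain-text prompt to the Davinci completion model and read back its answer.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Explain in one sentence what a generative pre-trained transformer is.",
    max_tokens=100,
    temperature=0.7,
)

print(response["choices"][0]["text"].strip())
```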

At Datalytics, we decided to try it by translating the article published by the great Heidy Villa from Spanish to English (How to Fall in Love with Scrum-Powered Analytics Projects Again; if you haven’t read it, it’s highly recommended).

Translating words is easy: that’s a dictionary. Translating sentences is a little more complex. The task gets harder still when the languages have different roots: Latin-rooted Spanish versus Germanic-rooted English, which means bigger differences in how sentences are built.

If we add to this combo that we want to translate an article of more than 1,500 words, the task becomes genuinely hard. Try doing it with Google Translate and you’ll see the result: a good effort, but an unusable output.

The complexity lies in the fact that, to do it well, you have to understand the context and rebuild the sentences so that they make sense. It is precisely a task of semantic complexity (what the article says, its meaning) and not just syntax (how it says it, the rules used to form sentences).
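
We ran the translation through the ChatGPT interface itself; purely as a hypothetical sketch, the same task could be scripted roughly like this (again assuming the v0.x openai client and text-davinci-003, and splitting the article into paragraphs to stay within the prompt size limit):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

def translate_paragraphs(paragraphs):
    """Translate Spanish paragraphs into English, asking for natural rewrites rather than literal ones."""
    translated = []
    for paragraph in paragraphs:
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=(
                "Translate the following paragraph from Spanish to English, "
                "rewriting the sentences so they read naturally:\n\n" + paragraph
            ),
            max_tokens=800,
            temperature=0.3,
        )
        translated.append(response["choices"][0]["text"].strip())
    return translated

# Hypothetical usage with a single short paragraph.
print(translate_paragraphs(["Transformamos los datos en información."])[0])
```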

The result we achieved with the translation was… incredible. In a matter of seconds, we had the article translated. Without a single error. With completely rewritten sentences.

Here is the original article, and here is the translated one: take a few minutes to read it (in whichever language you prefer 😉), because it’s worth it.

But what really threw us off, and showed the real potential, was that when validating the results we found an extra paragraph. We assumed it was a mistake: that something had slipped past us while copying and pasting, or that the structure of the article had been altered by a stray line break.

But no.

While translating, ChatGPT had added a paragraph. One that made sense for the article, for its narrative structure, and for the context in which it was inserted. A paragraph of its own authorship, whatever that means (although something tells me that in the near future we will get used to that).

Impressive.

Other tests: GPT programming

Since we work with data, we thought it would be interesting to test how it interpreted and solved data problems using code. For this, we asked it to solve a basic SQL problem, one that is even part of the exams we give at Datalytics:

It not only solved it, but also added a detailed explanation of what it did.
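
The original exercise circulated as a screenshot, so it isn’t reproduced here. As a purely hypothetical stand-in, a problem of that kind might be “return the highest-paid employee per department”; the SQL below (wrapped in Python with sqlite3 so the snippet is self-contained, with invented table names and data) shows what such a solution could look like:

```python
import sqlite3

# In-memory database with invented data standing in for the exam exercise.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'Sales', 1200), ('Bruno', 'Sales', 1500),
        ('Carla', 'Data', 1400), ('Diego', 'Data', 1300);
""")

# Highest-paid employee per department: a correlated subquery keeps only the rows
# whose salary matches the maximum within their own department.
query = """
    SELECT department, name, salary
    FROM employees AS e
    WHERE salary = (
        SELECT MAX(salary) FROM employees WHERE department = e.department
    );
"""
for row in conn.execute(query):
    print(row)
```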

While we were at it, we asked it to do the same thing, but using Pandas:
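
Again purely as an illustration with invented data, the same hypothetical problem in Pandas could be solved along these lines:

```python
import pandas as pd

# Invented data standing in for the exam exercise.
employees = pd.DataFrame({
    "name": ["Ana", "Bruno", "Carla", "Diego"],
    "department": ["Sales", "Sales", "Data", "Data"],
    "salary": [1200, 1500, 1400, 1300],
})

# For each department, keep the row whose salary is the group maximum.
top_paid = employees.loc[employees.groupby("department")["salary"].idxmax()]
print(top_paid)
```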

Finally, someone who comments their code, right? Anyway. And, to close, using PySpark:
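
And the PySpark version of the same hypothetical problem, using a window function to rank salaries inside each department:

```python
from pyspark.sql import SparkSession, Window, functions as F

spark = SparkSession.builder.appName("top-paid-sketch").getOrCreate()

# Invented data standing in for the exam exercise.
employees = spark.createDataFrame(
    [("Ana", "Sales", 1200), ("Bruno", "Sales", 1500),
     ("Carla", "Data", 1400), ("Diego", "Data", 1300)],
    ["name", "department", "salary"],
)

# Rank employees within each department by salary and keep only the top one.
w = Window.partitionBy("department").orderBy(F.col("salary").desc())
top_paid = (
    employees.withColumn("rank", F.row_number().over(w))
             .filter(F.col("rank") == 1)
             .drop("rank")
)
top_paid.show()
```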

Just a week after the launch of ChatGPT, all sorts of curious tests and use cases started to appear:

  • For example, in this article, they ask GPT to simulate being a computer: they run Python code, spin up a Docker container, and browse the internet. They even, in a beautiful act of recursion, connect from the machine simulated inside ChatGPT to… ChatGPT.
  • In this Twitter thread they show the result of giving it an IQ test.
  • In this other one, they asked it to write Python code for tic-tac-toe (a rough sketch of that kind of script appears right after this list).
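
The thread shows ChatGPT’s own output; as a rough idea of the kind of script such a prompt produces, here is a minimal, human-written sketch of terminal tic-tac-toe (not the code from the thread):

```python
import itertools

# All index triplets that form a winning line on a 3x3 board.
WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
        (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
        (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in WINS:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def show(board):
    print("\n".join(" | ".join(board[i:i + 3]) for i in (0, 3, 6)))

board = [" "] * 9
for player in itertools.cycle("XO"):
    show(board)
    while True:
        move = int(input(f"Player {player}, pick a cell (0-8): "))
        if 0 <= move <= 8 and board[move] == " ":
            break
        print("Invalid or taken cell, try again.")
    board[move] = player
    if winner(board):
        show(board)
        print(f"Player {player} wins!")
        break
    if " " not in board:
        show(board)
        print("It's a draw.")
        break
```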

It’s no coincidence that it reached one million users in just five days, right?

What is all this based on?

This is all based on GPT (Generative Pre-trained Transformer), a massive, general-purpose model for predicting and generating text.

The key to GPT (and to all these general-purpose models) is its size: version 3 alone has “only” 175 billion parameters, more than 100 times its predecessor, GPT-2. The size of the current version (3.5) has not been published, but it is assumed to be much larger. In a few months we will have version 4, which will be even more massive and surely more powerful.

GPT “learned” to relate words, sentences, and paragraphs using public data from the web: Wikipedia entries, social media posts, and articles of all kinds. In particular, the datasets used come mainly from Common Crawl (petabytes of data collected since 2008). If anyone feels like playing, these datasets are public and can be accessed from the link above (but watch out for the end-of-month bill: GPT-3 was trained for 9 days on Azure infrastructure at a cost of a mere USD 4.6 million).

On top of that, reinforcement learning from human feedback (RLHF) was used to train it: like a game of rewards and punishments, the model is improved based on feedback from humans. Here’s a simple explanation of how it works.
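
To make the rewards-and-punishments idea concrete, here is a toy, hypothetical sketch; it is not OpenAI’s actual pipeline (which trains a separate reward model and then fine-tunes the language model with PPO), just the core loop of reinforcing whichever output a human prefers:

```python
import math
import random

# Three canned candidate responses and a "policy": one preference score (logit) per response.
responses = ["short vague answer", "clear step-by-step answer", "off-topic answer"]
logits = [0.0, 0.0, 0.0]

def sample(logits):
    """Pick a response index with probability proportional to softmax(logits)."""
    weights = [math.exp(l) for l in logits]
    return random.choices(range(len(logits)), weights=weights)[0]

def human_prefers(i, j):
    """Simulated annotator: always prefers the clear answer (index 1), otherwise picks at random."""
    if i == 1 or j == 1:
        return 1
    return random.choice([i, j])

LEARNING_RATE = 0.5
for _ in range(200):
    a, b = sample(logits), sample(logits)
    if a == b:
        continue
    chosen = human_prefers(a, b)
    rejected = b if chosen == a else a
    logits[chosen] += LEARNING_RATE    # reward the preferred response
    logits[rejected] -= LEARNING_RATE  # penalize the rejected one

# After many comparisons, the policy strongly favors what the "human" preferred.
for text, score in zip(responses, logits):
    print(f"{score:7.2f}  {text}")
```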

Isaac Asimov’s Multivac and ChatGPT

Coincidentally, 2022 marked 30 years since the death of the great science fiction master, Isaac Asimov. In “The Last Question,” Asimov tells the story of Multivac, a futuristic machine that concentrates the knowledge of the world and answers any kind of question.

In the story, Asimov describes how in the near future, Multivac first solved simple questions, then moved on to more complex ones, then designed spaceships and even solved problems such as generating clean energy.

If we consider that today “knowledge” is concentrated on the internet and that ChatGPT has incorporated it as part of its training and interacts with us by solving queries, the analogy is quite direct.

Barely a week had passed since the launch and new use cases kept appearing. Not to mention those that will show up over the coming months, or the future versions (GPT-4), even more powerful, that will be available in a matter of months.

The best way to close this article is by copying the last three sentences of Asimov’s story:

“And Multivac said:

LET THERE BE LIGHT!

And there was light…”

Written by Guillermo Watson, CDO @ Datalytics, and translated by ChatGPT.
