From ancient oracles to Generative AI

A story about humanity’s desire to predict and influence the future

Anna Via
Artificial Corner
8 min read · Dec 3, 2023


Photo by Drew Beamer on Unsplash

Since the launch of ChatGPT at the end of last year, there hasn’t been a single week without news about new AI models, products, or features. But how did we get here? And why now? In this blog post, we’ll cover a story that started centuries ago and has brought us to generative AI this year. Get ready to travel across the ages and discover the most important milestones in humanity’s pursuit of predicting and influencing the future.

Humans have been trying to predict things for ages

The story starts as far back as the 8th century BC in Delphi, a famous sanctuary dedicated to Apollo in ancient Greece. Seekers of wisdom and guidance would travel for days, or even weeks, to meet Pythia, the priestess of Delphi. Pythia, also known as the Oracle of Delphi, would deliver cryptic prophecies to those seeking knowledge, influencing the course of lives, families, and cities. Those who sought counsel made huge sacrifices, including the time and money needed to travel from all over Greece and expensive tributes that still impress any tourist visiting Delphi today. In exchange for all those sacrifices, travelers would get “predictions” or advice in the form of enigmatic prophecies. Though there is still controversy over whether Pythia’s powers were fueled by hallucinogenic gases, Delphi stands as an early expression of a fundamental human desire: to foresee and control the future.

Delphi, picture by the author (on my last trip to Greece!)

Now really, humans have been predicting things for centuries

Just like any other form of divination, the Oracle of Delphi completely lacked any sort of scientific foundation. Its predictions were abstract and enigmatic, so they could be interpreted however one wished, which made it hard to tell whether they had been right or wrong. Scientifically grounded predictions arrived much later. Formal discussions of inference date back to Arab mathematicians and cryptographers during the Islamic Golden Age (8th–14th century). In that era, permutations and combinations were studied systematically for the first time, and there is evidence of an early form of statistical inference, frequency analysis, being applied to decode encrypted messages.
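To make this concrete, here is a minimal sketch (in Python, on an invented toy ciphertext) of the kind of frequency analysis those early cryptographers pioneered: count how often each symbol appears in an encrypted message and compare that against the typical letter frequencies of the language.

```python
from collections import Counter

# A toy ciphertext produced with a simple substitution cipher (invented example).
ciphertext = "GWWG XJG GWWGZQ XJG"

# Count how often each letter appears in the ciphertext.
counts = Counter(c for c in ciphertext if c.isalpha())

# Rank letters from most to least frequent. In a substitution cipher, the most
# frequent ciphertext letters likely stand for the most frequent plaintext
# letters of the language (e.g. E, T, A in English).
for letter, count in counts.most_common():
    print(letter, count)
```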

However, the field of statistics as we know it today arrived much later, around the 19th century. It was during that time that concepts like standard deviation, correlation, regression analysis, the Pearson distribution, and the method of moments were defined. Statistics started to be applied in new fields, from the study of human characteristics to industry and politics.

Some years later, between 1920 and 1935, concepts like the design of experiments, sufficiency, type II errors, and confidence intervals appeared. That means that from the 1930s up until today, statistics has been applied to improve decision-making and perform inference based on data. A good example of human progress thanks to these techniques can be found in medicine, where experiments in the form of clinical trials have helped draw sound conclusions about the benefits and side effects of new medicines.
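As a small sketch of how such a comparison works in practice, here is the kind of hypothesis test used to compare a new treatment against a control group, using invented data and SciPy:

```python
import numpy as np
from scipy import stats

# Invented outcome measurements for a control group and a treatment group.
rng = np.random.default_rng(42)
control = rng.normal(loc=50.0, scale=10.0, size=100)    # e.g. placebo
treatment = rng.normal(loc=54.0, scale=10.0, size=100)  # e.g. new medicine

# Two-sample t-test: is the difference in means statistically significant?
t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# A small p-value (commonly < 0.05) suggests the observed difference
# is unlikely to be due to chance alone.
```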

Machine Learning is older than you think

While statistics kept evolving and became more widely used across fields, another line of research reached a milestone in 1958. The first trainable neural network, the Perceptron, was demonstrated by Cornell University psychologist Frank Rosenblatt. Back then it took the form of a physical machine designed for image recognition. Although Rosenblatt already talked about this machine as something that would evolve to be able to talk, write, and even be conscious, the limitations of perceptrons soon became clear: a single-layer perceptron can only learn patterns that are linearly separable, which caused research in that direction to cool down for some years.

Perceptron in a schema (image by the author)
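To show how simple the core idea is, here is a minimal sketch (NumPy only, with an invented toy dataset) of a perceptron trained with the classic perceptron learning rule:

```python
import numpy as np

# Toy, linearly separable data: two inputs, binary label (invented example).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])  # logical AND

w = np.zeros(2)  # weights
b = 0.0          # bias

# Perceptron learning rule: nudge the weights whenever a prediction is wrong.
for _ in range(10):
    for xi, target in zip(X, y):
        prediction = int(np.dot(w, xi) + b > 0)
        error = target - prediction
        w += error * xi
        b += error

print(w, b)                                       # learned parameters
print([int(np.dot(w, xi) + b > 0) for xi in X])   # should match y
```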

Machine Learning gets stronger

Over the following years, there was steady progress in the field of Neural Networks, including feedforward networks, stochastic gradient descent, and backpropagation. But it wasn’t until roughly 1990–2006 that the field gained importance, as real applications of Neural Networks started to appear and the Deep Learning subfield emerged. The main levers of that revolution were:

  • Bigger volumes of data available: it was around that time that the term Big Data was popularized
  • More computational power: also around that time, cloud computing started to take off

This unlocked powerful new types of prediction models in areas such as Computer Vision (image classification, facial recognition…) and Natural Language Processing (sentiment analysis, machine translation…).

Example of Sentiment Analysis (image by the author)
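As a deliberately tiny illustration of “traditional” Machine Learning for NLP, here is a sentiment classifier sketch with scikit-learn on a handful of invented labeled sentences; a real system would of course need far more data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny, invented training set: sentences labeled with their sentiment.
texts = [
    "I loved this movie, it was fantastic",
    "What a great and enjoyable experience",
    "Absolutely terrible, a waste of time",
    "I hated every minute of it",
]
labels = ["positive", "positive", "negative", "negative"]

# Classic recipe: turn text into TF-IDF features, then fit a linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["this film was great"]))  # expected: ['positive']
```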

Another type of model appeared, usually grouped under the umbrella of prescriptive solutions: trying to make something happen, instead of merely predicting it. Recommender systems are a great example of this: using Machine Learning or Deep Learning models to influence a user to watch a movie (e.g. Netflix), listen to a song (e.g. Spotify), buy an item (e.g. Amazon), or click and read a link (e.g. Google). Recommender systems rely, again, on large volumes of data (usually interaction data between users and products) and significant computational power.
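As a toy illustration of the idea, here is a minimal item-based recommender sketch on an invented interaction matrix (NumPy only); real systems are far more sophisticated:

```python
import numpy as np

# Invented user-item interaction matrix: rows = users, columns = items,
# 1 = the user watched/bought/clicked the item.
interactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
])

# Cosine similarity between items, based on which users interacted with them.
item_vectors = interactions.T.astype(float)
norms = np.linalg.norm(item_vectors, axis=1, keepdims=True)
similarity = (item_vectors @ item_vectors.T) / (norms @ norms.T)

# Recommend: for a user who liked item 0, rank other items by similarity to it.
liked_item = 0
scores = similarity[liked_item].copy()
scores[liked_item] = -1  # don't recommend what they already have
print("Recommended item:", int(np.argmax(scores)))
```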

Getting closer to Artificial Intelligence

We already had plenty of cool models and applications by then, but in 2017 a paper happened. That paper was called “Attention Is All You Need”, written by several researchers mainly from Google. It introduced a new type of Deep Learning architecture, the Transformer, which brought three significant new abilities compared to previous neural networks:

  • Learn from (even) more data
  • Scale efficiently through parallel processing
  • Pay attention to the meaning of the input, via the self-attention mechanism (sketched below)
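For the curious, here is a minimal NumPy sketch of the scaled dot-product attention at the heart of the Transformer; the queries, keys, and values below are invented random matrices, whereas real models learn them from data:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Each token's query is compared against every token's key...
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # ...and the resulting weights decide how much of each value to blend in.
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

# Invented example: a "sentence" of 4 tokens, each represented by 8 numbers.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
output, weights = scaled_dot_product_attention(Q, K, V)
print(output.shape, weights.shape)  # (4, 8) (4, 4)
```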

That changed everything. Shortly after the paper, transformer-based models started appearing. Initially, these models, such as BERT, were used as pre-trained models: they had been trained on vast amounts of text and therefore “understood” text and language, which gave teams access to much more powerful models without having to train them from scratch. However, each use case, or even each domain, still needed to fine-tune the pre-trained model on a specific prediction task.
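As a rough sketch of what that workflow can look like in practice, using the Hugging Face transformers library (one common choice, assumed here), you load a pre-trained BERT, attach a fresh classification head, and then fine-tune it on your own labeled data:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load a model pre-trained on vast amounts of text...
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3,  # e.g. positive / neutral / negative
)

# ...then fine-tune it on your own labeled examples (training loop omitted),
# for instance with the transformers Trainer API or a standard PyTorch loop.
inputs = tokenizer("I really enjoyed this product", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # one score per class; random until fine-tuned
```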

Further research and progress were built on top of the Transformer architecture. This included a deeper exploration of Transfer Learning, multi-task learning, instruction fine-tuning, and Reinforcement Learning from Human Feedback (RLHF). In part, all this work culminated in the launch of ChatGPT at the end of 2022. Just like BERT, GPT models are based on the transformer architecture but differ in size (GPT-3’s 175 billion parameters vs BERT’s 340 million), objective, and other characteristics.

In the case of GPT, the objective is to generate human-like text, and it definitely manages to do so, impressing people around the world with how human-like and smart its responses feel. All of us soon got creative about what to ask it, from poems and rap songs to holiday planning, business plans, and roadmaps. For the first time, these models didn’t need to be fine-tuned for specific tasks: for something like sentiment analysis, you could simply ask the model to respond with the sentiment (positive, neutral, or negative) of a sentence.
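As a sketch of this zero-shot usage with the OpenAI Python client (assuming an API key is configured; the model name below is just an illustrative choice), the same sentiment task becomes a matter of asking:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = (
    "Classify the sentiment of the following sentence as positive, neutral, "
    "or negative. Reply with a single word.\n\n"
    "Sentence: I really enjoyed this product."
)

# No fine-tuning needed: the instruction in the prompt is enough.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # example model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # e.g. "Positive"
```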

In parallel, something similar was happening for images, with the explosion of diffusion models such as Stable Diffusion, which can generate images (and now videos too) from text. DALL·E, also from OpenAI, is another example: its first version was built on a multimodal version of GPT-3, while later versions combine a transformer-based text encoder with a diffusion decoder.

What’s left for the future?

It is really hard to make predictions about how this field will evolve, especially as progress seems to happen way faster than anyone expects. Short of resorting to hallucinogenic gases like Pythia, I’ve tried my best to put together what I believe will mark the future of AI:

  • AI for everyone, the democratization of AI: everybody will get more and more access to AI products and find ways to boost their productivity and creativity, or integrate them into their day-to-day lives as new commodities.
  • AI gets even more important for companies: more companies and products will include AI, as all this progress unlocks new business models and new ways to deliver value to their clients and users.
  • New and more powerful model versions, anything-to-anything models (from text to audio, from audio to video, from…), and interactive AI or agents (models that are able to use tools to make things happen, call other models as needed, and so on).
  • Ethics, potentially harmful applications, and risks to data privacy will be more relevant than ever, mainly because AI will be a bigger part of our lives. Thankfully, new AI-specific legislation is already on its way to mitigate these risks (the EU’s AI Act is a good example).

Wrapping it up

From ancient oracles to Generative AI: the story (image by author)

Throughout this blog post, we’ve seen a fascinating evolution in humanity’s search for tools to predict the future. Interestingly, no matter what the prediction tool was, it worked the same way: you provide an input, and the “tool” or “model” provides an output.

Input/output model schema (image by author)
  • With Pythia, the input could be a person sharing his or her story, and the output, a cryptic response to whether he or she would get married.
  • With Statistics, the input could be the results of a clinical trial for a new medicine, and the output, whether the new medicine would work better in the future than the current one.
  • With “traditional” Machine Learning, the input could be a sentence, and the output its sentiment.
  • And finally, with generative AI, inputs can be questions in the form of prompts, and the output, text that reads much like a human response to your question.

And in the future? One thing is for sure: AI’s future looks exciting!

References

[1] Pythia, Oracle of Delphi https://en.wikipedia.org/wiki/Pythia

[2] Statistics https://en.wikipedia.org/wiki/Statistics

[3] Perceptron https://en.wikipedia.org/wiki/Perceptron

[4] Attention Is All You Need https://arxiv.org/abs/1706.03762

[5] GPT vs BERT
