21 Generative AI Jargons Simplified: Part 3— Foundation Model

Raja Gupta
5 min readJul 4, 2024

This blog is part of a series where I explain 21 frequently used Generative AI jargons in simple words. To make it super simple and interesting, I will provide an analogy as well.

Note: I am publishing it as a series. Parts 4 to 21 are yet to be published.

You may subscribe to get an email when I publish the next blog in this series.

  1. Prompt Engineering
  2. AI Model
  3. Foundation Model [Current Blog]
  4. AI Hallucination
  5. Retrieval-Augmented Generation (RAG)
  6. Grounding
  7. Natural Language Processing (NLP)
  8. Explainable AI
  9. Prompt Injection Attack
  10. Overfitting and Underfitting
  11. Multimodality
  12. Autoencoders
  13. Computer Vision
  14. Transfer Learning
  15. AI Detectors
  16. Adversarial Attacks
  17. Data Augmentation
  18. Generative Adversarial Networks (GANs)
  19. Variational Autoencoders (VAEs)
  20. Transformer-Based Models
  21. AI Poisoning Attacks

This is the 3rd blog in this series. The jargon is Foundation Model.

Side Note: I strongly recommend going through the previous blog series, Generative AI for Beginner. It will take only 90 minutes of your time. No prerequisites.

Let’s start!

The Problem That Led to the Invention of the Foundation Model

Building and training large-scale AI models from scratch requires significant computational resources, time, and data. A fully functional AI model can cost hundreds of millions of dollars to build. What if there were an AI model that is already trained on a huge amount of data and can be adapted for a wide range of use cases?

This is the idea behind the foundation model.

An Analogy to Understand Foundation Model

Imagine a foundation model as a Swiss Army knife. The Swiss Army knife has everything: a knife, scissors, a bottle opener, a tiny saw, tweezers, a toothpick, and more. It can be used to perform a variety of tasks.

Similar to the Swiss Army knife, a foundation model has a set of different capabilities packed into one powerful model. Whether you need an AI model to generate images, generate text, extract information, or identify objects, you can take a foundation model and adapt it to your need.

So, instead of building a separate AI model for each task, you can just take the foundation model — your AI Swiss Army knife — and handle a variety of tasks.

So, What Exactly Is a Foundation Model?

A foundation model is a deep learning model that is trained on a huge amount of data, usually with self-supervised learning.

The model can then be adapted to perform a wide range of tasks such as text generation, image generation, video generation, sentiment analysis, information extraction, etc.

Foundation models can be considered as general-purpose technologies that can support a diverse range of use cases. Since these models are pre-trained on vast datasets using powerful hardware and techniques, they can be used to save time and computational costs compared to training new models for each specific application.

The image below summarizes important points about the foundation model.

Foundation models leverage the concept of transfer learning, where knowledge gained from solving one task is applied to another related task. By fine-tuning a pre-trained foundation model with domain-specific data or adjusting its parameters, developers can tailor it to perform well in specific contexts or improve its accuracy on particular tasks.
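To make transfer learning concrete, here is a minimal sketch in plain NumPy. Everything in it is made up for illustration: the "pre-trained" weights are random stand-ins for a foundation model's base, the tiny labeled dataset is invented, and only the small task-specific head is trained while the base stays frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" base: in a real foundation model these weights come from
# large-scale pre-training; here they are random stand-ins.
W_base = rng.normal(size=(4, 8))  # frozen during fine-tuning

def features(x):
    """Frozen base: maps raw inputs to reusable representations."""
    return np.tanh(x @ W_base)

# Small task-specific head, trained from scratch on a little labeled data.
W_head = np.zeros(8)

def predict(x):
    return features(x) @ W_head

# Tiny labeled fine-tuning set (hypothetical task: label is 1 when the
# first input feature is positive, else 0).
X = rng.normal(size=(32, 4))
y = (X[:, 0] > 0).astype(float)

# Gradient descent on the head only -- the base never changes.
lr = 0.1
for _ in range(200):
    h = features(X)
    err = h @ W_head - y
    W_head -= lr * h.T @ err / len(X)

print("train MSE after fine-tuning:", float(np.mean((predict(X) - y) ** 2)))
```

The point of the sketch is the division of labor: the expensive part (the base) is reused as-is, and only a small, cheap component is fitted to the new task, which is exactly why adapting a foundation model is so much cheaper than training from scratch.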

Example of Foundation Models

Some popular examples of foundation models are language models (LMs) like BERT from Google and GPT from OpenAI. Further, newer foundation models have been created, such as DALL-E for images and MusicGen for music.

What’s So Unique About Foundation Models?

Due to the heavy cost and hardware requirements, it’s not feasible for everyone (even many big organizations) to build foundation models from scratch. It takes millions of dollars and a huge effort to develop one. To give you some examples:

BERT, one of the first bidirectional foundation models, released in 2018, has 340 million parameters and was trained on a roughly 16 GB text dataset.

GPT-3, released by OpenAI in 2020, has 175 billion parameters and was trained on about 45 TB of raw text data (filtered down to roughly 570 GB).

However, these foundation models are super useful in the long run. They are adaptable and can be further trained to perform specific tasks. Hence, it’s faster and cheaper for us to use pre-trained foundation models to develop more specialized AI applications.

An Example to Show How Foundation Model Can be Used

Imagine we want to set up a customer service chatbot to help answer customer queries for an online store.

Without a Foundation Model

If we set up this customer service chatbot without a foundation model, we will have to do everything from scratch.

  • We will need to collect a vast amount of customer interaction data.
  • We will have to train the chatbot model from scratch, which is time-consuming and requires significant computational resources.
  • We will also need to manually program responses and update the chatbot regularly to handle new types of questions or issues.

With a Foundation Model

If we use a foundation model, things become much easier and more cost-effective.

  • Instead of starting from scratch, we can use a pre-trained foundation model, say OpenAI’s GPT-3, which is already trained on a wide variety of text data and understands human language very well.
  • We can fine-tune the foundation model with a smaller, specific set of data from our online store’s customer interactions. This helps the model understand the common questions and concerns our customers have.
  • Now, the chatbot can handle a wide range of queries with minimal additional training. It’s capable of understanding and responding to questions about order status, return policies, product details, and more, providing relevant and accurate answers.

To summarize, using a foundation model saves time and resources. We don’t need to build and train a model from scratch.
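The workflow above can be sketched as a toy program. The class, intents, and canned answers below are all invented for illustration; a real project would fine-tune an actual pre-trained language model, but the shape of the work is the same: start from broad pre-trained capability, then add a small store-specific layer on top.

```python
class FoundationChatbot:
    """Toy stand-in for a pre-trained model with broad, generic knowledge."""

    def __init__(self):
        # "Pre-training": broad but generic capability, already paid for.
        self.knowledge = {
            "greeting": "Hello! How can I help you today?",
            "thanks": "You're welcome!",
        }

    def fine_tune(self, domain_examples):
        """Adapt the model with a small, store-specific dataset."""
        self.knowledge.update(domain_examples)

    def reply(self, intent):
        return self.knowledge.get(intent, "I'm not sure, let me check.")


bot = FoundationChatbot()

# Small domain-specific dataset -- far cheaper than training from scratch.
bot.fine_tune({
    "order_status": "You can track your order under My Orders.",
    "returns": "Returns are accepted within 30 days of delivery.",
})

print(bot.reply("greeting"))      # pre-trained capability, kept as-is
print(bot.reply("order_status"))  # new capability from fine-tuning
```

Note how the generic replies survive fine-tuning untouched while the store-specific ones are layered on top: that is the whole economic argument for foundation models in miniature.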

Next Blog

21 Generative AI Jargons Simplified: Part 4 — AI Hallucination [To be published]

If you have come this far, please give a clap to appreciate my hard work!

Follow me for more such content on AI, SAP and beyond!


Raja Gupta

Author ◆ Blogger ◆ Solution Architect at SAP ◆ Demystifying Tech & Sharing Knowledge to Empower People