Generative AI for Beginners: Part 5 — What is Large Language Model?

Raja Gupta
9 min read · Mar 5, 2024

This blog is part of the series Generative AI for Beginners, where we are learning the basics of Generative AI, one simple step at a time.

To make it easy to grasp, I have divided the entire series into small parts. Each blog takes at most 15–20 minutes to read. After finishing the series, you will have a clear idea of the fundamentals of Generative AI and its various aspects.

Part 1 — Introduction to AI

Part 2 — Understanding Machine Learning

Part 3 — Deep Learning: The Fundamental Pillar of Generative AI Advancement

Part 4 — Introduction to Generative AI

Part 5 — What is Large Language Model (LLM)? [current blog]

Part 6 — Prompt Engineering: The Art of Communicating with AI

Part 7 — Ethical Considerations in Generative AI

Part 8 — Challenges and Limitations in Generative AI

This is the 5th blog in this series, where we will learn about large language models.

Let’s quickly recap what we have learnt so far!

Artificial Intelligence (AI)

  • With a simple analogy and example, we learnt what AI is.
  • We learnt the capabilities of AI and how it is changing our day-to-day life.
  • We looked into different types of AI with examples.
  • We also understood how AI is different from human intelligence.

Machine Learning (ML)

  • With a simple analogy and example, we learnt what machine learning is.
  • We got a clear idea of supervised, unsupervised, and reinforcement learning.
  • We learnt how ML is different from AI.
  • We looked into real-life examples and applications of ML.

Deep Learning

  • We learnt how deep learning is inspired by the human brain.
  • We understood how an artificial neural network works.
  • We learnt how deep learning is used to solve complicated problems.

Generative AI

  • With a simple analogy and example, we learnt what Generative AI is.
  • We understood how Generative AI is different from AI.
  • We looked into real-life examples and applications of generative AI.

Now, let’s continue our journey and try to understand the concept that enables an AI system to communicate with humans: the Large Language Model.

Where does a Large Language Model fit into Generative AI?

Let’s take the example of ChatGPT to understand it clearly. One of the many capabilities of ChatGPT is to understand human language (questions asked in plain English). It can also generate responses that we humans can understand. This capability of ChatGPT to communicate with humans is powered by Large Language Models.

In other words, a generative AI system that needs to generate human-like text needs a large language model.

Let’s break it down further in layman’s terms!

What is a Language Model?

Let’s first understand what a language model is.

A language model is:

  • a type of machine learning model
  • which uses various statistical and probabilistic techniques
  • to predict the probability of a given sequence of words in a sentence or phrase.

In simple words, a language model is designed to predict the most suitable next word to fill in a blank in a sentence or phrase, based on the context of the given sentence or phrase.

Let’s take an example to understand better!

When we use a messaging app on a phone, it helps us by predicting the next word as we type a message. For example, as soon as we type “how,” the phone might suggest words like “are” or “is” because it has learned that those words often come after “how” in sentences.

Similarly, if we type “I am going to the,” the phone might predict words like “store,” “park,” “office,” or “beach” because those words commonly come next in everyday language.

This prediction is made based on the context of what we have typed so far and the patterns it has learned from analyzing lots of text.
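To make the idea concrete, here is a minimal sketch of next-word suggestion in Python. It counts which word follows which in a tiny made-up corpus (a so-called bigram model). The corpus and the `suggest` function are illustrative assumptions; a real phone keyboard learns from far more text and uses more sophisticated models.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only).
corpus = [
    "how are you",
    "how are things",
    "how is it going",
    "i am going to the store",
    "i am going to the park",
    "i am going to the store today",
]

# Count how often each word follows each other word (a bigram model).
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1


def suggest(word, k=2):
    """Return up to k words that most often followed `word` in the corpus."""
    return [w for w, _ in next_word_counts[word].most_common(k)]


print(suggest("how"))  # words seen after "how", most frequent first
print(suggest("the"))  # words seen after "the", most frequent first
```

The model only looks at the single previous word, which is exactly the limitation that larger models overcome by considering much more context.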

Large Language Model (LLM)

A large language model (LLM) is:

  • a type of language model
  • that is trained on a very large dataset of text
  • and uses advanced neural network architectures
  • to generate or predict human-like text.

Coming back to our earlier example, it is a language model that helps AI tools predict the upcoming words in a sentence. A large language model does the same job at a much larger scale.

The image below summarizes the important points about large language models.

The most distinctive and powerful feature of large language models is their ability to generate human-like text. Because they are trained on vast amounts of text data, they are highly proficient in language processing tasks such as text generation, summarization, translation, and sentiment analysis.

Natural Language Processing (NLP)

Natural Language Processing is an important concept that is closely linked to LLMs.

Natural Language Processing (NLP) is a subfield of AI that focuses on the interaction between computers and humans through natural language (say, English).

  • NLP refers to the process of enabling computers to understand human language and communicate with us in the same language.
  • NLP uses algorithms to analyze, understand, and generate human language.
  • It also helps computers understand the context and sentiment behind words and sentences.

Let’s take another example to understand NLP better. A virtual assistant such as Siri can understand and respond to our commands using NLP.

Imagine you ask Siri, “Set an alarm for 7 AM tomorrow.”

  • Siri’s NLP algorithms analyze the sentence, breaking it down into individual words and understanding their meanings, grammar, and context.
  • The NLP algorithm will be able to understand the user’s intent, which is to set an alarm.
  • Next, Siri performs the action specified in the command: setting an alarm for 7 AM the following day on your device.
  • Finally, Siri will give a response in your language.
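The steps above can be sketched with a single hand-written rule. This is not how Siri works internally; it is a deliberately simple, rule-based illustration of turning a sentence into an intent plus its details. The function name `parse_alarm_command` and the output fields are assumptions made for this example.

```python
import re
from datetime import date, timedelta


def parse_alarm_command(text, today):
    """Extract the intent and time from a command like 'Set an alarm for 7 AM tomorrow.'"""
    lowered = text.lower()
    match = re.search(r"set an alarm for (\d{1,2})(?::(\d{2}))?\s*(am|pm)", lowered)
    if match is None:
        return None  # this single rule does not recognise the command

    hour = int(match.group(1)) % 12       # 12 AM becomes hour 0
    if match.group(3) == "pm":
        hour += 12                        # 7 PM becomes hour 19
    minute = int(match.group(2) or 0)

    # Resolve the word "tomorrow" relative to the given date.
    day = today + timedelta(days=1) if "tomorrow" in lowered else today
    return {"intent": "set_alarm", "date": day, "hour": hour, "minute": minute}


command = parse_alarm_command("Set an alarm for 7 AM tomorrow.", today=date(2024, 3, 5))
print(command)
```

A rule like this breaks as soon as the user rephrases the request (“Wake me up at seven”), which is why modern assistants rely on learned models rather than fixed patterns.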

Natural Language Processing is the backbone of tasks such as responding to humans (e.g. ChatGPT), language translation, and search.

Natural Language Processing (NLP) and Large Language Model (LLM)

Large Language Models may be considered an evolution of Natural Language Processing models. In other words, a large language model is a model designed for NLP tasks, with a focus on understanding and generating human-like text.

While NLP includes a broad range of models and techniques for processing human language, LLMs focus on understanding and generating human-like text. LLMs are specially designed to predict the probability of a word or sentence based on the words that come before it, allowing them to generate coherent and contextually relevant text.
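As a rough illustration of “the probability of a sentence based on the words that come before”, the sketch below multiplies the probability of each word given the previous one. The probability values are made up, and conditioning on only one previous word is a simplification; a real LLM conditions on all preceding words and computes these probabilities with a neural network.

```python
# Hypothetical conditional probabilities P(next word | previous word).
# "<s>" is a start-of-sentence marker; all numbers are invented for illustration.
cond_prob = {
    ("<s>", "how"): 0.2,
    ("how", "are"): 0.5,
    ("how", "is"): 0.3,
    ("are", "you"): 0.4,
}


def sequence_probability(words):
    """Multiply the probability of each word given the word before it."""
    prob = 1.0
    previous = "<s>"
    for word in words:
        # Word pairs never seen get probability 0 in this toy table.
        prob *= cond_prob.get((previous, word), 0.0)
        previous = word
    return prob


p = sequence_probability(["how", "are", "you"])
print(round(p, 4))  # 0.2 * 0.5 * 0.4
```

Generating text is the same computation run forwards: at each step the model picks a likely next word, appends it, and repeats.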

From a machine learning point of view, natural language processing uses a wide range of techniques, from rule-based methods to machine learning and deep learning approaches.

Large language models, on the other hand, mainly use deep learning to capture patterns and context in text data and predict the probability of the next word in a sequence. LLMs are built on artificial neural networks, and most of them are based on the transformer architecture.

How is a Large Language Model related to Generative AI?

Large Language Models (LLMs) are a subset of Generative AI. While generative AI can generate many types of content such as text, images, video, code, and music, LLMs focus on generating text (which can include code).

Where and How are Large Language Models Used?

Large Language Models (LLMs) are used in various AI applications across different industries. Here are some major examples:

Virtual Assistants

LLMs are increasingly the engine that powers virtual assistants such as Siri, Alexa, or Google Assistant. The language model analyzes the human command and interprets its meaning, helping the virtual assistant perform actions on the user’s behalf.

Chatbots

ChatGPT is not a new word anymore. Most of us have used it or similar conversational AI chatbots. These chatbots use large language models to understand human questions and respond in a way that mimics human language.

Language Translation

Large language models play an important role in language translation done by AI tools such as Google Translate. These models are trained on huge amounts of multilingual text data, which enables them to capture the subtle distinctions, variations, context, and complexity of different languages.

When we ask a translation tool to translate a sentence, it uses a language model to analyze the input text in one language and generate an accurate and contextually appropriate translation in the target language.

By considering the relationships between words and phrases in both languages bidirectionally, LLMs can produce translations that preserve the meaning and tone of the original text.

Text Generation

Nowadays, large language models are used in many applications to generate human-like text. These models are so sophisticated that they can generate coherent and contextually relevant text based on a given prompt or input. LLMs can be used to compose stories, generate product descriptions, write emails, and much more.

Summarization

Large language models are very useful for document summarization. Using their natural language processing capabilities, LLMs can condense lengthy documents or articles into concise summaries while preserving the key information and main points. Using techniques such as attention mechanisms, LLMs can determine the most salient information to include in the summary, ensuring that it captures the essence of the original text.
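To give a feel for what an attention mechanism computes, here is a minimal sketch of scaled dot-product attention weights in plain Python. The vectors are made up: in a real model they are learned, have hundreds of dimensions, and represent individual tokens rather than whole sentences. The function names are assumptions for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over several key vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Made-up 2-dimensional vectors standing in for three pieces of a document.
keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.0]  # hypothetical vector for "what we are looking for"

weights = attention_weights(query, keys)
print([round(w, 2) for w in weights])
```

The pieces whose vectors point in a similar direction to the query receive the largest weights, which is the sense in which attention lets a model focus on the most relevant parts of the input.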

Sentiment Analysis

Sentiment analysis is the process of determining the sentiment or emotional tone expressed in a text. Large language models can be used to analyze huge amounts of text data, understand the context, nuances, and tone of language, and identify sentiment polarity (positive, negative, or neutral).

Many organizations nowadays use large language models to identify sentiment in text data coming from social media posts, product reviews, customer feedback, news articles, and similar sources.
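To show what “sentiment polarity” means in practice, here is a deliberately simple word-list approach in Python. This is not an LLM: it only counts positive and negative words from two small hand-made lists (an assumption for this example), so it serves as a baseline to compare against.

```python
# Small hand-made word lists (illustrative assumptions, not a real lexicon).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}


def sentiment(text):
    """Classify text as positive, negative, or neutral by counting known words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("I love this phone, the camera is great!"))
print(sentiment("Terrible battery and slow delivery."))
print(sentiment("The package arrived on Monday."))
```

A word counter like this gets “not bad at all” wrong, because it ignores context. Understanding context and nuance is precisely where large language models do better.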

Content Recommendations

Large language models (LLMs) are increasingly being used by platforms such as Netflix, YouTube, and Amazon to provide users with more personalized and relevant content recommendations. These models capture the relationships between words, phrases, and topics, allowing them to understand the meaning and context of content. For recommendations, LLMs analyze a user’s interactions with content, such as articles they’ve read, products they’ve bought, or videos they’ve watched. Based on this data, LLMs can predict what other content a user might be interested in and suggest relevant options.
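The core idea, matching what a user has engaged with against descriptions of other content, can be sketched with simple word counts and cosine similarity. This is a stand-in for the much richer text representations an LLM would produce, and the catalog titles and descriptions below are invented for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent text as a bag of word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0 means no overlap)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(history, catalog, k=1):
    """Return the k catalog titles whose descriptions best match the user's history."""
    h = vectorize(history)
    ranked = sorted(catalog, key=lambda title: cosine(h, vectorize(catalog[title])), reverse=True)
    return ranked[:k]


# Hypothetical catalog and viewing history.
catalog = {
    "Space Documentary": "space planets stars science documentary",
    "Cooking Show": "cooking recipes food kitchen",
    "Rocket Launch Film": "rocket space launch science",
}
watched = "science documentary about space and stars"

print(recommend(watched, catalog, k=2))
```

Word counts only match exact words; a language model can also recognise that “galaxy” and “space” are related even when the words differ.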

Some Popular Examples of Large Language Models

Here are some popular large language models and the applications that use them.

GPT (Generative Pre-trained Transformers)

Generative Pre-trained Transformer (GPT) is probably the most popular family of large language models, and it is the one used in ChatGPT. After the introduction of the transformer architecture in 2017, OpenAI released GPT-1, its first transformer-based large language model, in 2018. GPT-1 was trained on BookCorpus, a dataset consisting of over 7,000 self-published books.

Subsequently, OpenAI released more advanced versions of GPT: GPT-2, GPT-3, GPT-3.5, and GPT-4. All of these are transformer-based large language models. GPT-4 is a multimodal model, which means it can take images as well as text as input.

BERT (Bidirectional Encoder Representations from Transformers)

Introduced by Google in 2018, BERT is a transformer-based language model that represents a significant advancement in natural language processing. It is bidirectional, which means it uses the context on both the left and the right of a word to understand that word. Like other transformer models, it processes the words of a sentence in parallel, making it more efficient to train than traditional sequential models such as recurrent neural networks (RNNs).

LaMDA (Language Model for Dialogue Applications)

LaMDA is a conversational large language model developed by Google, and it is also transformer-based. After the sudden rise of ChatGPT, Google announced its own conversational AI chatbot called “Bard”. Bard was initially powered by LaMDA.

Google also introduced PaLM (Pathways Language Model), which succeeded LaMDA as the model behind Bard. Further, in 2024, Google rebranded Bard with the new name “Gemini”. Gemini is powered by the large language model of the same name, a multimodal model that is the successor to LaMDA and PaLM.

LLaMA (Large Language Model Meta AI)

LLaMA (Large Language Model Meta AI) is a family of large language models (LLMs) introduced by Meta AI. LLaMA models are auto-regressive language models built on the transformer architecture.

I hope that by now you have a clear idea of large language models. If you still have any questions, please let me know in the comments or get in touch with me on LinkedIn!

Next Blog

Part 6 — Prompt Engineering: The Art of Communicating with AI

Enjoyed Reading? Follow me for more such insights on AI and beyond!

Raja Gupta

Author ◆ Blogger ◆ Solution Architect at SAP ◆ Demystifying Tech & Sharing Knowledge to Empower People