Glossary — Large language models

Prasad Thammineni
Towards Generative AI Applications
5 min read · May 6, 2023

In this article, I’ll break down some essential terms and concepts related to LLMs and AI in a way that’s easy for non-data scientists to understand. I’ll cover everything from neural networks to data augmentation, with simple explanations and examples for each term. So, let’s dive into the exciting world of computer brains and language learning!

[Image] LLM Glossary. Created with Adobe Firefly using the prompt: “Meanings of all difficult words in Artificial Intelligence and machine learning.”
  1. Artificial Intelligence (AI): It’s like a smart robot that can think and do things like humans. AI helps computers solve problems, make decisions, and understand our language. Example: Siri on your iPhone.
  2. Deep Learning: It’s a way computers learn from many examples, like how you learn from experience. Deep learning uses special computer programs called neural networks to find patterns in data. Example: A computer learning to recognize cats in pictures.
  3. Neural Network: A computer program that works like the human brain, using connected nodes (like brain cells) in layers. Example: A computer “brain” that can play a video game.
  4. Transformer: A type of neural network, introduced by Google researchers in 2017, that is especially good at understanding and generating language; it is the architecture behind most modern LLMs. Example: A computer that can chat with you like a friend.
  5. Large Language Model (LLM): A computer program that learns to understand and create human language by studying lots of text. Example: A computer that can write a story or answer your questions.
  6. Parameters: The numbers inside a neural network that are adjusted during training to help it learn. Example: Like tuning a guitar to make it sound better.
  7. Positional Encoding: A way transformers remember the order of words in a sentence. Example: Remembering that “the dog chased the cat” differs from “the cat chased the dog.” (A code sketch appears after this list.)
  8. Self-Attention: A way transformers weigh how much each word in a sentence should pay attention to every other word. Example: Knowing that “cake” is the key word in “I want to eat cake.” (See the toy attention code after this list.)
  9. Encoder: Part of a transformer that helps it understand and remember what you tell it. Example: A computer remembering the question, “What’s the weather like today?”
  10. Decoder: Part of a transformer that helps it create a response or answer. Example: A computer replying, “The weather today is sunny and warm.”
  11. BERT: A transformer model that helps computers understand language for tasks like guessing what people think about a movie. Example: A computer that knows if a review is positive or negative.
  12. GPT-3 and GPT-4: Transformer models from OpenAI that generate human-like text, such as completing a sentence or writing a summary. Example: A computer writing a book report for you.
  13. T5: A transformer model that’s good at both understanding and generating text, like translating one language to another. Example: A computer that can translate English to Spanish.
  14. Unsupervised Learning: When a computer learns patterns without being told what’s right or wrong. Example: A computer learning to group similar pictures together.
  15. Foundation Models: Big AI models, like LLMs, that can be adapted to many different tasks. Example: A computer that can help with homework, write emails, and tell jokes.
  16. Zero-Shot Learning: When a computer can do a task without being trained on it. Example: A computer that can play a new game without practicing first.
  17. Few-Shot Learning: When a computer can learn a new task with just a few examples. Example: A computer that can learn your favorite songs after hearing them once or twice. (A prompt-based sketch appears after this list.)
  18. Fine-Tuning: Adjusting a trained model to be better at a specific task. Example: Teaching a computer to understand and answer questions about dinosaurs.
  18. Prompt Tuning: Adapting a model to a task by learning a small set of extra prompt inputs, without changing the model itself. Example: Hand-rewording a question, like asking “What’s the capital of France?” instead of “Where’s Paris?”, is the related manual technique, usually called prompt engineering.
  20. Adapters: Tiny parts you can add to a trained model to help it do a specific task without changing it too much. Example: Adding a new skill to a computer game character without changing the whole game.
  21. Natural Language Processing (NLP): Teaching computers to understand, interpret, and create human language. Example: A computer that can chat with you or read your essay.
  22. Natural Language Understanding (NLU): Teaching computers to understand and find meaning in human language. Example: A computer that knows the difference between “I like cats” and “I don’t like cats.”
  23. Natural Language Generation (NLG): Teaching computers to create human-like text. Example: A computer that can write a story or a poem.
  24. Tokenization: Breaking text into words or parts of words, called tokens, to help computers understand language. Example: Splitting the sentence “I have a dog” into tokens: “I”, “have”, “a”, and “dog”. (A code sketch covering tokens and vocabulary appears after this list.)
  25. Vocabulary: The set of unique words or tokens a computer program can understand. Example: A computer knowing the words “apple”, “banana”, and “orange” but not “kiwi”.
  26. Pretraining: The first step in training an LLM, where it learns language from lots of text. Example: A computer reading lots of books and articles to learn how to write.
  27. Transfer Learning: Using what a computer learned from one task to help it do another related task. Example: A computer that learned to recognize cats using that knowledge to recognize dogs.
  28. Sequence-to-Sequence (Seq2Seq) Model: A type of model that changes one sequence, like text, into another sequence, like a translation. Example: A computer turning English text into French text.
  29. Attention Mechanism: A way for computers to focus on important parts of the input when creating an output. Example: A computer knowing “pizza” is the most important word in “I want to eat pizza.” (See the self-attention sketch after this list.)
  30. Beam Search: A method for finding a good sequence of words when a computer generates text, by keeping several candidate sentences in play instead of committing to the single most likely next word at each step. Example: A computer weighing a few possible continuations of a sentence and picking the best overall one. (A toy version appears after this list.)
  31. Perplexity: A way to measure how well a computer can predict text. Example: A lower perplexity means a computer is better at guessing what word comes next in a sentence. (A worked example appears after this list.)
  32. In-Context Learning: When a computer can change its behavior based on the input, without extra training. Example: A computer knowing how to answer a question about sports after talking about sports. (See the few-shot prompt sketch after this list.)
  33. Data Augmentation: Making a dataset bigger and more diverse by creating new samples, like rephrasing sentences. Example: Changing “The cat is on the mat” to “The cat sits on the mat.” (A synonym-swap sketch appears after this list.)
  34. Bias: When a computer makes mistakes because its training data isn’t balanced or representative. Example: A computer thinking all doctors are men because it mostly reads about male doctors.
  35. Explainable AI (XAI): Making computers’ decision-making processes easier for humans to understand. Example: A computer explaining why it thinks a certain movie is a comedy.
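
A few of the terms above click faster with a little code. The sketches below are toy Python illustrations under simplifying assumptions, not production recipes. First, positional encoding (term 7): a minimal version of the sinusoidal scheme from the original transformer paper, where every position in a sentence gets its own unique pattern of numbers.

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    positions = np.arange(seq_len)[:, None]          # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]         # even embedding dimensions
    angles = positions / (10000 ** (dims / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                     # sine on even dimensions
    pe[:, 1::2] = np.cos(angles)                     # cosine on odd dimensions
    return pe

print(positional_encoding(4, 8).round(2))            # one row per word position
```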
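
Next, self-attention (terms 8 and 29): a toy scaled dot-product attention in numpy. Each token builds a query, key, and value, scores itself against every other token, and takes a weighted mix of the values. The random embeddings and weight matrices here are stand-ins for real learned ones.

```python
import numpy as np

np.random.seed(0)
seq_len, d = 5, 8                        # 5 tokens, 8-dimensional embeddings
x = np.random.randn(seq_len, d)          # pretend these are word embeddings

Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv         # queries, keys, values

scores = Q @ K.T / np.sqrt(d)            # how much each token attends to each other
scores -= scores.max(axis=-1, keepdims=True)   # for numerical stability
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
output = weights @ V                     # each row: a context-aware token vector

print(weights.round(2))                  # each row sums to 1
```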
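
Few-shot and in-context learning (terms 17 and 32) are easiest to see in a prompt: the “training” happens entirely inside the text you send, and no model weights change. `send_to_llm` below is a hypothetical stand-in for whatever LLM API you actually use.

```python
# Few-shot, in-context learning: the examples live inside the prompt itself.
prompt = """Translate English to French.

English: cheese -> French: fromage
English: bread -> French: pain
English: apple -> French:"""

# Hypothetical call; swap in your actual LLM client here.
# completion = send_to_llm(prompt)   # expected continuation: " pomme"
print(prompt)
```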
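
Tokenization and vocabulary (terms 24 and 25) in miniature. Real LLMs use subword tokenizers (such as byte-pair encoding), but this word-level version shows the idea: text in, token IDs out.

```python
sentence = "I have a dog"
tokens = sentence.split()                            # ['I', 'have', 'a', 'dog']

vocabulary = {"I": 0, "have": 1, "a": 2, "dog": 3}   # toy vocabulary: token -> ID
token_ids = [vocabulary[t] for t in tokens]          # [0, 1, 2, 3]
print(tokens, token_ids)
```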
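
Beam search (term 30) over a made-up next-word table. Greedy decoding would lock in “the” (probability 0.6) and end at 0.30 overall; keeping two candidate sentences alive finds “a cat” at roughly 0.36.

```python
# Toy next-word probabilities, keyed by the previous word ("" = sentence start).
next_probs = {
    "": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.9, "dog": 0.1},
}

def beam_search(steps: int = 2, beam_width: int = 2):
    beams = [([], 1.0)]                    # (words so far, probability)
    for _ in range(steps):
        candidates = []
        for words, prob in beams:
            last = words[-1] if words else ""
            for word, p in next_probs.get(last, {}).items():
                candidates.append((words + [word], prob * p))
        # Keep only the beam_width most likely partial sentences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

print(beam_search())   # best: ['a', 'cat'] with probability ~0.36
```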
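
Perplexity (term 31) from first principles, using hypothetical probabilities a model assigned to each actual next token. A perplexity of about 3.6 means the model was, on average, about as unsure as if it were choosing among 3 to 4 equally likely words.

```python
import math

token_probs = [0.40, 0.25, 0.60, 0.10]   # hypothetical per-token probabilities

# Perplexity = exp(average negative log-probability of the actual tokens).
avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(avg_neg_log)
print(round(perplexity, 2))              # ~3.59; lower is better
```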
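
Finally, data augmentation (term 33) via a simple synonym swap. Real pipelines also use back-translation and paraphrasing models; this sketch just shows the idea of making more varied copies of your data.

```python
import random

# Tiny hand-made synonym table; a real system would use a thesaurus or a model.
synonyms = {"cat": ["kitty", "feline"], "mat": ["rug", "carpet"]}

def augment(sentence: str) -> str:
    # Replace each word with a random synonym when one is available.
    words = sentence.split()
    return " ".join(random.choice(synonyms.get(w, [w])) for w in words)

random.seed(1)
print(augment("The cat is on the mat"))  # e.g. "The kitty is on the rug"
```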
