8 Types of Prompt Engineering

Amir Aryani
14 min read · Dec 2, 2023

There are 8 prompt engineering methods:
(1) Zero-Shot Learning, (2) One-Shot Learning, (3) Few-Shot Learning,
(4) Chain-of-Thought Prompting, (5) Iterative Prompting,
(6) Negative Prompting, (7) Hybrid Prompting, and
(8) Prompt Chaining

Prompt engineering is the practice of wording inputs to large language models (LLMs) such as GPT-3 or GPT-4 so that they return the desired output. This article applies 8 distinct prompting techniques to Mistral AI, using Foundation Models as the running topic, to see how each technique shapes the response.

1. Zero-Shot Learning: This involves giving the AI a task without any prior examples. You describe what you want in detail, since the AI has no example of the task to work from.

Prompt: “Explain what a large language model is.”


Mistral Response:

A Foundation Model in artificial intelligence refers to a large,
powerful, and versatile machine learning model that has been trained on a
vast amount of data and is capable of performing a wide range of tasks.
It is called a foundation model because it serves as a base or starting
point for various applications and can be fine-tuned or adapted for
specific tasks. Foundation models are typically pre-trained on
unstructured data such as text, images, and audio, and can be used for
natural language processing, computer vision, speech recognition, and
other tasks without requiring additional training. They are designed to be
highly flexible and adaptable, allowing them to handle new tasks and
domains with minimal fine-tuning. Some examples of foundation models
include BERT, RoBERTa, and DALL-E.

2. One-Shot Learning: You provide one example along with your prompt. This helps the AI understand the context or format you’re expecting.

Prompt: “A Foundation Model in AI refers to a model like GPT-3,
which is trained on a large dataset and can be adapted to various tasks.
Explain what BERT is in this context.”

Mistral Response:

BERT (Bidirectional Encoder Representations from Transformers)
is a type of pre-trained deep learning model that uses the transformer
architecture.

It was developed by Google and is considered to be one of the most
powerful language models in the field of natural language processing
(NLP). BERT is trained on a large dataset of text, such as Wikipedia
articles or books, using a technique called masked language modeling. This
involves randomly masking parts of the text and trying to predict what
those parts are based on the surrounding context. The model can then be
fine-tuned for specific NLP tasks, such as question answering or sentiment
analysis, without requiring a large amount of task-specific data.

In the context of foundation models, BERT is considered to be one of the
most effective and widely used models for various NLP tasks. It has been
shown to outperform previous state-of-the-art models on a wide range of
tasks, and its pre-trained representations have become an important
resource in the field of NLP.

Note that we had to define the context; otherwise, BERT might have been described as the Building Employees Redundancy Trust (https://www.bert.com.au).

3. Few-Shot Learning: This involves providing a few examples (usually 2–5) to help the AI understand the pattern or style of the response you’re looking for.

Prompt: “Foundation Models such as GPT-3 are used for natural language
processing, while models like DALL-E are used for image generation.
How are Foundation Models used in the field of robotics?”


Mistral Response:

Foundation Models such as GPT-3 and DALL-E are not typically used
directly in robotics. Instead, they are often used to develop the
underlying technologies that enable robots to perform various tasks.

For example, natural language processing (NLP) is a field of AI that
focuses on enabling machines to understand and generate human language.
GPT-3 is a powerful NLP model that has been used to develop many natural
language applications, including chatbots, virtual assistants, and
language translation software. These applications can be integrated into
robotics systems to enable them to interact with humans in a more natural
and intuitive way.

Similarly, image generation models like DALL-E are used to create
realistic images and videos that can be incorporated into robotic systems
to enhance their visual capabilities. For example, a robot designed for
use in a manufacturing plant might use an image generation model to
recognize and track objects on an assembly line, or to provide real-time
feedback to operators about the robot's performance.

Overall, while foundation models are not used directly in robotics, they
play an important role in enabling robots to perform various tasks more
effectively and efficiently.
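
The first three techniques differ only in how many examples precede the question, so they can share one prompt builder. The sketch below is a minimal illustration, not the code used to produce the responses above: the function name build_shot_prompt and the "Q:/A:" layout are assumptions made for this example. Passing no examples gives a zero-shot prompt, one gives one-shot, and several give few-shot.

```python
from typing import Sequence, Tuple


def build_shot_prompt(question: str,
                      examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Return a prompt with zero, one, or several worked examples.

    Each example is a (question, answer) pair placed before the real question.
    The "Q:/A:" layout is an assumption for illustration only.
    """
    parts = []
    for example_question, example_answer in examples:
        parts.append(f"Q: {example_question}\nA: {example_answer}")
    # The real question goes last, with an empty answer slot for the model.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


examples = [
    ("What are Foundation Models such as GPT-3 used for?",
     "Natural language processing."),
    ("What are Foundation Models such as DALL-E used for?",
     "Image generation."),
]
few_shot_prompt = build_shot_prompt(
    "How are Foundation Models used in the field of robotics?", examples)
print(few_shot_prompt)
```

Writing the examples as explicit question and answer pairs makes the expected pattern easier for the model to pick up than stating them in a single sentence.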

4. Chain-of-Thought Prompting: Here, you ask the AI to detail its thought process step-by-step. This is particularly useful for complex reasoning tasks.

Prompt: “Describe the process of developing a Foundation Model in AI,
from data collection to model training.”

Mistral Response:
Developing a foundation model in AI is a complex process that involves
several steps, including data collection, preprocessing, labeling, and
training. Here is an overview of the general process:

1. Data Collection: The first step in developing a foundation model is to
collect large amounts of data. This data can come from various sources,
such as text documents, images, videos, or audio recordings. The quality
and quantity of the data collected will have a significant impact on the
performance of the final model.

2. Data Preprocessing: Once the data has been collected, it needs to be
preprocessed to make it suitable for use in a foundation model. This may
involve cleaning the data (removing noise or irrelevant information),
converting it into a specific format (such as text or images), and
normalizing it to ensure that all the data is on the same scale.

3. Labeling: The next step is to label the data, which involves assigning
a category or label to each item in the dataset. This labeling can be done
manually by humans, or automatically using various algorithms and
techniques. Accurate labeling is critical for training a foundation model,
as it helps the model learn to recognize patterns and make predictions
based on the labeled data.

4. Model Training: Once the data has been collected, preprocessed, and
labeled, the next step is to train the foundation model using supervised
learning algorithms. The model is fed large amounts of training data, and
it learns to predict the category or label of each item in the dataset
based on the input features. The model is trained over many iterations,
with the goal of improving its accuracy and performance on the task at
hand.

5. Fine-tuning: After the foundation model has been trained on a large
amount of data, it can be fine-tuned for specific applications or tasks.
This may involve adjusting the model's architecture or hyperparameters, or
adding additional layers or components to improve its performance on a
particular task.

6. Deployment: Once the foundation model has been developed and trained,
it can be deployed in various applications or systems. The model may be
used to generate text or images, classify objects or events, or perform
other tasks depending on its design and training data.

Overall, developing a foundation model is a complex process that requires
expertise in various areas of AI, including data collection,
preprocessing, labeling, and model training. It also requires access to
large amounts of high-quality data and powerful computing resources to
train the model effectively.
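
One common way to trigger step-by-step reasoning is to append an explicit instruction to the task. The helper below is a small sketch of that idea; the name with_chain_of_thought and the exact wording of the instruction are assumptions, and other phrasings may work equally well.

```python
def with_chain_of_thought(task: str) -> str:
    """Append an explicit request for step-by-step reasoning to a task."""
    return (
        f"{task.strip()}\n\n"
        "Work through this step by step. Number each step, and explain "
        "the reasoning behind it before moving on to the next one."
    )


cot_prompt = with_chain_of_thought(
    "Describe the process of developing a Foundation Model in AI, "
    "from data collection to model training."
)
print(cot_prompt)
```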

5. Iterative Prompting: This is a process where you refine your prompt based on the outputs you get, slowly guiding the AI to the desired answer or style of answer.

Initial Prompt: “Tell me about the latest developments in Foundation Models in AI.”

Mistral Response:
Foundation models are a rapidly evolving area of AI research, with many
exciting developments and advancements being made in recent years. Here
are some of the latest developments in foundation models:

1. Transformers: The transformer architecture is one of the most popular
architectures used in foundation models. It has shown impressive
performance on various NLP tasks, including language modeling, question
answering, and text classification. Recent advances in transformer
architecture have focused on improving its efficiency and scalability,
allowing for larger and more complex models to be trained on massive
amounts of data.
2. Vision Models: Foundation models are also being used to develop models
for computer vision tasks such as image classification, object detection,
and segmentation. These models often use transformer architectures or
other deep learning techniques to process large amounts of image data and
make predictions about the visual content of an image.
3. Multimodal Models: Foundation models are also being developed to handle
multimodal data, which combines information from multiple sensory
modalities such as text, images, audio, or video. These models can be used
for a variety of applications, including natural language processing with
visual or auditory inputs, and vice versa.
4. Reinforcement Learning: Foundation models are also being applied to
reinforcement learning tasks, which involve training an agent to make
sequential decisions in an environment. Recent developments have focused
on using foundation models to improve the scalability and efficiency of
reinforcement learning algorithms, allowing for larger and more complex
environments to be trained.
5. Transfer Learning: Transfer learning is another area where foundation
models are making significant progress. Transfer learning involves using a
pre-trained model as a starting point for a new task, and fine-tuning the
model on a smaller dataset specific to that task. Recent advances in
transfer learning have focused on developing more effective and efficient
methods for adapting pre-trained models to new tasks.

Overall, foundation models are an exciting area of AI research, with many
promising developments and applications across various domains. As
technology continues to advance, we can expect to see even more innovative
uses of foundation models in the future.

Refined Prompt: “Can you provide more details about these improvements in multi-modal learning within Foundation Models?”


Refined Response:
Multi-modal learning is a subfield of AI that involves processing data
from multiple modalities, such as text, images, audio, and video.
Foundation models are being used to develop more effective and efficient
methods for multi-modal learning, enabling machines to understand and
interact with the world in a more natural and intuitive way.

Here are some specific improvements in multi-modal learning within
foundation models:

1. Multi-Modal Transformers: The transformer architecture has been
extended to handle multi-modal data, allowing for more complex and nuanced
interactions between different modalities. This has enabled the
development of models that can process text and images simultaneously, or
that can process audio and video inputs in real-time.
2. Vision-Language Models: Another area where foundation models are being
used for multi-modal learning is vision-language modeling. These models
can process both visual and textual data to understand the relationship
between images and language, enabling applications such as image
captioning, question answering, and sentiment analysis.
3. Multi-Modal Object Detection: Foundation models are also being used to
develop more accurate and efficient methods for object detection in
multi-modal data. These models can process both visual and textual data to
identify objects and their relationships, enabling applications such as
search and retrieval, recommendation systems, and natural language
processing.
4. Multi-Modal Speech Recognition: Speech recognition is another area
where foundation models are being used for multi-modal learning. These
models can process audio and visual inputs simultaneously to improve the
accuracy and robustness of speech recognition systems, enabling
applications such as voice assistants and call center automation.
5. Multi-Modal Sentiment Analysis: Foundation models are also being used
to develop more accurate and efficient methods for sentiment analysis in
multi-modal data. These models can process both visual and textual data to
understand the emotional content of a scene or a piece of text, enabling
applications such as social media monitoring and customer service.

Overall, these improvements in multi-modal learning within foundation
models have the potential to revolutionize the way machines interact with
the world, allowing for more natural and intuitive communication between
humans and machines. As technology continues to advance, we can expect to
see even more innovative uses of foundation models in the field of
multi-modal learning.
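
The refinement loop can also be scripted. The sketch below assumes a hypothetical generate function that takes a prompt string and returns the model's text; fake_generate is a stand-in so the example runs without a model. In practice, each refined prompt is written after reading the previous response rather than listed up front.

```python
from typing import Callable, List, Tuple


def iterate_prompts(generate: Callable[[str], str],
                    prompts: List[str]) -> List[Tuple[str, str]]:
    """Send prompts in order, carrying earlier turns along as context."""
    history: List[Tuple[str, str]] = []
    for prompt in prompts:
        # Prepend earlier exchanges so a follow-up like "these improvements"
        # can refer back to the previous answer.
        context = "".join(f"User: {p}\nAssistant: {r}\n" for p, r in history)
        response = generate(f"{context}User: {prompt}\nAssistant:")
        history.append((prompt, response))
    return history


def fake_generate(full_prompt: str) -> str:
    """Stand-in for a real model call; reports how many turns it received."""
    return f"(reply to a prompt with {full_prompt.count('User:')} user turn(s))"


turns = iterate_prompts(fake_generate, [
    "Tell me about the latest developments in Foundation Models in AI.",
    "Can you provide more details about these improvements in multi-modal "
    "learning within Foundation Models?",
])
for prompt, response in turns:
    print(prompt, "->", response)
```

Keeping the earlier exchange in the context is what lets the refined prompt say "these improvements" without restating them.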

6. Negative Prompting: In this method, you tell the AI what not to do. For instance, you might specify that you don’t want a certain type of content in the response.

Prompt: “Explain the concept of Foundation Models in AI without mentioning natural language processing or NLP.”

Mistral Response:

Foundation models in AI are a type of machine learning model that is
designed to be versatile and reusable. They are typically pre-trained on
large amounts of data from multiple domains, enabling them to perform well
on a wide range of tasks without the need for extensive fine-tuning. The
idea behind foundation models is to build a general-purpose model that can
handle a variety of tasks, rather than building a separate model for each
specific task.

Foundation models are often based on transformer architectures, which have
shown great success in natural language processing tasks like GPT-3.
However, the same principles can be applied to other domains of AI, such
as computer vision or robotics. For example, a foundation model for
computer vision might be pre-trained on a large dataset of images from
multiple categories, enabling it to perform well on image classification,
object detection, and segmentation tasks without the need for extensive
fine-tuning.

One key advantage of foundation models is their ability to handle complex
and multi-modal data, which can include information from multiple
modalities like text, images, audio, and video. This enables machines to
understand and interact with the world in a more natural and intuitive
way, allowing for applications such as autonomous vehicles, medical
imaging, and customer service.

Another advantage of foundation models is their ability to be fine-tuned
on smaller datasets specific to a particular task, enabling them to
perform even better on that specific task without the need for extensive
data collection or labeling. This makes foundation models particularly
useful in scenarios where labeled data is scarce or expensive to obtain.

Overall, foundation models are an exciting area of AI research that has
the potential to revolutionize the way machines interact with the world.
As technology continues to advance, we can expect to see even more
innovative uses of foundation models in a variety of domains of AI.

NOTE: In this case, Mistral clearly fails: it still refers to natural language processing. I tested the same prompt on GPT, and the result was more satisfying.

GPT Response: 

Foundation models in AI refer to large-scale machine learning models that are
trained on extensive and diverse datasets. These models serve as a basic
structure or foundation, upon which various specialized applications can be
built. The concept revolves around the idea that a single model, due to its
vast and diverse training, can be adapted or fine-tuned for a multitude of
tasks, even those it wasn't specifically trained for.

Key characteristics of foundation models include:

1. Large-Scale Training Data: They are trained on massive datasets, often
comprising a wide range of information. This extensive training helps the model
develop a broad understanding of various patterns and relationships within the
data.

2. Generalization and Adaptability: Foundation models are designed to
generalize across tasks. This means they can perform well on a variety of
tasks, even those not seen during training. This adaptability is crucial for
applying the model to specialized tasks with minimal additional training.

3. Transfer Learning: These models enable transfer learning, where knowledge
gained from the initial training can be transferred to new tasks. This process
often requires only a small amount of data specific to the new task.

4. Multimodal Capabilities: Many foundation models can process and
understand different types of data, such as text, images, and possibly even
audio or other sensor data. This multimodal capability allows for applications
in diverse fields like computer vision, audio processing, and beyond.

5. Resource-Intensive: Training and deploying foundation models often
require significant computational resources due to their size and complexity.

6. Ethical and Bias Considerations: Given their broad training and
application scope, ethical considerations and potential biases inherent in the
training data are important concerns in their development and deployment.

Foundation models have been instrumental in advancing AI by providing a
versatile, adaptable, and powerful base from which specialized applications can
be developed across various domains.
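
Because a model may ignore a negative instruction, it helps to check the response automatically. The sketch below builds the negative prompt and scans a response for the banned terms; the function names negative_prompt and violations are hypothetical, and the sample text is one sentence taken from the Mistral response above.

```python
import re
from typing import List


def negative_prompt(task: str, banned: List[str]) -> str:
    """Append an explicit 'do not mention' instruction to a task."""
    return f"{task.strip()} Do not mention: {', '.join(banned)}."


def violations(response: str, banned: List[str]) -> List[str]:
    """Return the banned terms that appear in the response (case-insensitive)."""
    found = []
    for term in banned:
        # Word boundaries keep short terms like "NLP" from matching inside
        # longer words.
        if re.search(rf"\b{re.escape(term)}\b", response, flags=re.IGNORECASE):
            found.append(term)
    return found


banned_terms = ["natural language processing", "NLP"]
prompt = negative_prompt(
    "Explain the concept of Foundation Models in AI.", banned_terms)
sample = (
    "Foundation models are often based on transformer architectures, which "
    "have shown great success in natural language processing tasks like GPT-3."
)
print(violations(sample, banned_terms))  # → ['natural language processing']
```

A non-empty result is a signal to rephrase the prompt or retry, which matters most with smaller models.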

7. Hybrid Prompting: Combining different methods, like few-shot with chain-of-thought, to get more precise or creative outputs.

Prompt: “Like GPT-3, which is a versatile model used in various language tasks, explain how Foundation Models are applied in other domains of AI, such as computer vision.”

Mistral Response:

Foundation Models are a type of AI model that is trained on a large amount
of data to generate high-quality text. However, they can also be applied
in other domains of AI such as computer vision. In computer vision,
Foundation Models can be used to perform tasks such as image
classification, object detection, and segmentation.
One way this is done is by using pre-trained Vision Transformers, which
are a type of Foundation Model specifically designed for computer vision
tasks. These models have been trained on massive amounts of image data and
can be fine-tuned to perform specific tasks such as identifying objects in
an image or classifying the object into different categories.
Another way Foundation Models can be applied in computer vision is by
using a technique called "transfer learning". This involves taking a
pre-trained model that was trained on one task and applying it to a
related task. For example, a model that was trained to recognize faces in
images could be fine-tuned to recognize different objects or even people
in different poses.
Overall, Foundation Models have the potential to revolutionize computer
vision by providing powerful models that can perform a wide range of tasks
with high accuracy and efficiency.
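
As a rough sketch of one possible combination, the helper below pairs a single worked example (one-shot) with a request for step-by-step reasoning (chain-of-thought). The name hybrid_prompt and the "Example:/Task:" layout are assumptions for illustration.

```python
def hybrid_prompt(example: str, task: str) -> str:
    """Combine one worked example (one-shot) with a step-by-step instruction."""
    return (
        f"Example: {example.strip()}\n\n"
        f"Task: {task.strip()}\n"
        "Follow the pattern of the example, and explain your reasoning "
        "step by step."
    )


hybrid = hybrid_prompt(
    "GPT-3 is a versatile model used in various language tasks.",
    "Explain how Foundation Models are applied in other domains of AI, "
    "such as computer vision.",
)
print(hybrid)
```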

8. Prompt Chaining: Breaking down a complex task into smaller prompts and then chaining the outputs together to form a final response.

First Prompt: “List some examples of Foundation Models in AI.”

Mistral Response:

Here are some examples of Foundation Models in AI:
1. GPT-3: A language model developed by OpenAI that can perform a wide
range of natural language processing tasks, including text generation,
question answering, and summarization.
2. BERT: A language model developed by Google that has been used for tasks
such as sentiment analysis, named entity recognition, and machine
translation.
3. Vision Transformers: A type of Foundation Model specifically designed
for computer vision tasks such as image classification, object detection,
and segmentation.
4. Autoencoder: A type of neural network that can be trained to
reconstruct its input data, which has been used for tasks such as
denoising, compression, and anomaly detection.
5. Variational Autoencoder (VAE): A type of neural network that can be
trained to generate new data samples that are similar to a given dataset,
which has been used for tasks such as image generation and dimensionality
reduction.

Second Prompt: “Choose one of these models and explain its foundational role in AI development.”

Mistral Response:

One of the most foundational models in AI development is GPT-3. Developed
by OpenAI, GPT-3 is a language model that has been used for a wide range
of natural language processing tasks such as text generation, question
answering, and summarization.
GPT-3's foundational role in AI development comes from its ability to
process and generate natural language at scale. The model was trained on
an enormous amount of text data, allowing it to understand the nuances and
complexities of human language. This has made it a powerful tool for tasks
such as chatbots, where it can understand and respond to user queries in a
natural way.
GPT-3's impact on AI development has been significant, as it has
demonstrated the potential for large language models to perform complex
tasks with high accuracy. This has led to increased interest in developing
other large language models, such as BERT and RoBERTa, which have also
shown great success in natural language processing tasks.
Overall, GPT-3's foundational role in AI development comes from its
ability to process and generate natural language at scale, making it a
powerful tool for a wide range of AI applications.
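
The chain above can be sketched as a loop that feeds each output into the next prompt. This is a minimal illustration under stated assumptions: generate is a hypothetical function that takes a prompt and returns text, the {previous} placeholder is a convention invented for this example, and echo_model is a stand-in so the code runs without a model.

```python
from typing import Callable, List


def run_chain(generate: Callable[[str], str], templates: List[str]) -> List[str]:
    """Run prompts in order, substituting the previous output for {previous}."""
    outputs: List[str] = []
    previous = ""
    for template in templates:
        # str.replace is used instead of str.format so that braces in a
        # model's output cannot break the substitution.
        prompt = template.replace("{previous}", previous)
        previous = generate(prompt)
        outputs.append(previous)
    return outputs


def echo_model(prompt: str) -> str:
    """Stand-in for a real model call; wraps the prompt it received."""
    return f"<answer to: {prompt}>"


steps = [
    "List some examples of Foundation Models in AI.",
    "Here is a list of models:\n{previous}\n"
    "Choose one of these models and explain its foundational role in "
    "AI development.",
]
outputs = run_chain(echo_model, steps)
print(outputs[-1])
```

Passing the first response explicitly into the second prompt means the chain does not depend on the model remembering earlier turns.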

Prompt engineering should be most useful when working with small models, but that is also where it is hardest to get right. Small models like Mistral tend not to follow the prompt. In the “Negative Prompting” example, GPT-4 followed the instruction, while Mistral failed to describe Foundation Models without mentioning NLP.

A useful follow-up read is “Prompting Frameworks for Large Language Models: A Survey” by Xiaoxia Liu et al.
https://doi.org/10.48550/arXiv.2311.12785
