Src : https://tinyurl.com/mrxkyheh

Prompt Engineering 101

Mridul Rao
6 min read · Jul 7, 2024

Prompting is a crucial aspect of prototyping new applications. The right prompting techniques, when used correctly, can significantly enhance the development process, often more than we realize. However, it’s important to understand that even prompt-based applications require substantial engineering around the prompt to function effectively.

In this tutorial, we're using the Llama 3 model ("llama3"), served locally through Ollama, for our language model needs. LlamaIndex's Ollama integration makes it easy to import and use advanced language models like Llama. Here's how you can quickly set up and start using the model in your project:

from llama_index.llms.ollama import Ollama
llm = Ollama(model="llama3", request_timeout=120.0)
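
To verify the setup before building chat prompts, a quick single-turn completion works well. This is a minimal sanity check, assuming an Ollama server is running locally and the llama3 model has already been pulled:

# quick sanity check for the Ollama connection (assumes `ollama pull llama3` has been run)
response = llm.complete("Briefly explain what a prompt is.")
print(response.text)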

Intro to Prompting

LlamaIndex already provides many prompt templates that can help you get better output from the language model.

from llama_index.core.llms import ChatMessage

# define the system prompt
prompt = "The task is to correctly use a new word in a sentence. For example: A 'whatpu' is a small, furry animal native to Tanzania. \
An example of a sentence that uses the word whatpu is: We were traveling in Africa and we saw these very cute whatpus."

# pass in the question
sample_question = "To do a 'farduddle' means to jump up and down really fast."

# build the message list
messages = [
    ChatMessage(role="system", content=prompt),
    ChatMessage(role="user", content=sample_question),
]

# call the model
resp = llm.chat(messages)

# see the output
resp.message.content
OUTPUT : "An interesting activity! Here's an example sentence:

The kids loved doing a farduddle at the party, getting everyone else laughing and joining in on the fun."

In-Context Learning

The idea of in-context learning via N-shot prompts is to provide the language model with examples that demonstrate the task and align outputs to our expectations.

Tips for Effective N-shot Prompts:

  • Keep N Up to 10–15: Using 10 to 15 examples typically provides enough context without making the prompt overly long.
  • Use Relevant Examples: Ensure the examples are closely related to the task.
  • Include Desired Output: Sometimes, showing the model an example of the desired output can be sufficient to guide its responses accurately.
prompt = "The task is to correctly tell wether sentiment of sentence is negative or positive. For example : The movie was pretty bad. \
Sentiment : Negative \
Hoestly the best acting I have seen ever! \
Sentimen : Positive"

sample_question = "Pretty boring story, script not upto the expectations"

messages = [
ChatMessage(
role = "system",
content = prompt
),
ChatMessage(role="user", content = sample_question),
]

resp = llm.chat(messages)
resp.message.content
OUTPUT : 'Sentiment: Negative'

Chain of Thought

The idea of chain-of-thought prompting is to encourage the language model to explain its thought process before returning the final answer. By making the intermediate reasoning explicit, this method often significantly reduces hallucination rates and enhances the accuracy of the responses.

This method enables complex reasoning capabilities by guiding the model through intermediate reasoning steps, resulting in more coherent and reliable outputs.

prompt = "Let's think step by step."

sample_question = "I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. \
I then went and bought 5 more apples and ate 1. How many apples did I remain with?"

messages = [
ChatMessage(
role = "system",
content = prompt
),
ChatMessage(role="user", content = sample_question),
]

resp = llm.chat(messages)
resp.message.content

OUTPUT: "Let's break it down step by step!
Initially, you had 10 apples.

You gave 2 apples to the neighbor, so you're left with:
10 - 2 = 8 apples

Then, you gave 2 apples to the repairman, so you're left with:
8 - 2 = 6 apples

After that, you bought 5 more apples, so you now have:
6 + 5 = 11 apples

Finally, you ate 1 apple, leaving you with:
11 - 1 = 10 apples

So, you remained with 10 apples."

Prompt Chaining

One important prompt engineering technique is to break a task into its subtasks. This is known as prompt chaining, where a task is split into a series of subtasks, creating a chain of prompt operations. In prompt chaining, each prompt in the chain performs transformations or additional processing on the generated responses before reaching the final desired state.

Prompt chaining helps boost the transparency of an LLM application, increases controllability, and enhances reliability. It also makes it easier to debug problems with model responses and to analyze and improve performance at the specific stages that need it.

Prompt chaining is widely used for Document QA, enabling the model to process and refine information step-by-step, ensuring more accurate and reliable answers.

Let's define a sample document that describes what prompt engineering is. Note the instructions in the first part of the prompt message —

prompt = "You are a helpful assistant. Your task is to help answer a question given in a document. The first step is to extract quotes relevant to the question from the document, delimited by ####. Please output the list of quotes using <quotes></quotes>. Respond with 'No relevant quotes found!' if no relevant quotes were found. \
#### \
Prompt engineering is the process of structuring an instruction that can be interpreted and understood by a generative AI model. \
A prompt is natural language text describing the task that an AI should perform. \
A prompt for a text-to-text language model can be a query such as what is Fermat's little theorem? \
a command such as write a poem about leaves falling, or a longer statement including context, instructions, \
and conversation history. Prompt engineering may involve phrasing a query, specifying a style, providing \
relevant context or assigning a role to the AI such as Act as a native French speaker. A prompt may include a \
few examples for a model to learn from, such as asking the model to complete maison - house, chat - cat, chien -\
(the expected response being dog), an approach called few-shot learning. \
When communicating with a text-to-image or a text-to-audio model, a typical prompt is a description of a desired \
output such as a high-quality photo of an astronaut riding a horse or Lo-fi slow BPM electro chill \
with organic samples. Prompting a text-to-image model may involve adding, removing, emphasizing and re-ordering \
words to achieve a desired subject, style, layout, lighting, and aesthetic. \
#### "
sample_question = "What are prompting techniques mentioned in the document?"

messages = [
    ChatMessage(role="system", content=prompt),
    ChatMessage(role="user", content=sample_question),
]

resp = llm.chat(messages)
resp.message.content
OUTPUT:
'<quotes>Phrasing a query, specifying a style, providing relevant context or assigning a role to the AI such as Act as a native French speaker. An approach called few-shot learning involves including a few examples for a model to learn from.</quotes>

These are the prompting techniques mentioned in the document.'

Our language model correctly identifies and extracts information from the document. Note that it does not bring in any external knowledge. Now let's see what happens when we ask about information that is not present in the prompt or document.

sample_question = "What are fine tuning for LLMs techniques mentioned in the document?"

messages = [
    ChatMessage(role="system", content=prompt),
    ChatMessage(role="user", content=sample_question),
]

resp = llm.chat(messages)
resp.message.content
OUTPUT: "<quotes>

No relevant quotes found!

</quotes>

There are no specific fine-tuning techniques mentioned in the provided document. The text primarily discusses prompt engineering and its applications, rather than focusing on fine-tuning techniques for Large Language Models (LLMs). If you're looking for information on fine-tuning techniques, I'd be happy to help with that!"

Our model handles this case well and does not add any external knowledge that is not present in the document (prompt). This shows how effective prompt chaining can be.
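
To complete the chain, the extracted quotes can be passed to a second prompt that composes the final answer from them. The sketch below is a minimal illustration of that second link, not code from the original article; it reuses resp.message.content (the quotes, or the "No relevant quotes found!" message) from the previous step, and the wording of answer_prompt is an assumption.

# second link in the chain: answer the question using only the extracted quotes
answer_prompt = "You are a helpful assistant. Using only the quotes provided inside <quotes></quotes>, \
write a concise answer to the user's question. If no relevant quotes are provided, \
say that the document does not contain the answer."

quotes = resp.message.content  # output of the quote-extraction step
messages = [
    ChatMessage(role="system", content=answer_prompt),
    ChatMessage(role="user", content=f"Quotes: {quotes}\nQuestion: {sample_question}"),
]

final_resp = llm.chat(messages)
final_resp.message.content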

Tree of Thoughts

The Tree of Thoughts is a framework that generalizes over chain-of-thought prompting and encourages the exploration of thoughts that serve as intermediate steps for general problem-solving with language models. This method enables an LLM to self-evaluate the progress of its intermediate thoughts towards solving a problem through a deliberate reasoning process.

This allows the model to explore multiple pathways, assess potential outcomes, and refine its reasoning to reach more accurate and well-founded conclusions.
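
The full Tree of Thoughts method maintains an explicit tree of candidate thoughts and searches over them with evaluation and backtracking. A rough, prompt-only approximation can still be tried with the same chat setup; the "panel of experts" wording below is a commonly used simplification and an assumption of this sketch, not the complete ToT algorithm:

# rough prompt-only approximation of Tree of Thoughts (not the full search-based method)
tot_prompt = "Imagine three different experts are answering this question. \
Each expert writes down one step of their thinking and shares it with the group. \
Then all experts move on to the next step. \
If any expert realizes they are wrong at any point, they drop out. \
Finally, state the answer the remaining experts agree on."

sample_question = "I have 3 boxes with 4 pens in each box. I give away 5 pens. How many pens do I have left?"

messages = [
    ChatMessage(role="system", content=tot_prompt),
    ChatMessage(role="user", content=sample_question),
]

resp = llm.chat(messages)
resp.message.content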

Next Steps

The next step is to build a Retrieval-Augmented Generation (RAG) model. RAG builds on the Document QA pattern above: instead of pasting the whole document into the prompt, relevant passages are retrieved from an index and injected into the prompt at query time, further enhancing the capabilities and performance of language models.
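
As a preview, here is a minimal RAG sketch that plugs the same Ollama LLM into a LlamaIndex vector index. The ./data folder, the nomic-embed-text embedding model, and the llama-index-embeddings-ollama package are illustrative assumptions, not part of the original tutorial:

# minimal RAG sketch: index local documents and query them with the same LLM
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.ollama import OllamaEmbedding

Settings.llm = llm
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")  # assumed local embedding model

documents = SimpleDirectoryReader("./data").load_data()  # assumed folder of local text files
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
print(query_engine.query("What prompting techniques are mentioned in these documents?"))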

Whether you are summarizing documents, answering complex queries, or developing sophisticated AI applications, these techniques will enhance your model’s performance, reliability, and efficiency.

Further Reading —

  1. https://www.promptingguide.ai/introduction
  2. https://applied-llms.org/
