Maximising your LLM’s Potential with Prompt Engineering
Prompt Engineering can be thought of as the art and science of crafting effective questions or instructions to elicit desired responses from language models. Think of it as asking the right questions to a vast library with an ultra-efficient librarian. You want to get precise, concise, and useful answers; how you phrase the question can make all the difference.
I. In-context Learning
Large language models (LLM) like ChatGPT learn from the context provided in prompts. Depending on the context, we have: zero-shot prompts, one-shot prompts, and few-shot prompts.
1. Zero-shot Prompts
TL;DR — Zero-shot prompting is where you don’t provide any example and expect the model to infer the answer. It’s like asking a chef to make a dish they’ve never heard of without showing them a recipe.
You are essentially asking the model to complete a task it hasn’t seen in the form of an example during your interaction. However, because of the diverse training data, the model might have seen similar tasks during training and could provide reasonable results. Large models would be pretty great at this and could give you a really decent answer, but smaller models might not be as good.
Example prompt:
Provide a description for an IT project that develops a mobile banking application.
Model’s response:
Project Mercury: A next-generation mobile banking application that offers users seamless transactions, real-time account monitoring, and cutting-edge security features. The platform will be designed for both iOS and Android, ensuring broad accessibility. Integration with major banks and financial institutions is paramount, and the user interface will be crafted for simplicity and ease of use.
The model did quite well with just a short prompt without examples (it did invent/hallucinate by randomly assign the name “Mercury” to the project though). Zero-shot may sometimes produce more varied responses since it’s not anchored to a given example. This can be both an advantage (more creativity) or a disadvantage (less predictability). If you enter the same prompt again, another response might come up.
2. One-shot Prompts
TL;DR — One-shot prompting is where you give one example to guide the model. Think of this as humming a tune once to a musician and expecting them to play the rest.
This would work well with smaller models, because you’ve given it an example to follow. Some advantages might include: providing more context to the model than zero-shot prompting, guiding the model to generate outputs in a specific format or style. However, this also means the model’d rely heavily on the quality and clarity of the provided examples.
Example prompt:
Provide a description for an IT project called Apollo that develops a mobile budgeting application
Project Apollo:Overview:
Project Apollo sets out to create a state-of-the-art mobile budgeting application, destined to empower users to take control of their finances like never before.Objective:
To deliver a user-centric, intuitive, and feature-rich budgeting tool that enables individuals to track, manage, and optimise their finances with ease, offering clarity and foresight in their financial journey.Features:
1. Cross-Platform Compatibility: Crafted for both iOS and Android, ensuring a consistent and tailored user experience.
2. Expense Tracking: Real-time tracking of expenditures, categorised by type (e.g., groceries, entertainment, utilities).
3. Income Management: Log multiple income sources, including salaries, freelance payments, and other revenue streams.
4. Financial Goals Setting: Enables users to set short-term and long-term financial goals and monitor their progress.
5. Budget Forecasts: Uses historical data to project future spending habits and offers recommendations for savings.
6. Automated Alerts: Customisable notifications for upcoming bills, goal achievements, and when nearing set budget limits.
7. Visual Insights: Graphical representations of spending habits, monthly comparisons, and trend analysis.
8. Integration with Banks: Directly link bank accounts for real-time updates and balance checks.
9. Data Security: Incorporates end-to-end encryption and biometric access controls to ensure user data is protected.
….Provide a description for an IT project called Artemis that develops a mobile banking application.
Model’s response:
Project Artemis:
Overview:
Project Artemis is envisioned to be a cutting-edge mobile banking solution designed to provide users with a secure, efficient, and comprehensive platform to manage all their banking needs on the go.Objective:
To innovate the realm of digital banking, Artemis aims to offer a seamless and user-friendly experience, merging traditional banking functionalities with modern-day features, all while ensuring top-tier security measures.Features:
1. Multi-Device Adaptability: Native applications designed for both iOS and Android, ensuring a holistic banking experience across devices.
2. Account Overview: Instant access to account balances, savings, loans, and investment portfolios.
3. Instant Transfers: Quick and secure fund transfers to both domestic and international accounts.
4. Loan and Credit Management: Track existing loans, apply for new ones, and manage credit card functionalities like bill payment and limit checks.
5. Digital Wallet: Incorporate a built-in wallet for cashless transactions, QR code payments, and contactless NFC payments.
6. Savings and Investment: Tools to invest in mutual funds, buy insurance, or set automated savings goals.
7. Bill Payment Station: A consolidated section to manage and pay utility bills, subscriptions, and more.
8. User Personalisation: Customise interface themes, set favourite transactions, and personalise notification settings.
9. Chatbot and Support: 24/7 in-app chatbot for quick query resolution and dedicated customer support for intricate issues.
10. Top-Tier Security: Multi-layered security protocols including biometrics, OTPs, and end-to-end encryption to keep user data and finances safe.
As you can see, the model was able to produce a very detailed response following the exact format provided.
3. Few-shot Prompts
TL;DR — Few-shot prompting is where multiple examples are given. It’s akin to showing multiple sketches to an artist before they paint the final picture.
This is like the extended version of one-shot prompting. Few-shot would be particularly even more beneficial if you’re working with a really small model which would need a lot more guidance compared to large models being trained on more vast datasets.
Example prompt:
Data: Tech company earnings beat expectations by 15%, unemployment rate drops by 0.3%. Prediction: Bullish.
Data: Federal Reserve increases interest rates by 2%, global oil prices surge by 10%. Prediction: Bearish.
Data: Consumer spending remains stable, no significant geopolitical events, minor fluctuations in tech stocks. Prediction: Neutral.
Data: Major tech firm faces antitrust lawsuit, GDP growth rate is steady at 2%, minor uptick in manufacturing sector. Prediction:
Model’s response:
Prediction: Mixed/Neutral.
It’s worth noting that the context window/input usually has a token limit so the amount of examples you pass into the model is also limited. A general rule of thumb is that if your model still doesn’t do very well when given 5 or 6 examples, then you should fine-tune that model further.
II. Prompt Guidelines
Crafting the right prompt can be tricky. However, here are some tried-and-true guidelines to make your prompts more effective:
1. Use Delimiters
Using delimiters in prompts can be beneficial in situations where you want to structure your input in a way that is clear and easily parsed by an LLM, especially for models like GPT-4 which rely on textual cues. It helps the model to differentiate the actual input versus the prompt/instruction, avoiding mistaking the input content as an instruction, or vice versa.
Example prompt:
Given the paragraph delimited by triple quotes, identify all questions in it
“““
The Australian real estate market has witnessed a roller-coaster of dynamics over the past few decades. Major cities like Sydney and Melbourne have seen skyrocketing house prices, creating barriers for first-time homebuyers. Factors such as foreign investment, a strong economy, and population growth have driven demand. However, in recent years, concerns about housing affordability have been at the forefront of political debates. With increasing urbanization, is there a potential for market saturation in metropolitan areas? Moreover, how will the market respond to changing climate patterns, given Australia’s susceptibility to bushfires and floods? The intersection of environmental factors and housing policies presents a unique challenge for the Australian property market.
“““
Model’s response:
From the provided paragraph, the identified questions are:
1. With increasing urbanization, is there a potential for market saturation in metropolitan areas?
2. How will the market respond to changing climate patterns, given Australia’s susceptibility to bushfires and floods?
2. Specify Output Format
If you need the response in a table, markdown, JSON, or any other specific format, mention it explicitly. It’s similar to requesting a dish in a restaurant — be specific about your dietary preferences!
Example prompt:
Given the following content delimited by triple backticks, convert it to a table in markdown format where each attribute is a column
```
Name: John Doe | Age: 30 | Occupation: Engineer | Location: New York
```
Model’s response:
| Name | Age | Occupation | Location |
|-----------|-----|------------|-----------|
| John Doe | 30 | Engineer | New York |
3. Specify Steps
If you’re asking the model to solve a problem, break it down step by step. Like instructing someone on assembling furniture — the clearer the steps, the better the outcome.
Example prompt:
Given the following paragraph delimited by triple quotes
“““
The Australian real estate market has witnessed a roller-coaster of dynamics over the past few decades. Major cities like Sydney and Melbourne have seen skyrocketing house prices, creating barriers for first-time homebuyers. Factors such as foreign investment, a strong economy, and population growth have driven demand. However, in recent years, concerns about housing affordability have been at the forefront of political debates. With increasing urbanization, is there a potential for market saturation in metropolitan areas? Moreover, how will the market respond to changing climate patterns, given Australia’s susceptibility to bushfires and floods? The intersection of environmental factors and housing policies presents a unique challenge for the Australian property market.
“““Perform the following actions
- Summarise the paragraph into 1 sentence
- Translate the summary to Vietnamese
- Identify named entities in the Vietnamese summary
- Output a markdown table to show those named entities
Model’s response
1. Summarise the paragraph into 1 sentence:
The Australian real estate market, influenced by factors like foreign investment and urbanization, faces challenges from rising house prices, concerns over housing affordability, and environmental factors like bushfires and floods.
2. Translate the summary to Vietnamese:
Thị trường bất động sản Úc, ảnh hưởng bởi các yếu tố như đầu tư nước ngoài và đô thị hóa, đối mặt với những thách thức từ việc tăng giá nhà, lo ngại về khả năng mua nhà và các yếu tố môi trường như cháy rừng và lụt lội.
3. Identify named entities in the Vietnamese summary:
Úc (Australia)
4. Output a markdown table to show those named entities:
| Named Entities. | Translation |
| — — — — — — — — | — — — — — — -|
| Úc | Australia |
4. Reason Before Conclusions
Instead of simply asking for an answer, request the model to reason it out or break down the steps it has taken to work out the answer. This is particularly useful if you ask the model a math question or a coding question.
Example prompt (content sourced from the ATO — Actual Cost Method):
Given an example about calculating working from home expenses for tax deductions from the ATO delimited in triple backticks
```
Ben works at home several days per week and keeps a record of the total hours he works from home. His record shows he worked a total of 768 hours from home in 2022–23.When he works from home, Ben sits in a separate room of his house and always uses the air conditioner in the room when he is working. His air conditioning unit is a small with a 3.5 kilowatt (kw) capacity. Based on the unit’s energy efficiency rating, the unit costs Ben 1.09 kw per hour to run.
Based on his electricity bills, Ben pays 27.81c per kilowatt hour (28 c rounded up) for electricity.
Ben calculates the cost of cooling and heating for the room he uses when he is working from home as:
1.09 kw per hour × 0.28 per hour = 30.52c per kw hour
768 hours × 30.52 c = $234.39.
```Perform the following actions:
- Work out the deduction amount on your own
- Compare with the results given and explain the differences if any
Model’s response:
1. Work out the deduction amount on my own:
Given:
- Hours worked from home: 768 hours
- Cost of running the air conditioner: 1.09 kw per hour
- Cost per kilowatt hour: 27.81c (or 0.2781 dollars)Deduction amount for using the air conditioner:
Hours × Cost per kw × Cost per kilowatt hour
= 768 hours × 1.09 kw/hour × 0.2781 dollars/kw
= 232.3563 dollars.
2. Compare with the results given:
Given calculation:
1.09 kw per hour × 0.28 per hour = 30.52c per kw hour
768 hours × 30.52c = $234.39.The calculated deduction based on the information given is $234.39.
Comparison:
My calculation came out to be $232.36, while the example provided came out to be $234.39.Explanation of Differences:
The difference arises from the rounding of the cost per kilowatt hour. In the example, it’s rounded to 28c (or 0.28 dollars) per kw hour, while I used the exact value of 27.81c (or 0.2781 dollars) per kw hour. This minor difference in cost per kilowatt hour affects the final deduction amount when multiplied by the total hours and power consumption of the air conditioner.
III. Iterative Prompt Development
Like any craft, prompt engineering improves with practice and iterations. Start with a base prompt, test it, refine based on the output, and test again. It’s the classic potter’s wheel analogy: you start with raw clay (the initial prompt), and through continuous refining, you shape it into a beautiful pot (desired output).
- The first step in the journey involves drafting a prompt that instructs the model. This could be as simple as asking a direct question, or it might involve more nuanced framing to guide the model’s output in a particular direction.
- Once the prompt is entered, it’s essential to see how the model interacts with it. Does the model understand the intent? Is the response verbose or succinct? Does it capture the nuances the developer intended?
- After that, refine the prompt based on observed outputs. This might involve adding clarifications, using different phrasings, or even restructuring the prompt entirely.
- The refined prompt is then fed back to the model, and the outputs are evaluated again. This cycle might be repeated several times until the desired results are consistently achieved.
IV. Inference Configuration Parameters
Beyond prompts, inference configuration is a set of parameters that can be adjusted to modify the output of LLM. Each parameter influences a different aspect of the generation process, impacting the length, diversity, and specificity of the generated content. Note that these are parameters that influence the models during inference — different from the training parameters which are learnt during training.
- Max New Tokens: This parameter sets a limit on the number of new tokens (words, characters, or subwords, depending on the model’s tokenisation scheme) the model can produce in its response. If set too low, the model might not provide a comprehensive answer. If set too high, you might receive excessively verbose outputs.
- Temperature: A high value makes output more random; a low value makes it deterministic. Imagine adjusting the temperature when brewing tea — a slight change can lead to a different flavour profile. For a creative writing prompt, a higher temperature might be used to generate diverse and novel sentences. For a factual question, a lower temperature might be preferred for a straightforward answer.
- Sample Top K: During text generation, the model ranks the next possible tokens based on their likelihood. Top K restricts the model to only consider the top K most probable next tokens. A lower value (e.g., 40) can focus the model’s output but may also introduce repetition. A higher value (e.g., 100) or not using Top K can increase diversity but might introduce irrelevant tokens.
- Sample Top P: Instead of just considering the top K probable tokens, Top P chooses predictions whose combined probabilities do not exceed P. For open-ended questions where a variety of relevant answers may be suitable, setting Top P (e.g., 0.9) can be useful.
In practice, these parameters often need to be adjusted based on the specific task and desired output characteristics. Fine-tuning these configurations can help in obtaining the optimal response from a model for a given application.
References:
Until next time. Happy reading!