LLM Bootcamp Notes — Part 1: Prompt Engineering
The LLM Bootcamp series by FullStackDeepLearning offers great insights into the world of Generative AI by taking a very structured approach to the topic. The series goes from introducing LLMs all the way up to approaches and recommendations for production-grade LLM & Generative AI solutions.
I have consolidated my learning notes into a series of 4 articles, one for each of the 4 core focus areas for LLMs from the Bootcamp series. This Part 1 article is on Prompt Engineering.
What is Prompt Engineering?
Prompt Engineering is the art of designing the text that goes into the LLM. It can also be thought of as a way of programming these LLMs to do what the user instructs.
Prompts are Magic Spells
On a very high level, LLMs are statistical models of the data they are trained on. They act as auto-regressive models that condition on their own previous output to predict what comes next, for example, predicting the next word in a sentence. A simple pattern matcher is one kind of generic statistical model, but thinking of an LLM as a statistical pattern matcher gives very bad intuitions and poor next-word predictions.
Probabilistic programs are comparatively better than plain statistical models, in that they can put some thought in before answering a question: each candidate thought is assigned a probability, and the output is the one with the highest probability. However, these types of programs are arcane for now.
In reality, an LLM can be considered a probabilistic model of text that has access to its base reference documents, or more generally, a corpus from which its results are generated. Here, prompting is a way of conditioning the model: it weights the most relevant documents more heavily so that the results are based on them.
Prompting can also be thought of as a subtractive technique: each prompt narrows down the space of possible results by weighting the important documents higher and the less important documents lower.
Instruction tuning is also a way of asking the model to perform a few tasks and answer the question. The model is given specific instructions on behaviour, response style, and so on, ranging anywhere from "be fair and unbiased" to "be funny".
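As a minimal sketch, a behavioural instruction can be prepended to the user's question before it is sent to an instruction-tuned model. The `build_prompt` helper and the instruction wording below are hypothetical, not a specific provider's API:

```python
def build_prompt(instruction: str, question: str) -> str:
    """Prepend a behavioural instruction to the user's question,
    mimicking how instruction-tuned models are typically steered."""
    return f"{instruction}\n\nQuestion: {question}\nAnswer:"

prompt = build_prompt(
    instruction="You are a fair and unbiased assistant. Keep answers concise.",
    question="What is prompt engineering?",
)
print(prompt)
```

The same question with a different instruction (say, "be funny") would condition the model toward a very different response style.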
There are a few rules that can be followed to extract the best value from prompts — The genie’s rules are:
- Use low-level patterns: Instead of using instructions that require further explanation, give instructions through low-level patterns. E.g. rather than "Create a question paper for such-and-such topic", use phrases like "what", "why", and "where" in regards to the context.
- Itemise instructions: Turn descriptive attributes into bulleted lists. Also, turn negation statements into assertion statements. E.g. "Don't be biased" becomes "Be unbiased".
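The itemising rule can be sketched as a small preprocessing step. The `NEGATION_REWRITES` table and `itemise` helper are hypothetical names for illustration; a real pipeline might rewrite negations by hand or with another model call:

```python
# Hypothetical sketch of the genie's rules: itemise descriptive
# attributes into a bulleted list and rewrite known negations
# as assertions before building the final prompt.
NEGATION_REWRITES = {
    "don't be biased": "be unbiased",
    "don't be rude": "be polite",
}

def itemise(attributes: list[str]) -> str:
    """Turn attributes into a bulleted list, rewriting negations."""
    rewritten = [NEGATION_REWRITES.get(a.lower(), a) for a in attributes]
    return "\n".join(f"- {a}" for a in rewritten)

print(itemise(["Don't be biased", "Be funny"]))
```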
Limitations:
- Simulating something that does not exist may not give the best results, e.g. "Simulate a super-intelligent AI". This is because the model may have no ground reference for the simulation task at hand.
- If the LLM is asked to simulate a human thinking for a few seconds, or a Reddit moderator, it does this well. But if asked to simulate a human's thinking flow over hours, or a Python kernel running a program, it tends to do poorly. In such cases, a purpose-built tool for the task is a better choice than an LLM.
Prompting Techniques
Things to watch out for:
- Few-shot learning might be a bad idea in most cases: a well-crafted zero-shot prompt can match the effect of multiple examples
- Tokenisation can be tricky
- Models struggle to move away from their base training, i.e. if a counter-example is given, the model may ignore the example and fall back to its training
- Models don't see words, they only see tokens. Therefore even gibberish text inputs can sometimes give results.
The prompting playbook:
- Operate on structured text: this gives the model easier access to the data
- Automate the process of asking follow-up questions with self-ask prompting, e.g. asking the model to be self-critical of the answer it has provided
- Reasoning by few-shot prompting with chain-of-thought, e.g. including worked examples whose answers spell out the intermediate reasoning steps
- Alternatively, reasoning by "just asking for it", e.g. posing the question followed by the instruction "think about it step by step" (zero-shot chain-of-thought)
- Extend self-criticism: not only ask the model to be critical of its answer, but also ask it to fix that answer
- Use an ensembling technique, e.g. take the output from 50 samples for the same question and do a majority vote on the final answers
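The ensembling step above (often called self-consistency) can be sketched as follows. `sample_answer` here is a deterministic stub standing in for a real model call; in practice each sample would be a fresh completion drawn at a non-zero temperature:

```python
import random
from collections import Counter

def sample_answer(question: str, rng: random.Random) -> str:
    """Stand-in for one stochastic LLM sample. A real setup would call
    the model with temperature > 0 and extract the final answer."""
    return rng.choice(["42", "42", "42", "42", "41"])  # mostly-right stub

def self_consistency(question: str, n_samples: int = 50, seed: int = 0) -> str:
    """Sample the model n times and majority-vote on the final answers."""
    rng = random.Random(seed)
    answers = [sample_answer(question, rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))
```

The vote is over the final answers themselves, so occasional wrong samples get outvoted as long as the model is right more often than not.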
The cost factor
All of these techniques come at a cost in latency, compute, or both. A simple rule of thumb: the more requests to the model, the higher the cost and latency. Techniques like self-criticism and ensembling are therefore going to rack up costs. Be mindful of these choices to keep the cost under control.
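The rule of thumb can be made concrete with a back-of-envelope calculation. The per-token price below is an assumed illustrative figure, not any real provider's rate:

```python
# Back-of-envelope cost model: cost grows linearly with model calls.
PRICE_PER_1K_TOKENS = 0.002  # assumed USD price; check your provider

def estimate_cost(calls_per_query: int, tokens_per_call: int,
                  queries: int) -> float:
    """Total cost for a workload, in USD under the assumed price."""
    total_tokens = calls_per_query * tokens_per_call * queries
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS

single = estimate_cost(calls_per_query=1, tokens_per_call=500, queries=1000)
ensembled = estimate_cost(calls_per_query=50, tokens_per_call=500, queries=1000)
print(single, ensembled)  # 50-sample ensembling costs 50x the single call
```

Whatever the actual price, a 50-sample ensemble multiplies the bill (and the latency, unless samples run in parallel) by 50.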
Footnotes
To read the part 2 article of this series — Click here
To read the part 3 article of this series — Click here
To read the part 4 article of this series — Click here
Note — This article is a distilled consolidation of my understanding of the topic. If you find any conceptual errors, please leave feedback so that I can fix them. Cheers!
References: