System Prompt Improvement Using Dialog Engineering and Burr

Dream AI
Nov 9, 2024


Author: Hamza Farhan at DreamAI Software (Pvt) Ltd

We’ll use the chat history and annotations in Burr to update our system prompt based on our usage and evolving requirements, letting the dialog do the prompt engineering for us, as described in Dialog Engineering vs Prompt Engineering.

Dialog Class

A very basic Pydantic BaseModel to help us keep track.

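from pydantic import BaseModel, Field

# Content, Messages and MODEL are defined elsewhere in the project.
class Dialog(BaseModel):
    system: Content = ""
    messages: Messages = Field(default_factory=list)
    model_name: str = MODEL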

system holds the initial user instructions for the model. This defines the model’s task/behavior/persona.

messages is a list of messages. These alternate between user and assistant messages. Empty list by default.

MODEL here is “gemini-1.5-flash-002” by default.
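
The post does not show how Content, Messages, and MODEL are defined, so here is a minimal, self-contained sketch under some assumptions: Content is treated as plain text and Messages as a list of role/content dicts. The real project may well use richer types.

```python
from typing import Dict, List

from pydantic import BaseModel, Field

# Assumed stand-ins: the post does not show these definitions.
MODEL = "gemini-1.5-flash-002"
Content = str                    # assumption: plain-text content
Messages = List[Dict[str, str]]  # assumption: [{"role": ..., "content": ...}, ...]


class Dialog(BaseModel):
    system: Content = ""
    messages: Messages = Field(default_factory=list)
    model_name: str = MODEL


first = Dialog(system="Talk like a pirate.")
second = Dialog()

# default_factory builds a fresh list per instance, so dialogs never share history.
first.messages.append({"role": "user", "content": "thoughts on messi?"})
print(len(first.messages), len(second.messages))
```

The point of default_factory=list is that every new Dialog starts with its own empty history, which matters once we start appending messages during a run.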

Initial Dialog

Dialog(
    system="Talk like a pirate.",
    messages=[user_message("thoughts on messi?"),
              assistant_message("arr! he be amazing cap'n")],
    model_name=MODEL,
)

Few-shot Examples

What about adding some examples of how we want the model to talk? Shouldn’t it be like this?

system="""
Talk like a pirate.

Example:
user: thoughts on messi?
assistant: arr! he be amazing cap'n
"""

No. In Dialog Engineering, the previous messages ARE the examples. The model uses the chat history to learn how to act. We have already added a user-assistant interaction to our messages, and we could add more if we want. As shown in this blog by LangChain, this can lead to better results than adding examples in the prompt.
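
To make the idea concrete, here is a small sketch of turning few-shot examples into seed messages instead of pasting them into the system prompt. The function name examples_to_messages and the dict-based message shape are assumptions made for illustration; they are not part of the original code.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}


def examples_to_messages(examples: List[Tuple[str, str]]) -> List[Message]:
    """Turn (user, assistant) example pairs into alternating chat messages."""
    messages: List[Message] = []
    for question, answer in examples:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    return messages


# The system prompt stays short; the examples live in the chat history.
system = "Talk like a pirate."
seed = examples_to_messages([("thoughts on messi?", "arr! he be amazing cap'n")])
print(seed)
```

Adding another example is then just another pair in the list, and the system prompt never has to grow.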

Burr

Let’s create a Burr application that accepts user input, and adds it as a user message to our messages, then generates and adds an assistant message to our messages. It does this until the user input is “q”.

from burr.core import ApplicationBuilder
from burr.integrations.pydantic import PydanticTypingSystem
from burr.tracking import LocalTrackingClient

# add_user_message, add_assistant_message, creator and logger are defined elsewhere.
tracker = LocalTrackingClient(project="EXP")
app = (
    ApplicationBuilder()
    .with_identifiers(app_id="exp1")
    .with_actions(
        add_user_message.bind(template=""),
        add_assistant_message.bind(creator=creator),
    )
    .with_typing(PydanticTypingSystem(Dialog))
    .with_state(
        Dialog(
            system="Talk like a pirate.",
            messages=[user_message("thoughts on messi?"), assistant_message("arr! he be amazing cap'n")],
            model_name=MODEL,
        )
    )
    .with_entrypoint("add_user_message")
    .with_transitions(("add_user_message", "add_assistant_message"), ("add_assistant_message", "add_user_message"))
    .with_tracker(tracker=tracker)
    .build()
)
last_action = "add_assistant_message"
user_input = ""
while True:
    # Only ask for input when the assistant has just replied.
    if last_action == "add_assistant_message":
        user_input = input("> ")
        if user_input.lower() == "q":
            break
    step_result = app.step(inputs={"content": user_input})
    if step_result is None:
        logger.error("app.step returned None")
        break
    last_action = step_result[0].name

The creator is a function that takes some input and calls an LLM to generate an output.
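
If it helps to see the control flow without any framework, here is a plain-Python stand-in for the two actions and the loop. This is not Burr's API and not the original implementation: the state is a simple dict, the inputs are scripted instead of read from the keyboard, and stub_creator is a hypothetical replacement for a real LLM call.

```python
from typing import Callable, Dict, Iterable, List

Message = Dict[str, str]
Creator = Callable[[str, List[Message]], str]  # (system, messages) -> reply text


def stub_creator(system: str, messages: List[Message]) -> str:
    """Stand-in for a real LLM call; it just echoes the last user message."""
    return f"[{system}] you said: {messages[-1]['content']}"


def add_user_message(state: dict, content: str) -> dict:
    """Return a new state with the user's input appended as a user message."""
    new = {"role": "user", "content": content}
    return {**state, "messages": state["messages"] + [new]}


def add_assistant_message(state: dict, creator: Creator) -> dict:
    """Return a new state with the creator's reply appended as an assistant message."""
    reply = creator(state["system"], state["messages"])
    new = {"role": "assistant", "content": reply}
    return {**state, "messages": state["messages"] + [new]}


def run(state: dict, inputs: Iterable[str], creator: Creator) -> dict:
    """Alternate user and assistant steps until the input is 'q'."""
    for user_input in inputs:
        if user_input.lower() == "q":
            break
        state = add_user_message(state, user_input)
        state = add_assistant_message(state, creator)
    return state


start = {"system": "Talk like a pirate.", "messages": []}
final = run(start, ["and ronaldo?", "q", "ignored"], stub_creator)
print([m["role"] for m in final["messages"]])
```

Swapping stub_creator for a function that calls your LLM of choice is the only change needed to make the loop talk to a real model.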

The purpose of this blog is to describe a workflow and a concept, so I will not get into the code. Burr’s documentation is pretty simple and it doesn’t take long to get up and running.

Running The Application

Now, let’s use Burr’s tracking and observability to see what’s going on. Each screenshot will have the user input above it.

Input: “and ronaldo?”

Notice how we are at step 0 but we already have some messages. This State is the initial Dialog we had set, and our input has now been added as a user message.

Result

The response has been added as an assistant message.

Input: “from now on, talk like a professional sports commentator”

Looks like we changed our mind. We no longer want the pirate persona.

Input: “thoughts on neymar?”

Let’s see if anything changes.

No pirate behavior. But look at our system. It still wants the model to talk like a pirate. Once we end this app run and start a new one, we want the system to reflect our new preferences. This is super easy with Dialog Engineering and Burr.

Annotations

What if we exit the application after sending in “q”, but have some feedback to give? Something that didn’t come up during the run? That’s where Burr’s Annotations come into play. We can add/edit an annotation by clicking the plus/edit button after the green success tag in any step. We can add thumbs up/down, notes, tags, or even a ground truth. Let’s say we want our model to give more details in its responses.

These annotations are saved as a jsonl file in the project’s directory (~/.burr/<project_name>) by default.
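
Reading a jsonl file is straightforward: one JSON object per line. Here is a minimal sketch of a loader. The file name annotations.jsonl and the record contents in the demo are assumptions made for illustration, so check the project directory for the actual file and its fields.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def load_jsonl(path: Path) -> List[Dict]:
    """Parse a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Demo on a temporary file; the name and contents are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "annotations.jsonl"
    demo.write_text('{"note": "more detail"}\n\n{"note": "shorter"}\n', encoding="utf-8")
    loaded = load_jsonl(demo)

print(loaded)
```

For a real project you would point load_jsonl at the annotations file inside the project's directory, expanding ~ with Path.expanduser() first.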

What if we think of these annotations as additional user messages? We can load the previous app’s Dialog, load the project’s annotations file, add a user message for every annotation at the correct place in our messages, and then use those messages to update our system for future runs.
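
Here is a rough sketch of the "add a user message for every annotation" step. It is not the actual add_anns_to_dialog implementation. It assumes each piece of feedback already comes paired with the index of the message it refers to (how an annotation's step maps to a message index is left out), and the names feedback_message and add_feedback are hypothetical.

```python
import json
from typing import Dict, List, Tuple

Message = Dict[str, str]


def feedback_message(observations: List[dict]) -> Message:
    """Wrap annotation observations in a user message."""
    body = json.dumps(observations)
    content = f"Some user feedback:\n\n<observations>\n{body}\n</observations>"
    return {"role": "user", "content": content}


def add_feedback(messages: List[Message], feedback: List[Tuple[int, List[dict]]]) -> List[Message]:
    """Insert one feedback message right after each annotated message index."""
    result = list(messages)
    # Insert from the highest index down so earlier insert positions stay valid.
    for index, observations in sorted(feedback, key=lambda item: item[0], reverse=True):
        result.insert(index + 1, feedback_message(observations))
    return result


history = [
    {"role": "user", "content": "thoughts on messi?"},
    {"role": "assistant", "content": "arr! he be amazing cap'n"},
    {"role": "user", "content": "and ronaldo?"},
    {"role": "assistant", "content": "Avast, matey!"},
]
note = [{"data_fields": {"note": "Some more details would be nice."}}]
with_feedback = add_feedback(history, [(3, note)])
print([m["role"] for m in with_feedback])
```

Placing the feedback right after the response it refers to keeps the context clear for the model: it can see exactly which answer the note is about.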

Improvement

dialog = add_anns_to_dialog(project_dir="~/.burr/EXP", app_id="exp1")
update_prompt = f"""
These were the initial system instructions:
{dialog.system}

Based on the chat history so far and user feedback (if any), \
give me an updated version of the system instructions along with your reasoning.
Don't mention the chat history in the updated instructions.
Make the instructions as detailed and guided as possible.
The user feedback is very important and should be incorporated into the instructions if it's given.
"""
dialog.messages.append(user_message(content=update_prompt))

Now let’s look at our messages:

"messages": [
    {
        "role": "user",
        "content": "thoughts on messi?"
    },
    {
        "role": "assistant",
        "content": "arr! he be amazing cap'n"
    },
    {
        "role": "user",
        "content": "and ronaldo?"
    },
    {
        "role": "assistant",
        "content": "Avast, matey! Ronaldo be a fearsome pirate on the pitch, a true legend, but even he shivers at the sight o' Messi's skills. Messi be like a kraken, unpredictable and powerful, a force o' nature!"
    },
    # Feedback Message
    {
        "role": "user",
        "content": "Some user feedback:\n\n<observations>\n[{\"data_fields\": {\"note\": \"Some more details would be nice. 3-4 lines while also mentioning achievements.\", \"ground_truth\": \"\"}, \"data_pointers\": [], \"thumbs_up\": false, \"thumbs_down\": false}]\n</observations>"
    },
    {
        "role": "user",
        "content": "from now on, talk like a professional sports commentator"
    },
    {
        "role": "assistant",
        "content": "And that's the match, folks! What a performance. A true display of athletic prowess and strategic brilliance. We'll be discussing this one for weeks to come."
    },
    {
        "role": "user",
        "content": "thoughts on neymar?"
    },
    {
        "role": "assistant",
        "content": "Neymar, a player of undeniable talent and flair. His skill on the ball is breathtaking, capable of moments of sheer magic. However, consistency has sometimes been an issue, and questions remain about his overall impact on a team's success. A player who can change a game in an instant, but also one who can frustrate with moments of inconsistency."
    },
    # Update Prompt
    {
        "role": "user",
        "content": "\nThese were the initial system instructions:\nTalk like a pirate.\n\nBased on the chat history so far and user feedback (if any), give me an updated version of the system instructions along with your reasoning. Don't mention the chat history in the updated instructions. Make the instructions as detailed and guided as possible. The user feedback is very important and should be incorporated into the instructions if it's given.\n"
    }
],

Notice that the feedback message has been added after the Ronaldo response and the latest message asking for updated instructions has been added at the end.

If we send this to our LLM, we get this:

Updated Instructions:
Adopt the persona of a professional sports commentator. Your responses should be insightful, detailed, and engaging. When discussing players, provide a balanced perspective, highlighting both their strengths and weaknesses. Aim for 3–4 lines per response, mentioning specific achievements and providing analysis, rather than simple statements of opinion. Maintain a professional and objective tone, avoiding overly casual language. Your responses should be informative and add value for the user.

Reasoning:
The original instructions to ‘talk like a pirate’ were discarded due to the user feedback requesting more detail and a less informal tone. The updated instructions now reflect this feedback, providing explicit guidance on the style, length, and content of the responses, focusing on detailed analysis and objective commentary.

Notice how the system now has the sports commentator part from the chat history and the 3–4 lines part from our feedback. We can now use these Updated Instructions as our system moving forward.
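
To reuse the result programmatically, we need to pull the instructions out of the reply. Here is a minimal sketch that assumes the reply uses the two headings seen above, "Updated Instructions:" and "Reasoning:". An LLM is not guaranteed to stick to that format, so asking for structured output would be more robust. The function name split_update is hypothetical.

```python
from typing import Tuple


def split_update(reply: str) -> Tuple[str, str]:
    """Split an LLM reply into (updated_instructions, reasoning)."""
    instructions_marker = "Updated Instructions:"
    reasoning_marker = "Reasoning:"
    start = reply.find(instructions_marker)
    if start == -1:
        raise ValueError("no 'Updated Instructions:' section found")
    start += len(instructions_marker)
    end = reply.find(reasoning_marker, start)
    if end == -1:
        # No reasoning section after the instructions: take the rest as instructions.
        return reply[start:].strip(), ""
    return reply[start:end].strip(), reply[end + len(reasoning_marker):].strip()


reply = (
    "Updated Instructions:\n"
    "Adopt the persona of a professional sports commentator.\n\n"
    "Reasoning:\n"
    "The user asked for a different persona."
)
new_system, reasoning = split_update(reply)
print(new_system)
```

The new_system string can then be passed as the system of a fresh Dialog for the next run.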
