Advancing Chatbot Intelligence: Unlocking the Power of Step-Back Prompting

Csakash
9 min read · Feb 15, 2024


“Guided Implementation of Step-Back Prompting within LangChain: A Comprehensive Tutorial”

Image generated with Ideogram

Unlocking the Power of Chain-of-Thought Prompts in AI: A Journey into Coherent Reasoning

Welcome to my blog! If you’re intrigued by the inner workings of large language models (LLMs) and their applications, you’re in the right place. In this space, we’ll delve into a fascinating technique that has revolutionized AI products: Chain-of-Thought (CoT) prompting.

What Is CoT Prompting?

CoT prompting is like a trail of breadcrumbs leading through the forest of complex reasoning. Imagine a chatbot, a question-answering system, or an agent that doesn’t just provide answers but crafts a coherent sequence of intermediate steps. These steps bridge the gap between input and output, making the AI’s responses more meaningful and accurate.

The CoT Magic

But here’s the twist: CoT isn’t just about logic; it’s about storytelling. It weaves together facts, context, and reasoning, creating a narrative that captivates users. Whether you’re building a chatbot for customer support or an AI-driven research assistant, CoT can be your secret weapon.

Navigating Complex Terrain

Yet, even with state-of-the-art LLMs, some challenges persist. Scientific calculations, multi-fact questions — these rocky terrains can trip up even the most sophisticated models. But fear not! We’ll explore how to tackle these hurdles and emerge victorious.

So buckle up! Let’s unravel the mysteries of CoT prompts, one logical breadcrumb at a time. 🚀

Step-Back Prompting: Enhancing Reasoning in AI Models

When dealing with intricate questions — whether they involve physics principles or historical contexts — directly answering them can be challenging due to the level of detail required. Enter the Step-Back Prompting technique, which encourages AI models (such as large language models) to take a step back and consider higher-level concepts before diving into specifics.

Here’s how it works:

  1. Deriving High-Level Abstractions: Instead of immediately addressing the specific example, the AI model abstracts away from it. It identifies broader, more generalized concepts or principles related to the question.
  2. Guided Reasoning: Armed with these high-level abstractions, the AI model then navigates through the problem. It uses these concepts as guideposts for reasoning, leading to more accurate and coherent responses.

Think of it as adjusting the focus on a camera lens: by zooming out to see the bigger picture first, the AI model gains a better understanding of the context before delving into the details.

Below is a quick example comparing normal reasoning with Step-Back Prompting, as illustrated in the paper from Google DeepMind.
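As a paraphrased sketch of that comparison (the ideal-gas example comes from the paper; the wording below is mine, not a verbatim quote):

Original question: What happens to the pressure P of an ideal gas if the temperature is increased by a factor of 2 and the volume is increased by a factor of 8?
Direct reasoning: the model tries to work out the substitution in one pass and can easily slip on the arithmetic.
Step-back question: What are the physics principles behind this question?
Step-back reasoning: the model first recalls the ideal gas law PV = nRT, then applies it: with T doubled and V increased eightfold, the pressure drops to 2/8 = 1/4 of its original value.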

Step-Back Prompting: Unleashing the Power of Abstraction and Reasoning

In our previous discussion, we touched upon how the Chain-of-Thought strategy can enhance LLM responses. However, even with this strategy, there are cases where the LLM might stumble, especially when dealing with complex questions or multiple facts. Let’s dive deeper into one such scenario.

The Estella Leopold Example

Consider a question about Estella Leopold’s educational history. If the LLM directly tackles this question, it might miss crucial details. But what if we encourage the LLM to step back and think more broadly? Here’s how it could play out:

  1. Original Question: “What is Estella Leopold’s education history?”
  2. Step-Back Question: “What relevant information do we need to collect before answering?”
  3. Refined Search: The LLM now knows to look beyond the immediate question. It searches for a broader context, leading it to discover Estella’s educational journey.

The Impact of Abstraction-and-Reasoning

By applying this “abstraction-and-reasoning” approach, the LLM can provide a much more precise answer: in our example, it correctly reconstructs Estella Leopold’s educational path instead of missing key details by answering the narrow question directly.

PaLM-2L vs. GPT-4: The Battle of Reasoning

Google’s PaLM-2L model has demonstrated impressive performance gains using Step-Back Prompting, and on several complex reasoning benchmarks it even outperforms GPT-4. The reported improvement over the PaLM-2L baseline can be as high as 36%.

Testing the Waters with Llama 2

Now, let’s turn our attention to Llama 2 — an exciting open-source model. I propose putting Step-Back Prompting to the test. Can Llama 2 become a little smarter in solving real-world problems? Let’s find out!

Remember, just like Estella Leopold’s journey from specific examples to broader concepts, our Llama 2 exploration aims to uncover hidden gems of reasoning. 🌟

Implementing Step-Back Prompting with Streamlit

Streamlit is a powerful Python framework for creating interactive web apps from data scripts. With Streamlit, you can build and share machine learning and data science applications without needing front-end experience. Here’s how we can adapt the architecture.

Inspired by LangChain’s approach, I’ve developed an evaluation app using Streamlit. Our goal remains the same: to create an advanced Question-Answering (QA) system powered by the Llama-V2-70B-chat model.

Block Diagram Overview

  1. Step-Back Question Chain:
  • The user query and a few example prompts feed into a template designed for abstracting the question.
  • The chat model, composed with LCEL, generates the step-back questions from this prompt.
  • The CommaSeparatedListOutputParser component processes the output for the next steps.
  2. Retrieval LLM Chain:
  • The question list from the Step-Back Question Chain is sent to the DuckDuckGo API.
  • The API retrieves relevant Internet content by searching for each input question.
  • We combine the retrieved context and the user query in another prompt template to generate the final answer.
  • For this quick test, we’ll use the Llama-V2-70B-Chat model from FireworksAI. (A quick text sketch of this flow follows below.)
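Since the original block diagram is an image, here is a rough text sketch of the same two-chain flow (my own summary of the architecture described above):

user question
    |
    v
Step-Back Question Chain: few-shot prompt -> chat model -> CommaSeparatedListOutputParser
    |   (list of step-back questions)
    v
Retrieval LLM Chain: DuckDuckGo search per question -> consolidated context
    |
    v
response prompt (context + original question) -> Llama-V2-70B-Chat -> final answer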

1. Setup

With LCEL (LangChain Expression Language), the coding task is quite simple.

First, let’s install the dependencies with their latest versions.

pip install --upgrade langchain openai streamlit
pip install fireworks-ai duckduckgo-search

Import the necessary modules and set environment variables.

from langchain.chat_models import ChatOpenAI, ChatFireworks
from langchain.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.schema.runnable import RunnableLambda
from langchain.utilities import DuckDuckGoSearchAPIWrapper

import os
import time

os.environ["OPENAI_API_KEY"] = "Your_OpenAI_API_Key"

os.environ["FIREWORKS_API_KEY"] = "Your_Fireworks_API_Key"

chat_fw = ChatFireworks(model="accounts/fireworks/models/llama-v2-70b-chat", temperature=0)
chat_oa = ChatOpenAI(temperature=0)

FireworksAI is a platform offering rapid inference APIs for a range of open-source language models, including Llama-2. When you sign up, you receive a 1 USD credit, which allows you to conduct numerous inference tests at no cost. Additionally, you can create your API key and explore other available open models within your FireworksAI account.
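As a quick, optional sanity check that your Fireworks key and model name work (a minimal sketch using the chat_fw model defined above; the exact reply will vary):

# Optional smoke test: send a trivial message to the Llama-2 chat model
print(chat_fw.invoke("Say hello in one short sentence.").content)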

2. StepBack Question Chain

Now, we arrive at a crucial juncture — the step that directs the language model in generating step-back questions. To achieve this, we construct a prompt template using a few example prompts. It’s important to note that in my implementation, I instructed the model to generate two step-back questions instead of the single one defined in LangChain’s cookbook. The comma separating the questions will prove useful for later output parsing.

# Few Shot Examples
examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "What can the members of The Police do?, What is lawful arrests?"
    },
    {
        "input": "Jan Sindel’s was born in what country?",
        "output": "what is Jan Sindel’s personal history?, What are the common countries?"
    },
    {
        "input": "Who is taller, Yao Ming or Shaq?",
        "output": "what is the height of Yao Ming?, What is the height of Shaq?"
    },
]

# We now transform these to example messages
example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an expert at world knowledge.
Your task is to step back and abstract the original question
to some more generic step-back questions,
which are easier to answer. Here are a few examples:"""),
    few_shot_prompt,
    ("user", "{question}"),
])

Now, we can create the first chain question_gen_chain to generate step-back questions simply using expression language.

question_gen_chain = prompt | chat_oa | CommaSeparatedListOutputParser()

If you would like to test the quality of step-back question generation, simply invoke this chain with a random question.

question = """If you have 3 moles of nitrogen and 4 moles of hydrogen 
to produce ammonia, which one will get exhausted first
assuming a complete reaction?"""
question_list = question_gen_chain.invoke({"question": question})
print("Question List: ", question_list)

Here is the printed output. The chain generated two broader chemistry questions that give background for the original question, returned as a list of strings.

['What is the balanced chemical equation for the reaction between nitrogen and hydrogen to produce ammonia?', 'What is the stoichiometry of the reaction?']

That’s great, let’s make the next chain to answer these questions.

3. Retrieval LLM Chain

In this chain, we first define a search tool using the DuckDuckGo API, which is free to use. We then define a helper function retriever_list() that searches for each question in the string list and consolidates the search results.

search = DuckDuckGoSearchAPIWrapper(max_results=4)

def retriever_list(query):
    answer = ''
    ques = ''
    for question in query:
        # Join fragments with '/' in case the comma parser split a question apart
        ques += question
        ques += '/'
        # Only run the search once we have a complete question ending with '?'
        if question.endswith('?'):
            ans = search.run(ques)
            ques = ''
            answer += ans
            time.sleep(2)  # brief pause to avoid hammering the free DuckDuckGo API
    print("Answer: ", answer)
    return answer
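If you want to sanity-check the retriever on its own, you can feed it the step-back questions generated earlier (a quick sketch reusing question_list from the previous step; the search results themselves will vary):

# Quick test of the retriever with the step-back questions from section 2
context = retriever_list(question_list)
print(context[:300])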

Now we can create the chain by adding the search result as step_back_context into the prompt.

response_prompt_template = """You are an expert of world knowledge.
I am going to ask you a question. Your response should be concise
and refer to the following context if it is relevant.
If it is not relevant, ignore it.
{step_back_context}
Original Question: {question}
Answer:"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)
chain = {
    "step_back_context": question_gen_chain | retriever_list,
    "question": lambda x: x["question"]
} | response_prompt | chat_fw | StrOutputParser()
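To try the whole step-back chain end to end, we can reuse the ammonia question from above (a minimal sketch; the model’s exact wording will differ between runs):

# End-to-end run of the step-back chain on the ammonia question
answer = chain.invoke({"question": question})
print("Step-Back Answer: ", answer)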

That’s all the code for step-back prompting with the Llama-2 model. To gauge the improvement from this tactic, we can also create a normal reasoning chain that answers the same question without the step-back strategy, as a baseline.

def retriever(query):
    return search.run(query)

chain_nostep = {
    "step_back_context": RunnableLambda(lambda x: x['question']) | retriever,
    "question": lambda x: x["question"]
} | response_prompt | chat_fw | StrOutputParser()
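Running both chains on the same question makes the comparison easy (again, a sketch; outputs will vary between runs):

# Side-by-side comparison of normal vs. step-back prompting
answer_nostep = chain_nostep.invoke({"question": question})
print("Normal Answer: ", answer_nostep)
print("Step-Back Answer: ", answer)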

There is one further step to make this app more convenient than calling chain.invoke() over and over. Let’s give it a quick chatbot UI.

4. Streamlit UI

Streamlit provides a straightforward way to create interactive web apps from Python scripts. Here’s how we can wrap the chains above in a simple UI:


import streamlit as st

# Streamlit app wiring together the chains defined above
# (assumes chain and chain_nostep live in the same script, stepback_app.py)
def main():
    st.title("Step-Back Prompting App")
    user_input = st.text_input("Enter your question:")

    if st.button("Generate Answer"):
        # Step-Back Question Chain + Retrieval LLM Chain, composed in `chain`
        final_answer = chain.invoke({"question": user_input})
        st.write("Step-Back Prompting:")
        st.write(final_answer)

        # You can also display the normal prompting result here if needed
        # response_nostep = chain_nostep.invoke({"question": user_input})
        # st.write("Normal Prompting:")
        # st.write(response_nostep)

if __name__ == "__main__":
    main()


In this Streamlit app:

  1. The user enters a question.
  2. Upon clicking the “Generate Answer” button, chain first generates the step-back questions and then retrieves context for them.
  3. The final answer is generated from that context and the original question by the Llama-2 model.
  4. The result is displayed on the screen (uncomment the last lines to also show the baseline answer from chain_nostep).

5. Test

Run Your Streamlit App:

  • In the terminal, navigate to the directory where your stepback_app.py is located.
  • Enter the following command:
  • streamlit run stepback_app.py
  • This will start the Streamlit server.
  • Open your web browser and visit localhost:8501 (Streamlit’s default port).
  • Type your query into the app and explore the results.

Conclusion

In our journey through Chain-of-Thought (CoT) prompting and Step-Back techniques, we’ve explored how these strategies empower large language models (LLMs) to reason more effectively. By encouraging LLMs to step back, consider broader contexts, and derive high-level abstractions, we pave the way for more accurate and insightful responses.

The Impact of Stepping Back

  1. Coherent Reasoning: CoT and Step-Back Prompting create a breadcrumb trail of logic, weaving together facts, context, and reasoning. It’s like connecting the dots to form a compelling narrative.
  2. Engaging Responses: Weaving retrieved facts, context, and reasoning into one coherent answer bridges the gap between technical content and the user’s actual question, making responses easier to follow.
  3. Performance Boost: Google’s PaLM-2L model demonstrates up to a 36% improvement using Step-Back Prompting. It’s a testament to the technique’s effectiveness.


Remember, sometimes taking a step back is the key to leaping forward. 🚀

If you found this article helpful, don’t hesitate to give it a virtual round of applause! Your support fuels our curiosity and creativity. 🙌🔍
