Build your own digital Doraemon using ReAct LLM Agents

Jason Ng
11 min read · Oct 29, 2023
Doraemon and the many gadgets he has in his 4D pocket.

Introduction

Doraemon is a familiar character to many, including myself. Growing up, I was amazed by the many gadgets he keeps in his pocket and how he always knows when to use them. With the growing popularity of Large Language Models (LLMs), you too can build something that behaves the same way!

We will be building an agent (the term is nicely defined here). We will focus on a common type of agent called ReAct, which uses an LLM as its main engine to break down tasks, reason, and use a specific set of tools appropriately. You can read more about the ReAct framework in this paper.

In a previous article, I wrote about building a simple Retrieval Augmented Generation (RAG) tool for food reviews. Now it's time to take it a step further and build a food review agent that can not only retrieve relevant food review data, but also handle geolocation filtering whenever necessary.

The Plan

We will first define the LLM we will be using as the core engine of our agent, then define the functions and methods to be incorporated as tools. Thereafter, we will add them to an agent that we will initialise. Some customisation will be made along the way to improve the results. The tools to be included in this agent are:

a) Database retriever

b) Geolocation retriever

c) Database retriever with geolocation filtering

Some techniques from this wonderful article, such as metadata filtering and prompt engineering, will be applied along the way to improve the results of the retrieval process. Let’s begin!

1. Defining our LLM

To keep things simple and hassle-free, we will be using the OpenAI GPT-3.5 Turbo model. I will be accessing it via Azure OpenAI. You can also use the API service directly from OpenAI, or even open-source LLMs (though you will then need to worry about compute resources).

import os
from langchain.chat_models import AzureChatOpenAI

# define env variables for AzureOpenAI model
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "YOUR_ENDPOINT_HERE"
os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY_HERE"
os.environ["OPENAI_API_VERSION"] = "2023-03-15-preview"

# instantiate the LLM
llm = AzureChatOpenAI(
    deployment_name='YOUR_DEPLOYMENT_NAME_HERE',
    temperature=0
)

2. Create Tools

We will first create a vector store that contains the food reviews and their embeddings. The embeddings are important for the agent to retrieve relevant results for a search query. A DeepLake vector store will be used for this experiment because it is easy to set up.

Using a CSV data source similar to the one in my previous article, we can create the vector store. One customisation needed is to add the coordinates of the food venues to the metadata of the documents; this will be useful for the geolocation-filtering tool later on.

import pandas as pd
from langchain.vectorstores import DeepLake
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.document_loaders.csv_loader import CSVLoader

# instantiate OpenAIEmbeddings to create embeddings from the document content
embeddings = OpenAIEmbeddings(deployment="EMBEDDING_DEPLOYMENT_NAME", chunk_size=16)

# instantiate CSV loader and load food reviews with the review link as source
loader = CSVLoader(file_path='final_data.csv', csv_args={
    "delimiter": ",",
}, encoding='utf-8', source_column='review_link')
data = loader.load()

# augment the documents' metadata with lat and long values; this is needed for one of our tools later on
df = pd.read_csv("final_data.csv", index_col=False)

df_lat = df.lat.values.tolist()
df_long = df.long.values.tolist()

for lat, long, item in zip(df_lat, df_long, data):
    item.metadata["lat"] = lat
    item.metadata["long"] = long

# create the DeepLake vector store and add the documents
db = DeepLake(
    dataset_path="./my_deeplake/", embedding=embeddings, overwrite=True
)
db.add_documents(data)

a) Database Retriever

We shall now create our first tool: the food review database retriever. Instead of simply using the as_retriever() method of the vector store object, I will define a custom retriever so that I can filter out documents that fall below a certain similarity threshold. There is a similarity_score_threshold option available within the LangChain library, but I have tried it and it seemed buggy at the time of writing.

# defining a custom retriever
def search(query):
    # retrieve the top 5 most similar reviews using cosine similarity
    search_results = db.similarity_search_with_score(
        k=5,
        query=query,
        distance_metric='cos'
    )

    # only keep reviews with a similarity score of 0.8 and above
    filtered_res = list(filter(lambda x: x[1] >= 0.8, search_results))
    filtered_res = [t[0] for t in filtered_res]

    return filtered_res

This simple search function can now act as our tool whenever the user wants to retrieve relevant food reviews based on a text query.
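As a quick sanity check, we can call the function directly. The query below is purely illustrative; any food-related text query will work the same way.

# illustrative query; returns a list of LangChain Document objects
results = search("affordable ramen with rich broth")
for doc in results:
    print(doc.metadata["source"])  # the review link stored as the document source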

b) Geolocation Retriever

We will now move on to create our geolocation retriever tool. It needs to take in the name of a place and return its latitude and longitude coordinates. I will be using the Foursquare API, which has a free tier, to search for the place and return its coordinates. You can easily get an API key for free from the link above.

import requests

# headers for the API call, which include the Foursquare API key
headers = {
    "Accept": "application/json",
    "Authorization": "YOUR_API_KEY"
}

# function to retrieve coordinates based on the most relevant search result on the Foursquare platform
def get_coordinates(address):
    url = f'https://api.foursquare.com/v3/places/search?query={address.lower().strip().replace(" ", "%20")}&near=singapore'

    req = requests.get(url, headers=headers)
    results_dict = req.json()  # parse the JSON response properly instead of using eval()

    # take the top search result and extract its name and coordinates
    result_name = results_dict['results'][0]['name']
    result_geo = results_dict['results'][0]['geocodes']['main']

    return (result_geo['latitude'], result_geo['longitude']), result_name
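A quick illustrative call is shown below; the exact place name and coordinates returned depend on Foursquare's top search result at the time of the request.

# hypothetical example; output values depend on Foursquare's search results
coordinates, place_name = get_coordinates("Tanjong Pagar MRT")
print(place_name)    # e.g. 'Tanjong Pagar MRT Station'
print(coordinates)   # e.g. (1.2765, 103.8457)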

c) Database retriever with geolocation filtering

Lastly, we can create the database retriever with geolocation filtering. Users sometimes want to find a relevant food place near a specific location, which means the database results need to be filtered geographically. By calculating the haversine distance between the coordinates of that location and the coordinates of every place in the database, we can create a filter that removes places further away than a set distance.

from haversine import haversine as hs
import deeplake

# create a filter function for DeepLake to use
# the deeplake.compute decorator is needed for filter functions with multiple arguments
@deeplake.compute
def filter_fn(x, coor):
    venue_lat = float(x['metadata'].data()['value']['lat'])
    venue_long = float(x['metadata'].data()['value']['long'])

    dist = hs(coor, (venue_lat, venue_long))
    return dist <= 0.8  # listings more than 800m away from the target location are filtered out

# create a function to do similarity search with geolocation filtering
def search_with_filter(prompt_coor_string):

    # separate the search query and coordinates from the input string and create a coordinate float tuple
    params_ls = prompt_coor_string.split(" | ")
    prompt = params_ls[0].replace('"', '')
    coor_ls = params_ls[1].split(", ")
    coor = (float(coor_ls[0]), float(coor_ls[1]))

    search_results = db.similarity_search_with_score(
        k=5,
        query=prompt,
        filter=filter_fn(coor),
        distance_metric='cos'
    )

    # only keep reviews that have a similarity score of 0.8 and above
    filtered_res = list(filter(lambda x: x[1] >= 0.8, search_results))
    filtered_res = [t[0] for t in filtered_res]

    return filtered_res

Based on the example below, you can see that only relevant omakase restaurants near Tanjong Pagar MRT Station were returned.

# an example of how the function will be used
# the coordinates of Tanjong Pagar MRT Station are used
search_with_filter('"omakase courses" | 1.276525, 103.845725')
[Document(page_content="place_title: Kei Hachi\nplace_url: https://www.burpple.com/kei-hachi?bp_ref=%2Ff%2FWr8X_PCG\nfood_desc_title: Best Meal I've Had So Far\nfood_desc_body: Just some random photos of dishes served during their Kei Hachi Lunch Omakase ($128) because its too hard to choose specific favourites from their beautiful course meal. All of their food in the course are right on point and so darn delicious. Each food item is presented like a form of art and paired with the beautiful ambience, this is one hell of a treat. You get a huge variety of different food preparation styles and definitely a filling meal by the end of the course. Loved the hospitality of the chefs and the servers. Highly recommended if you love Japanese food and would like a good treat! Indeed the best meal I have ever had so far 😍😍\nreview_link: https://www.burpple.com/f/R6qr9qKk\npos_prob: 0.9908563\nnum_stars: 5\nvenue_price: ~$130/pax\nvenue_tag: Date Night\nFine Dining\nJapanese\nvenue_loc: https://www.google.com/maps/search/?api=1&query=1.279512,103.8415799\nlat: 1.279512\nlong: 103.8415799\nnearest_stations: Maxwell, Outram Park, Chinatown, Tanjong Pagar", metadata={'source': 'https://www.burpple.com/f/R6qr9qKk', 'row': 401, 'lat': 1.279512, 'long': 103.8415799}),
Document(page_content="place_title: KYUU By Shunsui\nplace_url: https://www.burpple.com/kyuu-by-shunsui?bp_ref=%2Ff%2F9TUCRyhw\nfood_desc_title: Do come here if you love your ikura!\nfood_desc_body: Have heard about their unique omakase where you get enormous quantities of ikura and we were pretty impressed!\nThe standard omakase ($128) comprises of 9 courses and there is indeed a huge variety. From sashimi to tempera to grilled dishes, they have them all. These are just some of the dishes we had during the course and I must say I was very surprised by the beauty of the plating of their dishes!\nAs for the food quality, all dishes were of decent quality definitely. The sashimi pieces were decently fresh and the Wagyu Beef was pretty tender but not melt in the mouth though. Though I must admit I was not expecting much when I was served corn, but that is probably the sweetest, juiciest and most delicious corn I've had.\nIf you love your ikura, this is definitely a place to check out. To make it less painful on your wallet, you can get some discounts via @chopesg vouchers! Definitely a place to bring your loved ones for a celebration!\nreview_link: https://www.burpple.com/f/bAd77B8k\npos_prob: 0.8500501\nnum_stars: 4\nvenue_price: ~$130/pax\nvenue_tag: Late Night\nJapanese\nDinner With Drinks\nvenue_loc: https://www.google.com/maps/search/?api=1&query=1.2799799,103.841516\nlat: 1.2799799\nlong: 103.841516\nnearest_stations: Maxwell, Outram Park, Chinatown, Tanjong Pagar", metadata={'source': 'https://www.burpple.com/f/bAd77B8k', 'row': 207, 'lat': 1.2799799, 'long': 103.841516}),
Document(page_content="place_title: Teppei Japanese Restaurant (Orchid Hotel)\nplace_url: https://www.burpple.com/teppei-japanese-restaurant?bp_ref=%2Ff%2FSjq7Uauy\nfood_desc_title: Very Good Omakase, Worth The Price\nfood_desc_body: One will find a gastronomical experience here definitely, as you will experience so many different flavour profiles from their dinner omakase ($100). We realised that we really cannot pick a favourite among the dishes served as most were so darn good. All seafood ingredients served are really fresh, and you really don't need the soya sauce because they are all so flavourful! There is a total of about 15-17 courses, and although most were small bites, they were more than enough to make us full. In fact many in the restaurant were saying that they were already filled towards the end of the course! Really felt like it's worth the price 😊\nreview_link: https://www.burpple.com/f/PWJbmM5Z\npos_prob: 0.95687157\nnum_stars: 5\nvenue_price: ~$100/pax\nvenue_tag: Sushi\nChirashi\nSeafood\nDate Night\nJapanese\nvenue_loc: https://www.google.com/maps/search/?api=1&query=1.276993,103.843891\nlat: 1.276993\nlong: 103.843891\nnearest_stations: Tanjong Pagar, Maxwell, Shenton Way, Outram Park, Chinatown", metadata={'source': 'https://www.burpple.com/f/PWJbmM5Z', 'row': 413, 'lat': 1.276993, 'long': 103.843891})]

3. Create the agent

Now that we have defined the functions and methods to use as tools, we can put them all together. To initialise the set of tools the agent can use, we need to further define a name and description for each of them. Both are extremely important for a ReAct agent, as the LLM relies heavily on them to understand what each tool is for and when it should be used. Crafting them properly is an iterative process, and different test cases should be used to refine them.

Notice that I defined the format of the inputs for functions that require more than one parameter. This would not be necessary with the STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION agent type, which allows multiple inputs, but it seemed buggy at the time of writing. Feel free to try it out if it works for you — a rough sketch of that alternative is shown after the tool definitions below.

from langchain.agents import AgentType, Tool, initialize_agent

# initialise the 3 tools to be added to the agent
# Note: the name and description of each tool are very important for a ReAct agent type, as it relies on them heavily to decide when to use the tools
tools = [
    Tool(
        name="Database Search",
        func=search,
        description="Useful for querying food review data. Input should be the user text query to find relevant food review articles."
    ),
    Tool(
        name="Database Search with Distance Filter",
        func=search_with_filter,
        description="Useful for searching food places near specific locations. This function is to be used after Geolocation Search. The input string should be a text search query and a set of latitude and longitude coordinates separated by |."
    ),
    Tool(
        name="Geolocation Search",
        func=get_coordinates,
        description="Useful to retrieve latitude and longitude coordinates and name of place from a defined location. This is used before Database Search with Distance Filter. Input should be the place location given by the user. Output should be a tuple containing latitude and longitude coordinates and name of place."
    )
]
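For completeness, here is a rough sketch of what the multi-input alternative mentioned earlier could look like. The wrapper function and tool below are purely illustrative, and as noted above, this agent type seemed buggy for me at the time of writing.

from langchain.agents import AgentType, initialize_agent
from langchain.tools import StructuredTool

# hypothetical wrapper that exposes separate arguments instead of a single "query | lat, long" string
def search_near(query: str, latitude: float, longitude: float):
    return search_with_filter(f'"{query}" | {latitude}, {longitude}')

structured_tools = [
    StructuredTool.from_function(
        func=search_near,
        name="Database Search with Distance Filter",
        description="Search food reviews near a given latitude and longitude."
    )
]

# a structured chat agent accepts tools with multiple inputs
structured_agent = initialize_agent(
    structured_tools,
    llm,
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)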

I realised there were certain issues with the default agent: it might sometimes use its own knowledge to answer questions, and the output format needed to be standardised to the one discussed in my previous article. These issues can be addressed by editing the prompt template of the agent, specifically its prompt prefix.

# Define prefix of LLM agent
# Note: this is where you can prompt engineer the template for the agent, so that the agent understands clearly on what tasks it aims to do and how it should format its answers.
PREFIX = """
Answer the following questions as best you can. You have access to the following tools below that can help to find food reviews. Only answer questions that are related to the use of the tools given.
If the question is unrelated, reject the question. If no answer is found, just say that you do not have any relevant results. Return the final answer if you think you are ready. In your final answer, you MUST list the reviews in the following format:

Here are my recommendations:
🏠 [Name of place]
<i>[venue tags]</i>
✨ Avg Rating: [Rating of venue]
💸 Price: [Estimated price of dining at venue] (this is optional. If not found or not clear, use a dash instead.)
📍 <a href=[Location of venue] ></a>
📝 Reviews:
[list of review_link, separated by line breaks] (Use this format: 1. <a href=[review_link] >[food_desc_title text]</a>)

If you cannot find any reviews to respond to the user, just say that you don't know.
"""

All that is left now is to initialise the agent with the correct parameters and test it out.

# construct the agent
# Note: sometimes the agent will hallucinate and not format its answer properly. Setting handle_parsing_errors to True allows the agent to retry when its output cannot be parsed.
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
    agent_kwargs={
        'prefix': PREFIX
    }
)

Example: Do you know any restaurants that have omakase courses near Tanjong Pagar MRT?

Running the agent on the example above with verbose output, you can see that the LLM figures out the steps needed for the task and identifies the tool required for each step.
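The call itself is a single line; a minimal sketch using the agent's standard run interface:

# run the agent on the example question; verbose=True prints the reasoning trace shown below
agent.run("Do you know any restaurants that have omakase courses near Tanjong Pagar MRT?")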

Thought process of the LLM before arriving at the final answer.

The output is as intended: restaurants near Tanjong Pagar MRT Station that offer omakase courses, and places I have actually been to before. The output also follows the HTML format I specified in the prefix template.

Here are my recommendations:
🏠 Kei Hachi
<i>Date Night, Fine Dining, Japanese</i>
✨ Avg Rating: 5
💸 Price: ~$130/pax
📍 <a href=https://www.google.com/maps/search/?api=1&query=1.279512,103.8415799 ></a>
📝 Reviews:
1. <a href=https://www.burpple.com/f/R6qr9qKk >Best Meal I've Had So Far</a>

🏠 KYUU By Shunsui
<i>Late Night, Japanese, Dinner With Drinks</i>
✨ Avg Rating: 4
💸 Price: ~$130/pax
📍 <a href=https://www.google.com/maps/search/?api=1&query=1.2799799,103.841516 ></a>
📝 Reviews:
1. <a href=https://www.burpple.com/f/bAd77B8k >Do come here if you love your ikura!</a>

🏠 Teppei Japanese Restaurant (Orchid Hotel)
<i>Sushi, Chirashi, Seafood, Date Night, Japanese</i>
✨ Avg Rating: 5
💸 Price: ~$100/pax
📍 <a href=https://www.google.com/maps/search/?api=1&query=1.276993,103.843891 ></a>
📝 Reviews:
1. <a href=https://www.burpple.com/f/PWJbmM5Z >Very Good Omakase, Worth The Price</a>

Conclusion

From the walkthrough above, you can see how the LLM is able to break down tasks and use the tools in its arsenal appropriately, just like Doraemon does in the anime series. I believe agents can be helpful in many ways, such as condensing multi-tool and multi-modal platforms into a conversational bot where users simply upload their data and the agent decides which tools to use based on the data type and its understanding of the task provided by the user.

ReAct-type agents are just one of the many types of agents we can create and explore. I have seen agents that learn to thrive in a game, and multiple agents with differentiated roles working together in a simulated company. There is much to explore, and I hope I have inspired you to do so. Happy coding!

Credits to:

Doraemon GIF: https://tenor.com/en-SG/view/find-out-lost-gif-25362727

Additional materials:

Full Python Notebook: https://github.com/jasonngap1/llm-agent-example.git

My previous article on document QA (RAG): https://medium.com/@jasonisveryhappy/document-qa-using-large-language-models-llms-933b73c9df8f

My food review bot: https://t.me/jasonthefoodie_bot
