Building your own chatbot on AWS with Generative AI

Rohit Vincent
Published in Version 1
4 min read · Jul 7, 2023
Generated with Midjourney

While Azure provides various options for building custom chatbots, Amazon Web Services (AWS) also offers compelling solutions. On AWS, you can build your own chatbot using Amazon Kendra together with a language model of your choice (you can even host the model yourself with SageMaker!). Just as Azure offers Azure Cognitive Search and OpenAI, Amazon has been actively working on its own chatbot development tools to cater to diverse use cases and give users flexibility.

Amazon has published a blog post on building a chatbot using Retrieval Augmented Generation (RAG). What is RAG? RAG is a technique that combines retrieval-based approaches with generative models to produce more accurate and context-aware responses. It retrieves relevant information from a knowledge base or document collection, passes it as context along with the user’s input, and then generates a response using a language model. Because of the retrieval step, the model can draw on specific information from the company’s data, so its responses stay close to the available knowledge and context. This approach is particularly useful when the data is vast and the language model’s input prompt has to stay short because of length limits.
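To make the idea concrete, here is a minimal sketch of the two RAG steps described above: pick the passages most relevant to the question, then pack them into a prompt without exceeding a length budget. Everything in it is an assumption for illustration: the word-overlap scoring stands in for a real search service such as Kendra, and the function names, the character budget, and the sample passages are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever: rank passages by the number of words shared with the question."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(p)), p) for p in passages]
    scored = [(score, p) for score, p in scored if score > 0]  # drop unrelated passages
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_prompt(question: str, context: list[str], max_chars: int = 500) -> str:
    """Add retrieved passages to the prompt until the character budget is used up."""
    kept, used = [], 0
    for passage in context:
        if used + len(passage) > max_chars:
            break
        kept.append(passage)
        used += len(passage)
    joined = "\n".join(f"- {p}" for p in kept)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}\nAnswer:"


# Hypothetical sample data, not taken from any real index.
passages = [
    "Kendra is a managed search service on AWS.",
    "Streamlit is a Python library for building web apps.",
    "RAG retrieves documents and passes them to a language model as context.",
]
question = "What does RAG pass to the language model?"
top = retrieve(question, passages)
prompt = build_prompt(question, top)
print(prompt)
```

Only the third passage shares words with the question, so it is the only one that ends up in the prompt. A real application would send `prompt` to a language model instead of printing it.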

How do you build your own chatbot with AWS?

The process flow from a user’s perspective is as follows:

  • The user sends a request to the GenAI app.
  • The GenAI app queries the Amazon Kendra index using the user request.
  • The Amazon Kendra index returns search results with relevant document excerpts.
  • The GenAI app sends the user request, along with the data retrieved from the index as context, to the LLM.
  • The LLM generates a concise response based on the user request and the retrieved data.
  • The response from the LLM is sent back to the user.
The architecture of a GenAI application with a RAG approach (Source)
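The flow above can be sketched in a few lines of Python. This is a rough sketch under stated assumptions, not the code from the Amazon blog or the repo: the function names, the prompt wording, and the stub classes are hypothetical, and the stubs exist only so the example runs without AWS credentials. The one real API it mirrors is the Kendra `query` call in boto3, which takes `IndexId` and `QueryText` and returns a `ResultItems` list whose entries can carry a `DocumentExcerpt` with a `Text` field.

```python
from typing import Callable


def kendra_excerpts(client, index_id: str, question: str, top_k: int = 3) -> list[str]:
    """Query the index with the user request and collect the document excerpts."""
    response = client.query(IndexId=index_id, QueryText=question)
    excerpts = []
    for item in response.get("ResultItems", [])[:top_k]:
        text = item.get("DocumentExcerpt", {}).get("Text", "")
        if text:  # skip results that carry no excerpt
            excerpts.append(text)
    return excerpts


def make_prompt(question: str, excerpts: list[str]) -> str:
    """Combine the user request and the retrieved excerpts into one prompt."""
    context = "\n".join(f"- {e}" for e in excerpts)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer concisely:"


def answer(question: str, client, index_id: str, llm: Callable[[str], str]) -> str:
    """User request -> Kendra query -> excerpts as context -> LLM -> response."""
    excerpts = kendra_excerpts(client, index_id, question)
    return llm(make_prompt(question, excerpts))


class StubKendra:
    """Hypothetical stand-in for a Kendra client; returns canned results."""

    def query(self, *, IndexId: str, QueryText: str) -> dict:
        return {
            "ResultItems": [
                {"DocumentExcerpt": {"Text": "Example excerpt returned by the index."}},
                {"Type": "DOCUMENT"},  # a result without an excerpt is ignored
            ]
        }


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call."""
    return "stub reply"


reply = answer("What is in the document?", StubKendra(), "my-index-id", stub_llm)
print(reply)
```

To try this against a real index, you would pass `boto3.client("kendra")` and your own index ID in place of the stub client, and replace `stub_llm` with a function that calls the model you are using.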

The above architecture shows a typical flow of data and response within the application. To create your own:

  • Deploy Amazon Kendra on AWS and create an index for the data you want to ask questions about. (I used the LLaMA technical paper from Meta as the data for the screenshot below.)
Kendra Deployment Screenshot
  • Clone the following repo, which lets you use Kendra with OpenAI, Anthropic Claude, Flan-XL, or Flan-XXL, and follow its installation instructions. [Kendra+LLM Repo]
  • Set your environment variables for the model you are using, along with your AWS details.
Screenshot from Readme
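Before launching the app, it can help to check that the variables are actually set. The snippet below is only a sketch: the variable names (`AWS_REGION`, `KENDRA_INDEX_ID`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`) are assumptions made for illustration, so confirm the exact names in the repo's README before relying on them. The helper `missing_vars` is likewise hypothetical.

```python
import os
from typing import Mapping

# Assumed variable names; confirm them against the repo's README.
COMMON_VARS = ("AWS_REGION", "KENDRA_INDEX_ID")
PROVIDER_VARS = {
    "openai": ("OPENAI_API_KEY",),
    "anthropic": ("ANTHROPIC_API_KEY",),
}


def missing_vars(provider: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty for a provider."""
    required = COMMON_VARS + PROVIDER_VARS.get(provider, ())
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars("openai")
    if missing:
        print("Set these variables first:", ", ".join(missing))
    else:
        print("Environment looks complete for openai.")
```

Passing the environment in as a mapping keeps the check easy to test with a plain dictionary instead of the real process environment.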

Run the app with the command streamlit run app.py <LLM>, where <LLM> identifies the model you want to use. For example, for OpenAI:

streamlit run app.py openai

or for Anthropic:

streamlit run app.py anthropic

I tested the app with OpenAI and Flan-XXL. The Anthropic code path has some issues at the moment, so I will check it out in the future.

Answer Provided using Kendra+OpenAI
Answer Provided using Kendra+FLANXXL

Here is an example of why models such as OpenAI’s GPT-3 can be a better fit in such scenarios than FLAN-XXL. I asked a question about toxicity based on the following paragraph from the LLaMA paper.

Page 8 of the LLaMA paper

OpenAI’s GPT-3 got it right, whereas Google’s FLAN-XXL got it wrong. Here are the screenshots:

Answer from FLAN-XXL
Answer from OpenAI

In conclusion, AWS offers a range of powerful options for building your own chatbot. By combining Amazon Kendra with a language model such as those from OpenAI, you can create a chatbot that provides accurate and context-aware responses. The Retrieval Augmented Generation (RAG) technique lets the chatbot retrieve relevant information from a knowledge base, use it as context, and generate responses with a language model. By following the process flow outlined above and deploying the necessary components, you can create an effective chatbot on top of AWS services.

If you’ve reached this point, thank you for reading! Your engagement and support are greatly appreciated as we strive to keep you informed about interesting developments in the AI world and from Version 1 AI Labs. Please 🔔 clap or follow to stay updated.

About the Author:
Rohit Vincent is a Data Scientist at the Innovation & AI Labs at Version 1.
