LlamaIndex πŸ¦™ Q&A over your data πŸ“‚ using Amazon Bedrock and Streamlit

Semantic Search using Retrieval Augmented Generation (RAG) with LlamaIndex πŸ¦™

David Min
4 min read · Sep 27, 2023
Stable Diffusion AI Art (Stable Diffusion XL)

πŸ‘‰ Mar 8, 2024 β€” content updated for the LlamaIndex πŸ¦™ v0.10.0 release.

πŸ‘‹ Have you ever wanted to ask questions about your own data and get quick, accurate answers? Well, now you can, with the power of LlamaIndex πŸ¦™! In this post, we’ll show you how to build a simple question-answering system over your data πŸ“‚ using Amazon Bedrock and Streamlit.

Specifically, we’ll demonstrate Retrieval Augmented Generation (RAG) using LlamaIndex πŸ¦™, Amazon Bedrock, and Streamlit. LlamaIndex has native integration with Amazon Bedrock, both for Large Language Models (LLMs) and Embeddings models.

We’ll walk through a code sample using Streamlit to build a simple Web user interface. In just a few lines of Python code, you can set up a Question Answering system tailored to your data. No deep learning expertise is required!

πŸš€ Let’s dive in and see how easy it is to start asking questions over your data.

Retrieval Augmented Generation (RAG)

Let’s begin by exploring the fundamental concepts of Retrieval Augmented Generation (RAG) and its stages, drawing from LlamaIndex’s excellent High-Level Concepts write-up.

High-Level Concepts diagram (image from the LlamaIndex documentation)

Stages within RAG diagram (image from the LlamaIndex documentation)

πŸ‘‰ Reference: https://docs.llamaindex.ai/en/latest/getting_started/concepts.html
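Before wiring these stages into a web app, it can help to see them as plain Python. The sketch below is not part of the final app; the ./data folder path and the example question are placeholders. It runs the loading, indexing, and querying stages end to end with the same Bedrock models used later in this post.

# rag_sketch.py - a minimal, standalone illustration of the RAG stages
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.settings import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding

# Loading: read the files under ./data into Document objects
documents = SimpleDirectoryReader("data").load_data()

# Indexing and Storing: embed with Titan Embeddings and build an in-memory vector index
Settings.llm = Bedrock(model="anthropic.claude-v2")
Settings.embed_model = BedrockEmbedding(model="amazon.titan-embed-text-v1")
index = VectorStoreIndex.from_documents(documents)

# Querying: retrieve the most relevant chunks and let Claude synthesize an answer
response = index.as_query_engine().query("What is this document about?")  # example question
print(response)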

Step 1 β€” Install dependencies

requirements.txt

# requirements.txt
boto3
llama-index
llama-index-llms-bedrock
llama-index-embeddings-bedrock
Install the dependencies:

pip install -r requirements.txt
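The Bedrock integrations call AWS through boto3, so your environment also needs AWS credentials and access to the Bedrock models used in this post (Anthropic Claude and Titan Embeddings). If you want a quick sanity check that your credentials and region can reach Bedrock, an optional script along these lines works; the region value is just an example.

# check_bedrock.py - optional sanity check, assumes AWS credentials are already configured
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")  # example region
model_ids = [m["modelId"] for m in bedrock.list_foundation_models()["modelSummaries"]]
print("anthropic.claude-v2" in model_ids, "amazon.titan-embed-text-v1" in model_ids)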

Step 2 β€” Streamlit Run

File structure:

  • main.py
  • /data folder πŸ“‚ contains your data

main.py

# main.py

import os
import streamlit as st
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.settings import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding, Models

# ------------------------------------------------------------------------
# LlamaIndex - Amazon Bedrock

llm = Bedrock(model="anthropic.claude-v2")
embed_model = BedrockEmbedding(model="amazon.titan-embed-text-v1")

Settings.llm = llm
Settings.embed_model = embed_model

# ------------------------------------------------------------------------
# Streamlit

# Page title
st.set_page_config(page_title='LlamaIndex Q&A over your data πŸ“‚')

# Clear Chat History function
def clear_screen():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

with st.sidebar:
    st.title('LlamaIndex πŸ¦™')
    st.subheader('Q&A over your data πŸ“‚')
    st.markdown('[Amazon Bedrock](https://aws.amazon.com/bedrock/) - The easiest way to build and scale generative AI applications with foundation models')
    st.divider()
    streaming_on = st.toggle('Streaming')
    st.button('Clear Screen', on_click=clear_screen)

@st.cache_resource(show_spinner=False)
def load_data():
    with st.spinner(text="Loading and indexing your data. This may take a while..."):
        PERSIST_DIR = "storage"
        # check if persistent storage already exists
        if not os.path.exists(PERSIST_DIR):
            # load the documents and create the index
            documents = SimpleDirectoryReader(input_dir="data", recursive=True).load_data()
            index = VectorStoreIndex.from_documents(documents)
            # persist the index to disk for future runs
            index.storage_context.persist(persist_dir=PERSIST_DIR)
        else:
            # load the existing index
            storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
            index = load_index_from_storage(storage_context)
    return index

# Create Index
index = load_data()

# Store LLM generated responses
if "messages" not in st.session_state.keys():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

# Display chat messages from history
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.write(message["content"])

# Chat Input - User Prompt
if prompt := st.chat_input():
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.write(prompt)

    if streaming_on:
        # Query Engine - Streaming
        query_engine = index.as_query_engine(streaming=True)
        with st.chat_message("assistant"):
            placeholder = st.empty()
            full_response = ''
            streaming_response = query_engine.query(prompt)
            for chunk in streaming_response.response_gen:
                full_response += chunk
                placeholder.markdown(full_response)
            placeholder.markdown(full_response)
        st.session_state.messages.append({"role": "assistant", "content": full_response})
    else:
        # Query Engine - Query
        query_engine = index.as_query_engine()
        with st.chat_message("assistant"):
            with st.spinner("Thinking..."):
                response = query_engine.query(prompt)
                st.write(response.response)
        st.session_state.messages.append({"role": "assistant", "content": response.response})
Run the app:

streamlit run main.py
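The Bedrock wrappers and the query engine accept additional arguments if you need them. For example, if your Bedrock access lives in a named AWS profile or a specific region, or you want more context chunks retrieved per question, something like the sketch below should work; it is based on the LlamaIndex Bedrock integrations, and the profile and region values are placeholders, not part of the original app.

# Optional tweaks (placeholder profile/region values)
llm = Bedrock(
    model="anthropic.claude-v2",
    profile_name="my-profile",  # assumed AWS CLI profile name
    region_name="us-west-2",    # example region
    temperature=0.1,
)
embed_model = BedrockEmbedding(
    model="amazon.titan-embed-text-v1",
    profile_name="my-profile",
    region_name="us-west-2",
)

# Retrieve more chunks per question than the default (2)
query_engine = index.as_query_engine(similarity_top_k=5)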

Streamlit UI

LlamaIndex Streamlit UI

Let’s use the Amazon Bedrock User Guide as the data source and ask questions about it.

Sample Questions and Answers:

LlamaIndex πŸ¦™ Q&A over your data πŸ“‚ β€” Sample Output

Whether you’re working with large document sets, customer inquiries, or any other data-driven application, LlamaIndex πŸ¦™, Amazon Bedrock, and Streamlit let you build a capable question-answering system in just a few lines of Python. Give these tools a try and unlock new possibilities in data exploration and analysis.

Useful Links

  • LlamaIndex High-Level Concepts: https://docs.llamaindex.ai/en/latest/getting_started/concepts.html
  • Amazon Bedrock: https://aws.amazon.com/bedrock/
  • Streamlit: https://streamlit.io/
