LlamaIndex 🦙 Q&A over your data using Amazon Bedrock and Streamlit
Semantic Search using Retrieval Augmented Generation (RAG) with LlamaIndex 🦙
Mar 8, 2024 - content updated following the LlamaIndex 🦙 v0.10.0 release.
Have you ever wanted to ask questions about your own data and get quick, accurate answers? Well, now you can, with the power of LlamaIndex 🦙! In this post, we'll show you how to build a simple question-answering system over your data using Amazon Bedrock and Streamlit.
Specifically, we'll demonstrate Retrieval Augmented Generation (RAG) using LlamaIndex 🦙, Amazon Bedrock, and Streamlit. LlamaIndex has native integrations with Amazon Bedrock for both Large Language Models (LLMs) and embedding models.
We'll walk through a code sample that uses Streamlit to build a simple web user interface. In just a few lines of Python code, you can set up a question-answering system tailored to your data. No deep learning expertise is required!
Let's dive in and see how easy it is to start asking questions over your data.
Retrieval Augmented Generation (RAG)
Let's begin by exploring the fundamental concepts of Retrieval Augmented Generation (RAG) and its stages, drawing from LlamaIndex's excellent High-Level Concepts write-up.
High-Level Concepts
Stages within RAG
LlamaIndex describes five key stages within RAG: Loading, Indexing, Storing, Querying, and Evaluation. The app we build below covers the first four: it loads documents from a local folder, indexes them with Bedrock embeddings, persists the index to disk, and queries it through a chat UI.
Reference: https://docs.llamaindex.ai/en/latest/getting_started/concepts.html
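In practice, these stages map almost one-to-one onto LlamaIndex calls. Here's a minimal sketch (illustrative only; the LLM and embedding model configuration is omitted here, since Step 2 below wires in Amazon Bedrock via Settings):

# rag_sketch.py - illustrative only
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Loading: read documents from a local ./data folder
documents = SimpleDirectoryReader("data").load_data()

# Indexing (and in-memory Storing): embed documents into a vector index
index = VectorStoreIndex.from_documents(documents)

# Querying: retrieve relevant chunks and have the LLM synthesize an answer
query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))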
Step 1 - Install dependencies
requirements.txt
# requirements.txt
boto3
llama-index
llama-index-llms-bedrock
llama-index-embeddings-bedrock
pip install -r requirements.txt
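Amazon Bedrock calls are authenticated through the standard AWS credential chain (environment variables, ~/.aws/credentials, or an IAM role), and the models you call must be enabled in the Bedrock console. As a quick sanity check, here's a small optional snippet, assuming your credentials and region are already configured, that lists the foundation models visible to your account:

# check_bedrock.py - optional sanity check, not part of the app
import boto3

# Uses the default AWS credential chain; change region_name if needed
bedrock = boto3.client("bedrock", region_name="us-east-1")

# Print the model IDs your account can access
for summary in bedrock.list_foundation_models()["modelSummaries"]:
    print(summary["modelId"])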
Step 2 - Streamlit Run
File structure:
- main.py
- /data folder (contains your data)
main.py
# main.py
import os

import streamlit as st
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.settings import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding

# ------------------------------------------------------------------------
# LlamaIndex - Amazon Bedrock

# Configure the Bedrock LLM and embedding model as global defaults
llm = Bedrock(model="anthropic.claude-v2")
embed_model = BedrockEmbedding(model="amazon.titan-embed-text-v1")
Settings.llm = llm
Settings.embed_model = embed_model
# ------------------------------------------------------------------------
# Streamlit

# Page title
st.set_page_config(page_title='LlamaIndex Q&A over your data')

# Clear Chat History function
def clear_screen():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

with st.sidebar:
    st.title('LlamaIndex 🦙')
    st.subheader('Q&A over your data')
    st.markdown('[Amazon Bedrock](https://aws.amazon.com/bedrock/) - The easiest way to build and scale generative AI applications with foundation models')
    st.divider()
    streaming_on = st.toggle('Streaming')
    st.button('Clear Screen', on_click=clear_screen)
@st.cache_resource(show_spinner=False)
def load_data():
    with st.spinner(text="Loading and indexing your data. This may take a while..."):
        PERSIST_DIR = "storage"
        # Check if persisted storage already exists
        if not os.path.exists(PERSIST_DIR):
            # Load the documents and create the index
            documents = SimpleDirectoryReader(input_dir="data", recursive=True).load_data()
            index = VectorStoreIndex.from_documents(documents)
            # Persist the index to disk for reuse across sessions
            index.storage_context.persist(persist_dir=PERSIST_DIR)
        else:
            # Load the existing index
            storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
            index = load_index_from_storage(storage_context)
        return index

# Create Index
index = load_data()
# Store LLM generated responses
if "messages" not in st.session_state:
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

# Display chat messages
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.write(message["content"])

# Chat Input - User Prompt
if prompt := st.chat_input():
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.write(prompt)

    if streaming_on:
        # Query Engine - Streaming
        query_engine = index.as_query_engine(streaming=True)
        with st.chat_message("assistant"):
            placeholder = st.empty()
            full_response = ''
            streaming_response = query_engine.query(prompt)
            # Update the placeholder as tokens arrive
            for chunk in streaming_response.response_gen:
                full_response += chunk
                placeholder.markdown(full_response)
            placeholder.markdown(full_response)
            st.session_state.messages.append({"role": "assistant", "content": full_response})
    else:
        # Query Engine - Query
        query_engine = index.as_query_engine()
        with st.chat_message("assistant"):
            with st.spinner("Thinking..."):
                response = query_engine.query(prompt)
                st.write(response.response)
                st.session_state.messages.append({"role": "assistant", "content": response.response})
streamlit run main.py
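The query engine defaults are a sensible starting point, but retrieval is tunable without touching the rest of the app. As one example sketch, similarity_top_k controls how many chunks are retrieved per question:

# Illustrative tweak, not part of main.py above:
# similarity_top_k sets how many chunks are retrieved per query
# (more chunks give the LLM more context, at higher latency and cost).
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("What foundation models does Amazon Bedrock support?")
print(response.response)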
Streamlit UI
Let's use the Amazon Bedrock User Guide as our dataset and ask questions about it.
Sample Questions and Answers:
Whether you're working with large datasets, customer inquiries, or any other data-driven application, this post has shown how LlamaIndex 🦙, Amazon Bedrock, and Streamlit can empower you to build a powerful question-answering system with ease. Embrace these tools and unlock new possibilities in data exploration and analysis.