LlamaIndex πŸ¦™ Q&A over your data πŸ“‚ using Amazon Bedrock and Streamlit

Semantic Search using Retrieval Augmented Generation (RAG) with LlamaIndex πŸ¦™

David Min
4 min read · Sep 27, 2023
Stable Diffusion AI Art (Stable Diffusion XL)

πŸ‘‰ Mar 8, 2024 β€” content updated for the LlamaIndex πŸ¦™ v0.10.0 release.

πŸ‘‹ Have you ever wanted to ask questions about your own data and get quick, accurate answers? Well, now you can, with the power of LlamaIndex πŸ¦™! In this post, we’ll show you how to build a simple question-answering system over your data πŸ“‚ using Amazon Bedrock and Streamlit.

Specifically, we’ll demonstrate Retrieval Augmented Generation (RAG) using LlamaIndex πŸ¦™, Amazon Bedrock, and Streamlit. LlamaIndex has native integration with Amazon Bedrock, both for Large Language Models (LLMs) and Embeddings models.

We’ll walk through a code sample using Streamlit to build a simple Web user interface. In just a few lines of Python code, you can set up a Question Answering system tailored to your data. No deep learning expertise is required!

πŸš€ Let’s dive in and see how easy it is to start asking questions over your data.

Retrieval Augmented Generation (RAG)

Let’s begin by exploring the fundamental concepts of Retrieval Augmented Generation (RAG) and its stages, drawing from LlamaIndex’s excellent High-Level Concepts write-up.

High-Level Concepts diagram (image from the LlamaIndex documentation)

Stages within RAG diagram (image from the LlamaIndex documentation)

πŸ‘‰ Reference: https://docs.llamaindex.ai/en/latest/getting_started/concepts.html
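Before wiring these stages into a web app, it can help to see them as plain Python. The sketch below is not part of the final app; the ./data folder path and the example question are placeholders. It runs the loading, indexing, and querying stages end to end with the same Bedrock models used later in this post.

# rag_sketch.py - a minimal, standalone illustration of the RAG stages
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.settings import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding

# Loading: read the files under ./data into Document objects
documents = SimpleDirectoryReader("data").load_data()

# Indexing and Storing: embed with Titan Embeddings and build an in-memory vector index
Settings.llm = Bedrock(model="anthropic.claude-v2")
Settings.embed_model = BedrockEmbedding(model="amazon.titan-embed-text-v1")
index = VectorStoreIndex.from_documents(documents)

# Querying: retrieve the most relevant chunks and let Claude synthesize an answer
response = index.as_query_engine().query("What is this document about?")  # example question
print(response)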

Step 1 β€” Install dependencies

requirements.txt

# requirements.txt
boto3
llama-index
llama-index-llms-bedrock
llama-index-embeddings-bedrock
Install the dependencies:

pip install -r requirements.txt
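The Bedrock integrations call AWS through boto3, so your environment also needs AWS credentials and access to the Bedrock models used in this post (Anthropic Claude and Titan Embeddings). If you want a quick sanity check that your credentials and region can reach Bedrock, an optional script along these lines works; the region value is just an example.

# check_bedrock.py - optional sanity check, assumes AWS credentials are already configured
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")  # example region
model_ids = [m["modelId"] for m in bedrock.list_foundation_models()["modelSummaries"]]
print("anthropic.claude-v2" in model_ids, "amazon.titan-embed-text-v1" in model_ids)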

Step 2 β€” Streamlit Run

File structure:

  • main.py
  • /data folder πŸ“‚ contains your data

main.py

# main.py

import os
import streamlit as st
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.settings import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding, Models

# ------------------------------------------------------------------------
# LlamaIndex - Amazon Bedrock

llm = Bedrock(model="anthropic.claude-v2")
embed_model = BedrockEmbedding(model="amazon.titan-embed-text-v1")

Settings.llm = llm
Settings.embed_model = embed_model

# ------------------------------------------------------------------------
# Streamlit

# Page title
st.set_page_config(page_title='LlamaIndex Q&A over your data πŸ“‚')

# Clear Chat History function
def clear_screen():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

with st.sidebar:
    st.title('LlamaIndex πŸ¦™')
    st.subheader('Q&A over your data πŸ“‚')
    st.markdown('[Amazon Bedrock](https://aws.amazon.com/bedrock/) - The easiest way to build and scale generative AI applications with foundation models')
    st.divider()
    streaming_on = st.toggle('Streaming')
    st.button('Clear Screen', on_click=clear_screen)

@st.cache_resource(show_spinner=False)
def load_data():
    with st.spinner(text="Loading and indexing your data. This may take a while..."):
        PERSIST_DIR = "storage"
        # check if persistent storage already exists
        if not os.path.exists(PERSIST_DIR):
            # load the documents and create the index
            documents = SimpleDirectoryReader(input_dir="data", recursive=True).load_data()
            index = VectorStoreIndex.from_documents(documents)
            # persist the index to disk for future runs
            index.storage_context.persist(persist_dir=PERSIST_DIR)
        else:
            # load the existing index
            storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
            index = load_index_from_storage(storage_context)
    return index

# Create Index
index = load_data()

# Store LLM generated responses
if "messages" not in st.session_state.keys():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

# Display chat messages from history
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.write(message["content"])

# Chat Input - User Prompt
if prompt := st.chat_input():
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.write(prompt)

    if streaming_on:
        # Query Engine - Streaming
        query_engine = index.as_query_engine(streaming=True)
        with st.chat_message("assistant"):
            placeholder = st.empty()
            full_response = ''
            streaming_response = query_engine.query(prompt)
            for chunk in streaming_response.response_gen:
                full_response += chunk
                placeholder.markdown(full_response)
            placeholder.markdown(full_response)
        st.session_state.messages.append({"role": "assistant", "content": full_response})
    else:
        # Query Engine - Query
        query_engine = index.as_query_engine()
        with st.chat_message("assistant"):
            with st.spinner("Thinking..."):
                response = query_engine.query(prompt)
                st.write(response.response)
        st.session_state.messages.append({"role": "assistant", "content": response.response})
Run the app:

streamlit run main.py
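The Bedrock wrappers and the query engine accept additional arguments if you need them. For example, if your Bedrock access lives in a named AWS profile or a specific region, or you want more context chunks retrieved per question, something like the sketch below should work; it is based on the LlamaIndex Bedrock integrations, and the profile and region values are placeholders, not part of the original app.

# Optional tweaks (placeholder profile/region values)
llm = Bedrock(
    model="anthropic.claude-v2",
    profile_name="my-profile",  # assumed AWS CLI profile name
    region_name="us-west-2",    # example region
    temperature=0.1,
)
embed_model = BedrockEmbedding(
    model="amazon.titan-embed-text-v1",
    profile_name="my-profile",
    region_name="us-west-2",
)

# Retrieve more chunks per question than the default (2)
query_engine = index.as_query_engine(similarity_top_k=5)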

Streamlit UI

LlamaIndex Streamlit UI

Let’s use the Amazon Bedrock User Guide as the data source and ask questions about it.

Sample Questions and Answers:

LlamaIndex πŸ¦™ Q&A over your data πŸ“‚ β€” Sample Output

Whether you’re working with large document sets, customer inquiries, or any other data-driven application, LlamaIndex πŸ¦™, Amazon Bedrock, and Streamlit let you build a capable question-answering system in just a few lines of Python. Give these tools a try and unlock new possibilities in data exploration and analysis.

Useful Links

  • LlamaIndex High-Level Concepts: https://docs.llamaindex.ai/en/latest/getting_started/concepts.html
  • Amazon Bedrock: https://aws.amazon.com/bedrock/
  • Streamlit: https://streamlit.io/
