Building Real-Time Financial News RAG Chatbot with Gemini and Qdrant

Mar 29, 2024

Introduction

Most of us have invested time and effort in real estate, mutual funds, bonds, and more. Before investing, we look for a suitable banking partner. For long-term investments, we research which banks are performing well, have stayed clear of scams and fraud, and offer reasonable interest rates and returns. To do that, we need to stay up to date with financial news.

What if you had at your disposal a real-time financial news chatbot that could provide you with all the news related to finance and economics? Does that sound interesting? Retrieval Augmented Generation (RAG) has now made this possible. We can leverage large language models and vector databases to get our queries answered.

Let’s create a real-time financial news RAG chatbot and see if it accurately answers questions using available data. The chatbot becomes real-time when the vector database is continuously fed with the latest news.

Real-Time Financial News RAG Chatbot Using Gemini

As our starting point, we took the Indian Financial News dataset. This dataset consists of financial news related to Indian banks and covers news up to 26 May 2020.

Before we get started, let’s install the required dependencies.

%pip install -q llama-index 'google-generativeai>=0.3.0' qdrant_client llama-index-vector-stores-qdrant llama-index-embeddings-fastembed fastembed llama-index-llms-gemini gradio

Preparing the Node

The dataset is a CSV file stored in a folder named Dataset, so let’s load it with SimpleDirectoryReader.

from llama_index.core import SimpleDirectoryReader
docs = SimpleDirectoryReader("Dataset").load_data()
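SimpleDirectoryReader reads the CSV as ordinary text. If you would rather have one record per headline, with the date kept as metadata, a minimal standard-library sketch could look like the one below. The column names `Date`, `Title`, and `Description` are assumptions about the CSV layout, and `read_news_rows` is a hypothetical helper, so adjust both to your file.

```python
import csv
import io

def read_news_rows(csv_file, title_col="Title", body_col="Description", date_col="Date"):
    """Yield one dict per CSV row, with text and metadata kept separate.

    The column names are assumptions; change them to match your file.
    """
    reader = csv.DictReader(csv_file)
    for row in reader:
        title = (row.get(title_col) or "").strip()
        body = (row.get(body_col) or "").strip()
        if not title and not body:
            continue  # skip empty rows
        yield {
            "text": f"{title}\n{body}".strip(),
            "metadata": {"date": (row.get(date_col) or "").strip()},
        }

# Tiny in-memory sample standing in for the real dataset file
sample = io.StringIO(
    "Date,Title,Description\n"
    "2020-05-26,Sample headline,Sample description\n"
    ",,\n"
)
rows = list(read_news_rows(sample))
print(rows)
```

With a real file, you would pass `open("Dataset/your-file.csv", newline="", encoding="utf-8")` instead of the in-memory sample.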

All the documents are ready; now we will split their text into chunks of a defined size.

from llama_index.core.node_parser.text import SentenceSplitter
# Initialize the SentenceSplitter with a specific chunk size
text_parser = SentenceSplitter(chunk_size=1024)
text_chunks = [] # This will hold all the chunks of text from all documents
doc_idxs = [] # This will keep track of the document each chunk came from
for doc_idx, doc in enumerate(docs):
 # Split the current document's text into chunks
 cur_text_chunks = text_parser.split_text(doc.text)
 
 # Extend the list of all text chunks with the chunks from the current document
 text_chunks.extend(cur_text_chunks)
 
 # Extend the document index list with the index of the current document, repeated for each chunk
 doc_idxs.extend([doc_idx] * len(cur_text_chunks))
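To get a feel for what the splitter is doing, here is a simplified sketch of sentence-aware chunking. It is not the SentenceSplitter implementation: it counts words instead of tokens, has no overlap, and `split_into_chunks` is a hypothetical helper written only for illustration.

```python
import re

def split_into_chunks(text, max_words=50):
    """Greedily pack whole sentences into chunks of at most max_words words.

    A simplification: the real splitter counts tokens and supports overlap.
    A single sentence longer than max_words becomes its own oversized chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget
        if current and current_len + n_words > max_words:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks

# Same bookkeeping as in the tutorial: remember which document each chunk came from
demo_docs = ["One two three. Four five six. Seven eight nine.", "Ten eleven."]
demo_chunks, demo_doc_idxs = [], []
for i, doc_text in enumerate(demo_docs):
    pieces = split_into_chunks(doc_text, max_words=6)
    demo_chunks.extend(pieces)
    demo_doc_idxs.extend([i] * len(pieces))
print(demo_chunks, demo_doc_idxs)
```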

Then, we will create a text node object and assign the metadata to it. We will store all the nodes in one node list.

from llama_index.core.schema import TextNode
nodes = [] # This will hold all TextNode objects created from the text chunks
# Iterate over each text chunk and its index
for idx, text_chunk in enumerate(text_chunks):
 # Create a TextNode object with the current text chunk
 node = TextNode(text=text_chunk)
 
 # Retrieve the source document using the current index mapped through doc_idxs
 src_doc = docs[doc_idxs[idx]]
 
 # Assign the source document's metadata to the node's metadata attribute
 node.metadata = src_doc.metadata
 
 # Append the newly created node to the list of nodes
 nodes.append(node)
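One detail worth knowing: assigning a dictionary does not copy it, so nodes that receive the same metadata dictionary may end up sharing it with the source document. The sketch below shows the effect with a stand-in class (`FakeNode` is hypothetical and is not the LlamaIndex TextNode). If you plan to add per-chunk metadata later, copying the dictionary first is the safer choice.

```python
from dataclasses import dataclass, field

@dataclass
class FakeNode:
    """Stand-in for a node object; not the LlamaIndex TextNode class."""
    text: str
    metadata: dict = field(default_factory=dict)

doc_metadata = {"file_name": "news.csv"}

shared = FakeNode("chunk A")
shared.metadata = doc_metadata          # same dict object as the document's
copied = FakeNode("chunk B")
copied.metadata = dict(doc_metadata)    # shallow copy, independent of the document

shared.metadata["chunk_id"] = 0         # also changes doc_metadata
copied.metadata["chunk_id"] = 1         # changes only this node

print(doc_metadata)
print(copied.metadata)
```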

Initializing the Qdrant Vector Store

To store the nodes, we need a vector store. Here, we have chosen Qdrant. Qdrant is a high-performance vector database written in Rust. It uses the HNSW algorithm for fast approximate nearest neighbor search, and it lets you attach a payload to each vector and filter results by payload values through an easy-to-use API. It also supports Docker installation, in-memory storage of vectors, cloud-native deployment, and horizontal scaling, and it implements dynamic query planning and payload data indexing.
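To make "nearest neighbor search with a payload filter" concrete, here is a brute-force sketch in plain Python. It is not how Qdrant works internally: HNSW avoids scanning every vector, while this version scores all of them. The `search` function, the point layout, and the `bank` payload field are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(points, query_vector, top_k=2, must_match=None):
    """Return the top_k points most similar to query_vector.

    must_match is an optional dict of payload key/value pairs that a point
    has to satisfy before it is scored.
    """
    must_match = must_match or {}
    candidates = [
        p for p in points
        if all(p["payload"].get(k) == v for k, v in must_match.items())
    ]
    scored = sorted(
        candidates,
        key=lambda p: cosine(p["vector"], query_vector),
        reverse=True,
    )
    return scored[:top_k]

points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"bank": "A"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"bank": "B"}},
    {"id": 3, "vector": [0.0, 1.0], "payload": {"bank": "A"}},
]
hits = search(points, [1.0, 0.0], top_k=2, must_match={"bank": "A"})
print([p["id"] for p in hits])
```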

First, we’ll create a collection in the vector store index.

import qdrant_client
from llama_index.core import Settings, StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Create a local Qdrant vector store
client = qdrant_client.QdrantClient(path="financialnews")
vector_store = QdrantVectorStore(client=client, collection_name="collection")

FastEmbed Embeddings and Gemini Text Model

The vector store and nodes are ready, but the vector store does not accept raw nodes directly. Each node needs an embedding, and for embeddings we use the FastEmbed model BAAI/bge-small-en-v1.5. For text generation, we’ll be leveraging Gemini, which is a very capable family of multimodal models. Built on the transformer architecture and trained on TPUs, the Gemini model excels in summarization, reading comprehension tasks with per-task fine-tuning, multilinguality, long context, coding, complex reasoning, mathematics, and of course, multimodality.

First, we’ll set the Google API key, which you can obtain from Google AI Studio.

import os

GOOGLE_API_KEY = "your-api-key"  # add your Google API key here
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
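Hardcoding a key in a notebook makes it easy to leak when the notebook is shared. As an alternative, here is a minimal sketch that reads the key from the environment and prompts for it only when it is missing. `load_api_key` is a hypothetical helper, not part of any library used here.

```python
import os
from getpass import getpass

def load_api_key(var_name="GOOGLE_API_KEY"):
    """Return the key from the environment, prompting only if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        # getpass hides the typed value so it does not end up in notebook output
        key = getpass(f"Enter {var_name}: ").strip()
        os.environ[var_name] = key
    return key
```

Calling `load_api_key()` once near the top of the notebook leaves `GOOGLE_API_KEY` set in `os.environ` for the rest of the session.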

Now we will generate an embedding for each node with the FastEmbed embedding model, then register the embedding model and the Gemini LLM in LlamaIndex’s Settings. The API key is needed only for Gemini; FastEmbed runs locally.

from llama_index.embeddings.fastembed import FastEmbedEmbedding
from llama_index.llms.gemini import Gemini

# Embed every node locally with FastEmbed
embed_model = FastEmbedEmbedding(model_name="BAAI/bge-small-en-v1.5")
for node in nodes:
    node_embedding = embed_model.get_text_embedding(
        node.get_content(metadata_mode="all")
    )
    node.embedding = node_embedding

# Register the embedding model, the LLM, and the splitter globally
Settings.embed_model = embed_model
Settings.llm = Gemini(model="models/gemini-pro")
Settings.transformations = [SentenceSplitter(chunk_size=1024)]

# Build the index on top of the Qdrant vector store
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex(
    nodes=nodes,
    storage_context=storage_context,
    transformations=Settings.transformations,
)

The storage context wraps the vector store, and the index is built from the nodes on top of it.
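Since the chatbot is only as current as the data in the vector store, keeping it real-time means adding fresh headlines without re-inserting old ones. Below is a minimal sketch of that bookkeeping in plain Python. `headline_id`, `ingest_new`, and the `insert_fn` callback are hypothetical names; the list used as a store is a stand-in for the real index.

```python
import hashlib

def headline_id(headline):
    """Stable ID derived from the normalized headline text."""
    normalized = " ".join(headline.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def ingest_new(headlines, seen_ids, insert_fn):
    """Insert only headlines whose ID is not in seen_ids; return how many were added."""
    added = 0
    for headline in headlines:
        hid = headline_id(headline)
        if hid in seen_ids:
            continue  # already stored, skip the duplicate
        insert_fn(headline)
        seen_ids.add(hid)
        added += 1
    return added

# A plain list stands in for the vector store in this demo
store, seen = [], set()
first = ingest_new(["Bank cuts rates", "RBI eases norms"], seen, store.append)
second = ingest_new(["bank  cuts RATES", "New headline"], seen, store.append)
print(first, second, store)
```

In this tutorial's setup, `insert_fn` could wrap the headline in a TextNode and pass it to the index. We believe `index.insert_nodes([...])` is the call for that, but treat it as an assumption and check it against your LlamaIndex version.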

HyDE Query Transformation

Now, we’ll build the vector query engine from a vector retriever and a response synthesizer. The retriever, a VectorIndexRetriever created from our index, returns the two most similar nodes for a query. The response synthesizer generates a response from an LLM, using the user query and the retrieved text chunks. Its output is a Response object.

from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever
vector_retriever = VectorIndexRetriever(index=index, similarity_top_k=2)
response_synthesizer = get_response_synthesizer()
vector_query_engine = RetrieverQueryEngine(
 retriever=vector_retriever,
 response_synthesizer=response_synthesizer,
)
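Conceptually, the query engine retrieves a few chunks and then asks the LLM to answer from them. The sketch below shows that flow with stub functions. The prompt wording is our own and is not the template LlamaIndex uses; `build_prompt`, `answer`, `fake_retrieve`, and `fake_llm` are hypothetical names.

```python
def build_prompt(question, chunks):
    """Assemble a single prompt from retrieved chunks and the user question."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer(question, retrieve_fn, llm_fn, top_k=2):
    """Retrieve, keep the top_k chunks, and hand the assembled prompt to the LLM."""
    chunks = retrieve_fn(question)[:top_k]
    return llm_fn(build_prompt(question, chunks))

def fake_retrieve(question):
    # Stand-in for the vector retriever; returns chunks ordered by similarity
    return ["chunk one", "chunk two", "chunk three"]

def fake_llm(prompt):
    # Stand-in for the Gemini call; reports how many chunks it was given
    return f"(stub answer based on {prompt.count('[')} chunks)"

result = answer("What happened?", fake_retrieve, fake_llm)
print(result)
```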

We will employ the HyDE query transformer for advanced retrieval. HyDE (Hypothetical Document Embeddings) prompts an instruction-following large language model, zero-shot, to write a hypothetical document that answers the query. The hypothetical document captures relevant text patterns even if its details are wrong. It is then converted into an embedding vector (when several texts are embedded, their vectors are averaged into a single embedding), and that vector is used to find the most similar real documents in the document embedding space. Retrieval still happens; what changes is that we search with the embedding of a hypothetical answer rather than the embedding of the raw question.

Because a hypothetical answer usually resembles the stored news text more closely than a short question does, the HyDE query transformation can help retrieve more relevant chunks.

from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine
hyde = HyDEQueryTransform(include_original=True)
hyde_query_engine = TransformQueryEngine(vector_query_engine, hyde)
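To see why searching with a hypothetical answer can help, here is a toy version of the HyDE flow. It uses a bag-of-words embedding and a stub in place of the LLM, so it only illustrates the idea and is not the LlamaIndex implementation. To our understanding, `include_original=True` keeps the original query alongside the hypothetical document and the embeddings are averaged, which is what the sketch mimics; all function names here are hypothetical.

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy bag-of-words embedding: one count per vocabulary word."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def mean_vector(vectors):
    """Element-wise average of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hyde_retrieve(query, docs, generate_fn, vocab, include_original=True):
    """Embed a hypothetical answer (plus the query, optionally) and return the closest doc."""
    hypothetical = generate_fn(query)
    texts = [hypothetical] + ([query] if include_original else [])
    query_vec = mean_vector([embed(t, vocab) for t in texts])
    return max(docs, key=lambda d: cosine(embed(d, vocab), query_vec))

docs = [
    "bank lowers home loan interest rates",
    "court issues warrants in fraud case",
]
vocab = sorted({w for d in docs for w in d.split()})

def fake_generate(query):
    # Stand-in for an LLM call that writes a plausible answer passage
    return "the bank lowers interest rates on home loan products"

best = hyde_retrieve("cheaper mortgage", docs, fake_generate, vocab)
print(best)
```

Note that the raw query "cheaper mortgage" shares no words with either document, so on its own it would match nothing in this toy setup; the hypothetical passage is what bridges the gap.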

Leveraging Gradio UI for Chatbot Implementation

For deploying a chatbot, we will use Gradio.

import gradio as gr

def queries(query_str):
    # Run the query through the HyDE-wrapped query engine
    response = hyde_query_engine.query(query_str)
    return str(response)

gr.close_all()
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("# Welcome to Gemini-Powered Stock Predictor RAG Chatbot!")
    chatbot = gr.Chatbot()
    msg = gr.Textbox()
    clear = gr.ClearButton([msg, chatbot])

    def respond(message, chat_history):
        # Append the (user, bot) pair to the history and clear the textbox
        bot_message = queries(message)
        chat_history.append((message, bot_message))
        return "", chat_history

    msg.submit(respond, [msg, chatbot], [msg, chatbot])

demo.launch(share=True)
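The callback above assumes every submission is non-empty and every query succeeds. A slightly more defensive version might look like the sketch below. `make_respond` is a hypothetical helper that wraps any query function, so it can be tested without Gradio or the query engine.

```python
def make_respond(query_fn):
    """Build a Gradio-style callback around query_fn (a hypothetical helper)."""
    def respond(message, chat_history):
        chat_history = list(chat_history or [])
        message = (message or "").strip()
        if not message:
            return "", chat_history  # ignore empty submissions
        try:
            bot_message = query_fn(message)
        except Exception as exc:
            # Show the failure in the chat instead of crashing the callback
            bot_message = f"Sorry, the query failed: {exc}"
        chat_history.append((message, bot_message))
        return "", chat_history
    return respond

def failing_query(_):
    raise RuntimeError("backend unavailable")

respond_ok = make_respond(lambda q: f"echo: {q}")
respond_bad = make_respond(failing_query)
print(respond_ok("hello", []))
print(respond_bad("hello", []))
print(respond_ok("   ", []))
```

In the app, you would pass the result of `make_respond(queries)` to `msg.submit` in place of `respond`.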

Query Time!

Let’s query the chatbot.

Question 1:

Tell me all the news about scam.
PNB scam fallout: Trade finance hit as caution prevails, premium soars
PNB scam fallout widens as NCLT bars over 60 entities from selling assets
PNB scam: CBI arrests 4 officials of Nirav Modi, Mehul Choksi’s companies
PNB scam: Court issues non-bailable warrants against Nirav Modi, Choksi
PNB chief Sunil Mehta admits the collusion of employees in LoUs scam
Nirav Modi case: CBI recovers LoU documents, arrests another PNB executive
Assets worth Rs 57.16 bn, 1st LoU issued in Mar 2011 seized: ED on PNB scam
‘Explain how scam took place’: CVC seeks report from PNB, FinMin in 10 days
PNB fraud case: Congress says it will oppose PSBs privatisation
PNB fraud: Govt thinks RBI may be unable to ensure effective supervision?
7 public sector bank stocks hit 52-week lows on Rs 114-bn PNB fraud case
PNB tanks 7% on Monday; m-cap plunges Rs 109.75 bn in four sessions
PNB will have to own responsibility of bona fide transactions, says FinMin
Nirav Modi PNB fraud: PM Modi’s silence speaks of his loyalties, says Rahul
Nirav Modi PNB fraud: Vijay Aggarwal to be absconding diamantaire’s lawyer
Before PNB fraud: Nirav Modi, Choksi left 18 businessmen, 24 firms bankrupt
Nirav Modi PNB fraud: CBI seals bank’s Brady House Branch in Mumbai
Once the pride of Punjab, PNB is now a theme for WhatsApp memes in India
Eyes wide shut: The $1.8 billion PNB fraud that went completely unnoticed
Bank Union wants CBI probe into PNB fraud, alleges RBI failed as regulator
CBI questions Nirav Modi’s CFO Vipul Ambani; searches PNB Mumbai branch
PNB scam: Jewellers expect more casualties and tighter regulations
Rs 114-bn PNB scam: Why fraudsters prefer to use Letter of Undertaking

Question 2:

What is the latest news about RBI?
The RBI has deferred the launch of IndAs again, awaiting amendments to the banking laws.

Question 3:

Can you tell me all the news related to home loans?
Lending rates cut leads to balance transfers in home loan market: ICRA
Can’t match home loan rates with BoB: United Bank of India
After rate cut, Bank of Baroda eyes bigger home loan pie
Bank of Baroda offers lowest home loans rates at 8.35%
For SBI, home loan queries jump three times since rate cut
HDFC cuts home loan rates by up to 0.45%
Hamara Ghar offer: SBI’s new home loan
Banks cut lending rates: How home-loan borrowers will benefit
SBI cuts lending rate by 90 bps; home, auto loans to become cheaper

Question 4:

Tell me about the news of Yes Bank Scam.
The RBI discovered more than $450 million in extra bad loans at Yes Bank. The gross non-performing assets assessed by the Reserve Bank of India were $457 million higher than Yes Bank had disclosed as of November 20, 2019.

Question 5:

What is the latest news about NBFC?
The latest news about NBFC is that RBI eases norms for banks to lend more to NBFCs, housing finance companies. RBI enhances single-borrower exposure limit to 15% of bank’s capital.

Conclusion

Building a real-time financial news RAG chatbot using the Indian Financial News dataset proved to be a rewarding journey. Now you know that you can easily ask questions about banks and get instant answers! It’s now time for you to make your own chatbot. Hope you enjoyed reading this blog.