Building a RAG-Enhanced ChatGPT for Legal Research: A Case Study of Attorney-General v MCP & Others (1995)

Chawer kamanga
TechMalawi
7 min read · Jan 28, 2024


The world of artificial intelligence (AI) has been revolutionized by the emergence of Large Language Models (LLMs), which are taking the industry by storm. These advanced AI systems, such as ChatGPT, are designed to understand and generate text that reads like human language. Trained on massive amounts of data, LLMs have proven remarkably proficient at language comprehension and generation tasks.

However, despite their impressive performance on the data they were trained on, LLMs can fall short when a task requires knowledge beyond their training material. This is where innovative techniques like Retrieval-Augmented Generation (RAG) come into play.


RAG offers a solution to this limitation by incorporating facts fetched from external sources, empowering generative AI models to deliver more accurate and reliable results with broader scope and depth.

Allow me to illustrate this concept with an example from the world of law. Imagine judges and lawyers in need of citations to strengthen their arguments. These citations might come from various legal documents, which may be in PDF or other formats, or from past rulings in similar cases. Locating and referencing this information can be a daunting task, but this is where RAG truly excels.

Let’s take a look at the well-known Malawi Congress Party (MCP) v President of the Republic of Malawi case, which I learned about in my Business Law class in college (shoutout to my lecturer, lol). This case serves as an excellent example of how RAG can be utilized to efficiently locate and reference relevant information, making it easier for legal professionals to build stronger arguments and arrive at more informed decisions.


It’s important to note that while the details of the case are certainly interesting, the primary goal of this article is to demonstrate how RAG can be utilized to develop a useful tool for legal professionals. So if you’re interested in knowing more about this case, you can read it here.

So, let’s get started on building our Chatbot.

Here’s a step-by-step guide for building our RAG-powered chatbot:

  1. The first step is to obtain the PDF document containing the judge’s ruling on the case. This document will serve as the primary source of information for our Chatbot.
  2. Next, we’ll need to use a technique called “document splitting” to break down the PDF document into smaller, more manageable units.
  3. Once the document has been split, we’ll use a “vector store” to store the document’s contents in a format that can be easily retrieved.
  4. The Chatbot then uses a retriever to fetch the most relevant chunks from the vector store based on the user’s input.
  5. Finally, ChatGPT generates a response using a prompt that combines the user’s question with the retrieved chunks, enabling it to provide accurate information from the judge’s ruling document.

To follow along with the code for this project, you’ll need to have the following:

  1. An OpenAI API key, which will be used to access ChatGPT and the necessary language and embedding models.
  2. Python 3 installed on your system.
  3. The LangChain LLM framework.

Great, now that we have our tools ready, let’s dive into the code.

Step 1: Document Loading

First, let’s install LangChain, which will help us connect to ChatGPT, along with the companion packages we’ll use later (the OpenAI integration, the Chroma vector store, and python-dotenv):

pip install langchain langchain-openai langchain-community chromadb python-dotenv

To load the document, we will use PyPDFLoader from LangChain, which relies on the pypdf package. If you don’t have it, you can install it using pip, as shown below

pip install pypdf

Load the PDF document, changing the path to point to your copy of the ruling.

from langchain_community.document_loaders import PyPDFLoader

# Replace 'path_to_your_document' with the actual path to your PDF document
document_path = 'path_to_your_document'

# Load the document using PyPDFLoader
loader = PyPDFLoader(document_path)
docs = loader.load()
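
If you want to confirm the document loaded correctly, a quick optional check is to count the pages and peek at the first one; PyPDFLoader returns one Document per page:

# Optional sanity check: PyPDFLoader yields one Document per PDF page
print(len(docs))                   # number of pages loaded
print(docs[0].page_content[:200])  # first 200 characters of the first page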

Step 2: Document Splitting

Now that we’ve loaded the PDF document, let’s split it into smaller, more manageable chunks. We’re doing this because the document is too large to pass entirely in one prompt due to ChatGPT’s token limit, and it’s not cost-effective either.

To split the document, let’s import RecursiveCharacterTextSplitter

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
all_splits = text_splitter.split_documents(docs)

len(all_splits)

Here’s what each argument means in the RecursiveCharacterTextSplitter:

  • chunk_size: The maximum number of characters in each chunk.
  • chunk_overlap: The number of characters from the end of one chunk that will be repeated at the beginning of the next chunk. This ensures that important context is not lost between chunks.
  • add_start_index: If set to True, records each chunk’s starting character position in the original document as metadata.

len(all_splits) shows you how many chunks the document has been split into; in this case, it will be 260.
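
To see these settings in action, you can peek at a single chunk. This is just an optional check; the exact metadata keys depend on the loader, but with PyPDFLoader you should see the source path, the page number, and the start_index added by the splitter:

# Optional: inspect one chunk to see the splitter settings at work
first_chunk = all_splits[0]
print(first_chunk.metadata)            # e.g. source, page, start_index
print(first_chunk.page_content[:200])  # first 200 characters of this chunk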

Step 3: Vector Store

After splitting the document into smaller chunks, the next step is to store these chunks in a vector store for efficient retrieval and comparison of the document’s content. When a user inputs a query to the chatbot, we need to retrieve the most relevant chunks from the split documents. This is achieved by embedding the user’s query and performing a cosine similarity search against the 260 indexed text chunks, allowing us to quickly find the closest matches at runtime.

To implement this in code, we’ll start by loading the OpenAI API key from the .env file using the Python ‘dotenv’ package. Ensure that the .env file is located in the same folder as your Jupyter notebook. Here’s the code:

import dotenv

# Load environment variables (including OPENAI_API_KEY) from the .env file
dotenv.load_dotenv()
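
For this to work, the .env file needs just a single line containing your key (the value below is a placeholder, not a real key):

OPENAI_API_KEY=your-openai-api-key-here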

Next, import the necessary libraries, Chroma and OpenAIEmbeddings.

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

Create a vector store using the code below:

vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
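
Before building the full chain, you can query the vector store directly to make sure the embeddings work. This is a minimal optional check, not part of the pipeline itself:

# Optional: query the vector store directly to confirm retrieval works
results = vectorstore.similarity_search("What did the Attorney General say about Quorum?", k=3)
print(len(results))                   # should print 3
print(results[0].page_content[:200])  # the closest-matching chunk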

Step 4: Retrieval

Now that we have our document split and stored in a vector store, we can perform a similarity search to retrieve relevant chunks based on a user’s query. We’ll use the as_retriever method from our vector store to create a retriever with the similarity search type. Here’s the code:

retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})
retrieved_docs = retriever.invoke("What did the Attorney General say about Quorum?")

We query using the question “What did the Attorney General say about Quorum?”. Now let’s inspect what the first retrieved chunk contains.

print(retrieved_docs[0].page_content)

If we inspect the first retrieved document by printing its page content, we can see that the relevant information has been successfully retrieved from the PDF document.
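
Because PyPDFLoader records page numbers and we enabled add_start_index when splitting, each retrieved chunk also carries metadata pointing back to its exact location in the ruling, which is precisely the kind of pinpoint reference a legal professional needs:

# Show which page of the ruling each retrieved chunk came from
for doc in retrieved_docs:
    print(doc.metadata.get("page"), "-", doc.page_content[:80])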

Step 5: Generate

Finally, let’s use the powerful GPT-4 language model to generate a response based on the retrieved information. We’ll use a ready-made RAG prompt from the LangChain Hub to feed the retrieved context to the model.

First, let’s import all the needed classes from our installed packages:

from langchain_openai import ChatOpenAI
from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

Let’s start by bringing in the large language model that will generate the output, and pull the RAG prompt from the hub:

llm = ChatOpenAI(model_name="gpt-4", temperature=0)
prompt = hub.pull("rlm/rag-prompt")

Let’s take a look at the final prompt that will be sent to ChatGPT by writing this code:

example_messages = prompt.invoke(
    {"context": "filler context", "question": "What did the Attorney General say about Quorum?"}
).to_messages()
print(example_messages[0].content)

The prompt that actually goes to ChatGPT will contain the question the user entered and the context retrieved from the relevant chunks of our document.
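
If you’d like to see the prompt with real context instead of filler, you can join the chunks retrieved earlier yourself; this is the same formatting the chain below will perform automatically:

# Format the previously retrieved chunks into the prompt by hand
context_text = "\n\n".join(doc.page_content for doc in retrieved_docs)
real_prompt = prompt.invoke(
    {"context": context_text, "question": "What did the Attorney General say about Quorum?"}
)
print(real_prompt.to_messages()[0].content)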

Now let’s put it all together:



def format_docs(docs):
    # Join the retrieved chunks into a single context string
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

The actual user input goes here:

for chunk in rag_chain.stream("What did the Attorney General say about Quorum?"):
    print(chunk, end="", flush=True)
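
If you don’t need token-by-token streaming, you can call invoke instead to get the complete answer in one go:

# Alternative: return the full answer at once instead of streaming
answer = rag_chain.invoke("What did the Attorney General say about Quorum?")
print(answer)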

With these steps, you’ve successfully built a RAG-powered Chatbot that can retrieve and generate accurate information based on PDF documents.

Conclusion

This article demonstrates how to create a custom chatbot using ChatGPT that answers users’ questions based on a PDF document. However, it’s important to note that while building a simple prototype like this with a few lines of code is easy, developing a production-ready, RAG-powered chatbot requires numerous iterations. It may also involve more complex steps to address issues such as hallucinations, which go beyond the scope of this article. Nonetheless, I believe this serves as an excellent starting point for many organizations and individuals interested in RAG. For further learning, I recommend some short courses from DeepLearning.AI, like “LangChain: Chat with Your Data” and “LangChain for LLM Application Development.”

To learn more, please check out my sources below.

1. Introduction | 🦜️ LangChain, https://python.langchain.com/docs/get_started/introduction. Accessed 21 January 2024.

2. Malawi Congress Party v President of the Republic of Malawi (Judicial Review Cause 34 of 2020) [2021] MWHC 39 (2 June 2021). MalawiLII, https://malawilii.org/akn/mw/judgment/mwhc/2021/39/eng@2021-06-02. Accessed 26 January 2024.

3. “What Is Retrieval-Augmented Generation, aka RAG?” NVIDIA Blog. Accessed 25 January 2024.


Chawer kamanga
TechMalawi

Lifelong learner | Software Developer | Business Information Technology Student at Malawi University of Science and Technology.