Selecting the Right Legal Source: Using LangChain’s RouterChain for Dynamic Sourcing

When building an AI system for the legal domain, it is useful to let it select among multiple data sources dynamically, based on the input. For example, you may have the texts of several cases and textbooks. Loading them all into one place and running every query against all of them can take longer and cost more than searching only the relevant source.

To address that, LangChain’s RouterChain paradigm offers an elegant way to build chains that dynamically choose which sub-chain to use for a given input.
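To make the routing idea concrete, here is a minimal sketch in plain Python with no LangChain dependency. The keyword-overlap scoring is only a stand-in for the LLM call that a real router chain makes, and the names `Route`, `pick_route`, and `answer`, as well as the sample descriptions, are hypothetical and chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Route:
    name: str
    description: str
    handler: Callable[[str], str]  # the "sub-chain" for this source


def _words(text: str) -> set:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def pick_route(query: str, routes: List[Route]) -> Optional[Route]:
    """Return the route whose description shares the most words with the query.

    This crude overlap score stands in for the LLM's routing decision.
    Returns None when no description shares any word with the query.
    """
    query_words = _words(query)
    best, best_score = None, 0
    for route in routes:
        score = len(query_words & _words(route.description))
        if score > best_score:
            best, best_score = route, score
    return best


def answer(query: str, routes: List[Route], default: Callable[[str], str]) -> str:
    """Send the query to the chosen route, or to the default handler."""
    route = pick_route(query, routes)
    handler = route.handler if route else default
    return handler(query)


routes = [
    Route("oracle", "Oracle v Google copyright case", lambda q: "[oracle] " + q),
    Route("hawkins", "Hawkins contract damages case", lambda q: "[hawkins] " + q),
]

print(answer("what was oracle v google about?", routes, lambda q: "[default] " + q))
```

Only the selected handler runs, which is the point of routing: the other sources are never searched for that query.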

Below is an example of a simple way to implement RouterChain.

First, let’s import all the libraries and packages we will need and set the OpenAI API key:

import os

from langchain.chains.router import MultiRetrievalQAChain
from langchain.llms import OpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.document_loaders import TextLoader
from langchain.document_loaders import PyPDFLoader
from langchain.vectorstores import FAISS

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

Next, we’ll load the documents and create three retrievers: one each for two legal cases, Oracle v Google and Hawkins, and one for an excerpt from the textbook “Learning Legal Reasoning” by Prof. John Delaney.

howtobrief = TextLoader('delaney.txt').load_and_split()  
howtobrief_retriever = FAISS.from_documents(howtobrief, OpenAIEmbeddings()).as_retriever()

oracle = PyPDFLoader('oracle.pdf').load_and_split()
oracle_retriever = FAISS.from_documents(oracle, OpenAIEmbeddings()).as_retriever()

hawkins = PyPDFLoader('hawkins.pdf').load_and_split()
hawkins_retriever = FAISS.from_documents(hawkins, OpenAIEmbeddings()).as_retriever()

Now, we’ll describe what each retriever should be used for and initialize the MultiRetrievalQAChain:

retriever_infos = [
    {
        "name": "howtobrief",
        "description": "Delaney guidelines on how to extract facts from a legal case",
        "retriever": howtobrief_retriever
    },
    {
        "name": "oracle",
        "description": "Good for understanding Oracle v Google case",
        "retriever": oracle_retriever
    },
    {
        "name": "hawkins",
        "description": "Good for understanding Hawkins case",
        "retriever": hawkins_retriever
    }
]

chain = MultiRetrievalQAChain.from_retrievers(OpenAI(), retriever_infos, verbose=True)
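The descriptions matter because they are what the router LLM reads when it chooses a source. As a rough sketch of that idea (not LangChain’s exact internals; the `build_destinations` helper and the prompt wording are assumptions made for illustration), each entry can be flattened into a `name: description` line and placed in the routing prompt:

```python
def build_destinations(infos):
    """Flatten name/description pairs into one line per candidate source."""
    return "\n".join(f"{info['name']}: {info['description']}" for info in infos)


# Trimmed copies of two of the entries above; the retrievers are left out
# because only names and descriptions reach the prompt in this sketch.
sample_infos = [
    {"name": "oracle", "description": "Good for understanding Oracle v Google case"},
    {"name": "hawkins", "description": "Good for understanding Hawkins case"},
]

# Hypothetical prompt wording, for illustration only.
router_prompt = (
    "Pick the best source for the question, or DEFAULT if none fits.\n"
    + build_destinations(sample_infos)
)
print(router_prompt)
```

Vague or overlapping descriptions give the router little to tell the sources apart, so it pays to make each one specific.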

Let’s test it:

print(chain.run("what are the Delaney guidelines on finding facts?")) 

Because we set verbose=True, the response shows the routing step, where the router chose the Delaney textbook (howtobrief) as the most appropriate source to consult:

> Entering new  chain...
howtobrief: {'query': 'What are the Delaney Guidelines on finding facts?'}
> Finished chain.
The Delaney Guidelines on finding facts are as follows: 1) Identify parties and their roles; 2) Identify plaintiff's cause of action at trial; 3) Identify trial court's disposition of cause-of-action; 4) Identify which party appealed to which court and the relief requested; and 5) Identify any prior action taken by an intermediate appellate court.

Let’s now query specific cases:

print(chain.run("what was oracle v google about?"))

> Entering new  chain...
oracle: {'query': 'What was the Oracle v Google case about?'}
> Finished chain.
The Oracle v Google case was about Google's use of 37 API packages from Oracle's Java software platform in its Android operating system.

print(chain.run("what did the court decide in hawkins case?"))

> Entering new  chain...
hawkins: {'query': 'what did the court decide in the Hawkins case?'}
> Finished chain.
The court decided that the jury was allowed to consider two elements of damage: (1) Pain and suffering due to the operation; and (2) positive ill effects of the operation upon the plaintiff's hand. The court also decided that the true measure of the plaintiff's damage was the difference between the value to him of a perfect hand or a good hand, such as the jury found the defendant promised him, and the value of his hand in its present condition, including any incidental consequences fairly within the contemplation of the parties when they made their contract. The court also decided that it was erroneous and misleading to submit to the jury as a separate element of damage any change for the worse in the condition of the plaintiff's hand resulting from the operation. Finally, the court decided that the defendant's requests for instructions were loosely drawn and were properly denied.
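When testing a router, it helps to check which destination was chosen, not only the final answer. Here is a small sketch, assuming the verbose log line keeps the `name: {...}` shape seen in the transcripts above; `parse_route_line` is a hypothetical helper, not a LangChain function.

```python
import ast


def parse_route_line(line):
    """Split a 'name: {...}' log line into (destination, inputs dict)."""
    name, sep, payload = line.partition(": ")
    if not sep:
        raise ValueError(f"not a routing line: {line!r}")
    # literal_eval safely parses the Python-style dict printed in the log.
    return name.strip(), ast.literal_eval(payload)


destination, inputs = parse_route_line(
    "oracle: {'query': 'What was the Oracle v Google case about?'}"
)
print(destination)
print(inputs["query"])
```

Note that the router may rephrase the question before passing it on, as the transcripts show, so a check like this is best applied to the destination name and not to the exact query text.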

LangChain’s RouterChain approach lets you build AI systems that analyze a legal query and determine which data sources and modules are most appropriate to use for it.

This flexibility makes for more robust and broadly capable legal AI systems that can help you grasp complex legal concepts faster when preparing for classes or exams.

legaltextai

Gentle introduction of AI tools into the legal education and practice.