Context is Key: The Significance of RAG in Language Models

Abhinav Kimothi
4 min read · Dec 3, 2023

30th November 2022 will be remembered as a watershed moment in artificial intelligence. OpenAI released ChatGPT, and the world was mesmerised. Interest in previously obscure terms like Generative AI and Large Language Models (LLMs) proved unstoppable over the following 12 months.

Google Trends — Interest Over Time (Nov’22 to Nov’23)

The Curse Of The LLMs

As usage exploded, so did expectations. Many users began treating ChatGPT as a source of information, an alternative to Google, and in doing so ran into the system's prominent weaknesses. Setting aside concerns around copyright, privacy, security, and the inability to do mathematical calculations, people realised that Large Language Models have two major limitations: their knowledge is frozen at a training cutoff date, and they can hallucinate, confidently generating plausible but factually incorrect text.

The curse of the LLMs

Users look to LLMs for knowledge and wisdom, yet LLMs are sophisticated predictors of what word comes next.

The Hunger For More

While the weaknesses of LLMs were being discussed, a parallel discourse started around providing context to the models. In essence, this meant creating a ChatGPT-like experience on proprietary data.

The Challenge

  • Make LLMs respond with up-to-date information
  • Stop LLMs from responding with factually inaccurate information
  • Make LLMs aware of proprietary information

Providing Context

While model re-training, fine-tuning, and reinforcement learning are options that can address these challenges, the approaches are time-consuming and costly. For the majority of use cases, those costs are prohibitive.

In May 2020, researchers at Facebook AI Research, in the paper “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” (Lewis et al.), explored models that combine pre-trained parametric and non-parametric memory for language generation.

So, What is RAG?

In 2023, RAG (Retrieval Augmented Generation) became one of the most widely used techniques in the domain of Large Language Models. The idea is simple: retrieve relevant information from a knowledge source at query time and augment the LLM's prompt with it. A typical RAG pipeline works as follows:

1. The user writes a prompt or query, which is passed to an orchestrator.
2. The orchestrator sends a search query to the retriever.
3. The retriever fetches the relevant information from the knowledge sources and sends it back.
4. The orchestrator augments the prompt with the retrieved context and sends it to the LLM.
5. The LLM responds with generated text, which is displayed to the user via the orchestrator.
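To make the flow concrete, here is a minimal sketch of that loop in Python. The `retrieve` and `generate` functions are hypothetical stand-ins (a real system would use a vector store and an LLM API); the orchestration logic is the point.

```python
# Minimal sketch of the RAG loop described above.
# `retrieve` and `generate` are illustrative stand-ins, not a real API.

def retrieve(query: str, knowledge_base: list[str], top_k: int = 3) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc)
              for doc in knowledge_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def generate(prompt: str) -> str:
    """Stand-in for an LLM call (e.g. a request to a hosted model's API)."""
    raise NotImplementedError("Plug in your LLM client here")

def rag_answer(user_query: str, knowledge_base: list[str]) -> str:
    # Steps 1-3: pass the query to the retriever, fetch relevant documents
    context_docs = retrieve(user_query, knowledge_base)
    # Step 4: augment the prompt with the retrieved context
    prompt = ("Context:\n" + "\n".join(context_docs)
              + f"\n\nQuestion: {user_query}\nAnswer:")
    # Step 5: the LLM generates the final response
    return generate(prompt)
```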

How does RAG help?

Unlimited Knowledge

The retriever of a RAG system can have access to external sources of information, so the LLM is not limited to its internal knowledge. These external sources can be proprietary documents and data, or even the internet.

Expanding LLM Memory with RAG
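In practice, retrievers usually search over vector embeddings rather than raw keywords. Below is a sketch of that idea using a toy bag-of-words embedding; a real system would swap in a learned sentence-embedding model and a proper vector store.

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy bag-of-words embedding via the hashing trick.
    A real retriever would use a learned embedding model instead."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-10))

def build_index(documents: list[str]) -> list[tuple[str, np.ndarray]]:
    # Embed every document once, up front
    return [(doc, embed(doc)) for doc in documents]

def search(query: str, index: list[tuple[str, np.ndarray]], top_k: int = 3) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine_similarity(q, item[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

# Example: the "knowledge source" can be any external text
index = build_index([
    "RAG combines retrieval with text generation.",
    "LLM knowledge is frozen at its training cutoff date.",
])
print(search("What is RAG?", index, top_k=1))
```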

Confidence in Responses

With the context (the extra information that is retrieved) made available to the LLM, its responses are grounded in source material, which increases confidence in what it says.

Increasing Confidence in LLM Responses
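One common way to translate retrieved context into more trustworthy answers is to instruct the model to stay within that context and abstain when it can't. A sketch of such a prompt template (the wording is illustrative, not a standard):

```python
# Illustrative grounding template: the abstention instruction nudges the
# model away from hallucinating when the retrieved context is insufficient.
GROUNDED_PROMPT = """Answer the question using ONLY the context below.
If the context does not contain the answer, reply "I don't know."

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context: str, question: str) -> str:
    return GROUNDED_PROMPT.format(context=context, question=question)
```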

As the RAG technique evolves and becomes accessible through frameworks like LangChain and LlamaIndex, it is finding more and more use in LLM-powered applications such as Q&A over documents, conversational agents, recommendation systems, and content generation.
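For reference, here is roughly what a document Q&A pipeline looked like with the late-2023 LangChain API (module paths have moved in later releases, so treat this as a sketch; it assumes an OpenAI API key in the environment, the faiss-cpu package, and a hypothetical my_notes.txt file):

```python
# Sketch against the late-2023 LangChain API; newer versions relocate
# these imports. Assumes OPENAI_API_KEY is set and faiss-cpu is installed.
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

docs = TextLoader("my_notes.txt").load()  # hypothetical file name
chunks = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)
db = FAISS.from_documents(chunks, OpenAIEmbeddings())  # knowledge source + retriever
qa = RetrievalQA.from_chain_type(llm=OpenAI(), retriever=db.as_retriever())
print(qa.run("What does the document say about RAG?"))
```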

If you’re someone interested in generative AI and Large Language Models, let’s connect on LinkedIn — https://www.linkedin.com/in/abhinav-kimothi/

Also, please read a free copy of my notes on Large Language Models — https://abhinavkimothi.gumroad.com/l/GenAILLM

Read my other blogs on my Medium profile.


Abhinav Kimothi

Co-founder and Head of AI @ Yarnit.app || Data Science, Analytics & AIML since 2007 || BITS-Pilani, ISB-Hyderabad || Ex-HSBC, Ex-Genpact, Ex-LTI || Theatre