From RAGs To (Psychotherapy) Riches

Exploring the Revolutionary Impact of Retrieval Augmented Generation on Therapeutic Practices

Amit Spinrad
Eleos Health
7 min read · Apr 16, 2024

--

Imagine walking into a new library you’ve never visited before. This library is special, though: it stores all the knowledge, insights, and wisdom accumulated over countless psychotherapy sessions. Each book includes comprehensive data about every therapy session ever conducted—detailing its exploration of human emotions and its impact on the journey toward healing.

Now, imagine you have a question about a specific therapeutic intervention, therapy type, or topic. How would you even START to search for an answer?

Well, you could start by reading each book one by one, but that would take close to an eternity. What if, instead, you had a magical tool that could instantly scan through all the books, pull out the information most relevant to your question, condense it all into a more digestible format, and use it to generate a coherent, insightful answer for you?

And… what if I told you this magical tool indeed exists, and it’s called Retrieval Augmented Generation (RAG)?

The massive psychotherapy library — a (cute) illustration

RAG: The Magical Context Tool

RAG is essentially a (somewhat) new way to retrieve relevant context and then use it to generate text quickly — like, very quickly. In simpler terms, it’s like having a lightning-fast librarian who can read all the books in the library in a split second, understand exactly what you are looking for, and serve up the best information in an easy-to-understand format.

Turning Text into Numbers

The first step in the RAG pipeline is converting vast amounts of textual data into smaller “pieces” of text, which are then converted into more manageable — but still meaningful — numeric forms known as vectors. This process of “chunking” and then “embedding” is crucial for storing these vectors in a database designed for quick information retrieval.
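As a rough sketch, the chunk-then-embed step might look like the following. The `embed()` function here is a deliberately toy stand-in (keyword counts over an invented vocabulary), and the fixed-size chunker is the crudest possible splitter; a real pipeline would use a learned embedding model and sentence- or paragraph-aware chunking.

```python
def chunk(text: str, size: int = 50) -> list[str]:
    """Split text into fixed-size character chunks (real systems usually
    split on sentence or paragraph boundaries instead)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# Invented mini-vocabulary, purely for illustration.
VOCAB = ["anxiety", "sleep", "alcohol", "goal", "session"]

def embed(chunk_text: str) -> list[float]:
    """Toy embedding: counts of a few clinical keywords.
    Stand-in for a learned dense embedding model."""
    words = chunk_text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

note = "Client reports anxiety and poor sleep. Session goal: reduce alcohol use."
chunks = chunk(note)                 # two 50-character pieces
vectors = [embed(c) for c in chunks] # one vector per chunk
```

Each chunk becomes a small numeric vector, which is what actually gets stored in the vector database.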

But why transform text into numbers in the first place? The answer lies in the efficiency of numerical operations. When we translate chunks of text into numeric equivalents, we can apply mathematical operations — such as addition, subtraction, multiplication, and division — that are impractical with raw text.

Creating Meaning Through Math

Importantly, because embeddings capture not only the literal characters of a text but also its inherent meaning, the outcomes of any mathematical operations performed on them are similarly rich in semantic content. For instance, the numeric representation of “H2O” combined with that of “liquid” closely approximates the representation of “water.” Adding the same “H2O” representation to “solid” yields the numeric equivalent of “ice.”
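This analogy can be made concrete with a toy example. The vectors below are hand-invented so the arithmetic works out exactly; real learned embeddings are high-dimensional and satisfy such relationships only approximately.

```python
# Hand-made 3-dimensional "embeddings" chosen so the analogy holds exactly.
vec = {
    "H2O":    [1.0, 0.0, 0.0],
    "liquid": [0.0, 1.0, 0.0],
    "solid":  [0.0, 0.0, 1.0],
    "water":  [1.0, 1.0, 0.0],
    "ice":    [1.0, 0.0, 1.0],
}

def add(a: list[float], b: list[float]) -> list[float]:
    """Element-wise vector addition."""
    return [x + y for x, y in zip(a, b)]

print(add(vec["H2O"], vec["liquid"]) == vec["water"])  # True in this toy space
print(add(vec["H2O"], vec["solid"]) == vec["ice"])     # True in this toy space
```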

This almost-magical ability to encode meaning into operable numbers is exactly what enables the RAG pipeline to execute rapid, meaning-based similarity assessments — the “R,” or “Retrieval,” part of RAG — unlocking a wide array of possibilities for processing and understanding text at remarkable speeds.

The embedding wizardry

Going back to the library example, you can think of a vector as a condensed version of a book in the gigantic psychotherapy library. It’s like a summary that captures the essence of the book — but in a much smaller, more usable format.

Scanning for Contextual Similarities

Once we convert psychotherapy sessions and their accompanying documentation (e.g., clinical notes and treatment plans) into rapidly searchable formats, we’re able to efficiently identify parallels among a large number of past cases. These parallels refer to cases that share the closest numeric representations, indicating a high degree of similarity in meaning. By implementing a parameter, known technically as “top K,” we can narrow our exploration to a specific number of the most similar previous sessions that offer valuable context. For example, if we used “2” for “top K,” then we would identify the two most similar previous sessions.
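A minimal sketch of this similarity search, using cosine similarity over toy three-dimensional “session embeddings” (the vectors and their dimensionality are invented for illustration; production systems use approximate nearest-neighbor indexes rather than a full scan):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means identical direction (meaning), 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], session_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return the indices of the k stored sessions most similar to the query."""
    ranked = sorted(range(len(session_vecs)),
                    key=lambda i: cosine(query_vec, session_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy session embeddings (assumed values, for illustration only).
sessions = [
    [0.9, 0.1, 0.0],   # a substance-use session
    [0.1, 0.9, 0.0],   # a sleep-problems session
    [0.8, 0.2, 0.1],   # another substance-use session
]
query = [1.0, 0.0, 0.0]
print(top_k(query, sessions, k=2))  # → [0, 2], the two substance-use sessions
```

With “top K” set to 2, the two substance-use sessions come back as the most relevant context, while the sleep-problems session is filtered out.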

Applying RAG to Therapy Treatment Plans

Consider the scenario described in the figure below, where a therapist concludes their first session with a new client and sets “reducing substance abuse” as the main goal for their treatment plan going forward. To pinpoint more precise and actionable objectives assigned to this main goal, the therapist enters this rather broad goal into the “treatment plan” section of their documentation, along with the inquiry: “What actionable objectives should I set for this client?”

Leveraging RAG technology, the system can quickly identify two closely aligned treatment plans from two completely different cases based on their similar-intent primary goals: “minimize daily alcohol consumption” and “reduce reliance on prescription pain medication.”

The next step is where this process becomes especially innovative: using these similar-goal treatment plans, we can now easily extract the associated objectives and possibly even learn the extent of their success — information that serves as a detailed and highly informative context for a Large Language Model (LLM).

Presented with the therapist’s initial query and the context offered by these previous similar treatment plans, the LLM now draws on concrete examples rather than generating responses “from scratch.” This approach grounds the model’s suggestions in proven strategies, significantly enhancing the relevance and reliability of the objectives it proposes for the current client.
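One way to picture this final step is a simple prompt-assembly function that stitches the therapist’s query together with the retrieved plans before handing everything to the LLM. The template and field names below are assumptions for illustration, not a production format:

```python
def build_prompt(query: str, retrieved_plans: list[dict]) -> str:
    """Assemble the 'augmented' prompt: the therapist's question plus the
    retrieved treatment plans as grounding context. (Illustrative template
    only; real systems tune this format carefully.)"""
    context = "\n".join(
        f"- Goal: {p['goal']}; Objectives: {', '.join(p['objectives'])}"
        for p in retrieved_plans
    )
    return (
        "You are assisting a therapist with treatment planning.\n"
        f"Similar past treatment plans:\n{context}\n\n"
        f"Question: {query}\n"
        "Suggest actionable objectives grounded in the examples above."
    )

# Hypothetical retrieved plans, mirroring the example in the text.
plans = [
    {"goal": "minimize daily alcohol consumption",
     "objectives": ["attend weekly support group", "track daily intake"]},
    {"goal": "reduce reliance on prescription pain medication",
     "objectives": ["consult a specialist", "build a taper schedule"]},
]
prompt = build_prompt("What actionable objectives should I set for this client?", plans)
```

The resulting string is what the “G” step actually sees: the question, grounded by concrete prior examples rather than nothing at all.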

A theoretical framework for using RAG to assist with the setting of psychotherapeutic treatment plans

Simple Magic (Or is it…?)

Once you grasp the transformative potential of RAG, it’s tempting to envision its application universally and instantly — to imagine a world where RAG empowers literally every aspect of the intersection of AI and psychotherapy (or any other field). However, the reality of implementing this innovative method is filled with complexities. Here are just a few of the challenges that come with adapting RAG for diverse psychotherapy use cases:

1. Unintentional Transfer of Session-Specific Information

The goal of RAG is not to replicate the style or content of previous sessions or notes verbatim — or worse, to insert identifiable information from previous sessions — but rather to draw “inspiration” from them in a way that enhances the relevance and uniqueness of the current session. For instance, going back to our earlier example, rather than directly borrowing the goal to “consult a pain management specialist” from a previous session, RAG should adapt that goal to fit the current substance abuse context by changing it to something like, “consult a specialist in addiction.”

Achieving this delicate balance requires sophisticated prompt engineering, along with strategies for generalization and de-identification of previous session notes. It also necessitates vigilant and thoughtful management of vector databases to ensure that each session remains distinct and tailored to the individual’s current needs and objectives.
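As a very rough illustration of the de-identification idea, here is a regex-based redaction sketch. Real systems rely on dedicated clinical PHI-scrubbing tools with far broader coverage; the two patterns below catch only a couple of obvious cases and are not meant as a real safeguard:

```python
import re

def deidentify(note: str) -> str:
    """Mask name-like tokens following common titles, and date-like strings.
    Toy sketch only; production de-identification needs much more than regexes."""
    note = re.sub(r"\b(?:Mr\.|Ms\.|Mrs\.|Dr\.)\s+[A-Z][a-z]+", "[NAME]", note)
    note = re.sub(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b", "[DATE]", note)
    return note

print(deidentify("Session with Mr. Doe on 4/16/2024 focused on coping skills."))
# → Session with [NAME] on [DATE] focused on coping skills.
```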

2. Overly Broad Queries

When the query presented is overly broad, the resulting context can be too expansive. Imagine a therapist asking a bot to “provide a comprehensive overview of all therapeutic approaches for treating depression.” A query this broad could match a huge share of the sessions in the database, retrieving far more context than an LLM can meaningfully use.

One approach to navigating this challenge is to employ a hierarchical approach to RAG (sometimes known as HRAG), which allows the system to methodically and progressively drill down into more and more specific topics. This process effectively narrows the breadth of sessions (i.e., the context) under consideration. Eventually, this refined retrieval and selection of documents becomes sufficiently compact to be directly integrated into the LLM without the need for further filtration, ensuring that the responses are not only relevant, but also deeply informed by a focused subset of the available data.
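A hierarchical retrieval step can be sketched as a two-level lookup: first narrow to a topic cluster, then search only within it. The topics, notes, and the crude keyword matcher below are invented stand-ins; a real HRAG system would compare embeddings at each level rather than matching words:

```python
# Invented topic clusters, each holding the notes filed under it.
corpus = {
    "depression": ["CBT session note A", "behavioral activation note B"],
    "anxiety": ["exposure therapy note C"],
    "substance use": ["relapse prevention note D"],
}

def retrieve_hierarchical(query: str) -> list[str]:
    """Level 1: pick the topic whose name overlaps the query (toy matcher).
    Level 2: search only the notes filed under that topic."""
    topic = max(corpus, key=lambda t: sum(w in query.lower() for w in t.split()))
    return corpus[topic]

print(retrieve_hierarchical("approaches for treating depression"))
# → only the depression-cluster notes, not the whole corpus
```

The broad query now touches one cluster instead of the entire library, so the final retrieval stays small enough to hand directly to the LLM.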

3. Highly Similar Contexts

In instances where the available session notes or other contexts are almost identical, it’s crucial to “inject” a diversity of perspectives into the LLM to create a broader range of inspiration and insights. In such cases, instead of relying on a typical RAG process that might select a small number of very similar documents (say, two or three), we could expand the selection to a larger set (a larger “top K” value). For example, perhaps we use “20” instead of “2.”

Then, before we employ the “G” part of the RAG process (i.e., “Generation”), we can distill this set of 20 previous sessions down into a smaller subset of sessions — maybe two or three — that are the most distinct and different from each other.

This strategy ensures that the LLM receives a variety of inputs, enriching the creative and therapeutic potential of the responses it generates. By prioritizing diversity in the selection process, we can avoid the “echo chamber effect” and, in a sense, “force” a richer, more nuanced interaction with the LLM.
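The “retrieve many, keep the most distinct few” idea can be sketched as a greedy selection loop, in the spirit of maximal marginal relevance. The candidate vectors below are toy stand-ins for the 20 retrieved session embeddings:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def diverse_subset(vecs: list[list[float]], m: int = 2) -> list[int]:
    """Greedy diversity selection: start from the first candidate, then
    repeatedly add the vector least similar to anything already chosen."""
    chosen = [0]
    while len(chosen) < m:
        best = min((i for i in range(len(vecs)) if i not in chosen),
                   key=lambda i: max(cosine(vecs[i], vecs[j]) for j in chosen))
        chosen.append(best)
    return chosen

# Four near-duplicate candidates plus one genuinely distinct one (toy vectors).
candidates = [[1.0, 0.0], [0.99, 0.05], [0.98, 0.1], [0.0, 1.0], [0.97, 0.12]]
print(diverse_subset(candidates, m=2))  # → [0, 3]: keeps the distinct outlier
```

Instead of handing the LLM three near-clones, this distillation step keeps the outlier, which is exactly the “diverse perspective” the text calls for.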

The Necessity of the Human Touch

While RAG and other innovative AI technologies can enhance the practice of psychotherapy, it’s important to remember that they are just tools. They are not meant to replace the human touch, empathic understanding, and therapeutic relationship that are at the heart of psychotherapy. Instead, they are meant to augment and enhance the therapist’s abilities — to provide them with the information they need when they need it, ultimately helping them be more effective in their work.

In the end, psychotherapy is about helping people — about understanding their struggles and guiding them toward healing. And while cutting-edge tools like RAG can indeed help providers do this more effectively, the real magic lies in the human heart and mind. And no amount of AI can replace that.
