RAG-dolls.

How to build AI clones of your friends and family using Retrieval Augmented Generation.

Published in Bleep-Bloop · 7 min read · Sep 16, 2023

Introduction:

Welcome to the fascinating world of RAG-dolls! In this blog series, we will embark on an exciting journey to create AI clones of your friends and family using Retrieval Augmented Generation. Imagine having virtual versions of your loved ones at your fingertips, ready to engage in conversations.

Source: Stable Diffusion by Stability AI

Step 1: Gathering Substantial Data

Realistic AI clones start with substantial data about your friends and family. To collect it, I scraped over 1,000 interview questions from the internet and shared them as a questionnaire with my friends and family, asking them to fill it out. Responses should be natural and reflect what the person truly believes, because it is their opinions and quirks that help the model represent their personality. The more diverse and personal the information, the more accurate and authentic the clone can be.

A few examples:

Question: What do you do first thing on a weekday morning?
Answer: Motivate myself to get out of bed.

Question: What is the soft drink of choice?
Answer: Buttermilk for life.

Question: What was your childhood ambition?
Answer: I wanted to be a kindergarten teacher like my mom.
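The post does not show how the questionnaire responses were stored. As a minimal sketch, each response could be kept as one record per question-answer pair, tagged with the person who gave it. The `QAPair` class, its field names, the `friend_a` label, and the JSON Lines layout are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class QAPair:
    """One questionnaire response (hypothetical schema)."""
    person: str
    question: str
    answer: str


def to_jsonl(pairs):
    """Serialize pairs as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


def from_jsonl(text):
    """Parse JSON Lines back into QAPair records, skipping blank lines."""
    return [QAPair(**json.loads(line)) for line in text.splitlines() if line.strip()]


# Two of the example responses above, attributed to a placeholder person id.
pairs = [
    QAPair("friend_a", "What is the soft drink of choice?", "Buttermilk for life."),
    QAPair("friend_a", "What was your childhood ambition?",
           "I wanted to be a kindergarten teacher like my mom."),
]
restored = from_jsonl(to_jsonl(pairs))
```

Keeping the person's identifier on every record makes it easy to build a separate retrieval store per clone later on.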

Step 2: Model Selection

With the data in hand, the next step is selecting a language model. Given the limits of my hardware (an RTX 3070 GPU), I opted for a 7-billion-parameter large language model (LLM) quantized to 4 bits. I experimented with several models from Hugging Face and found that Llama 2 outperformed the other open-source models I tried by a considerable margin: its replies were more relevant, and it stayed lucid even after 4-bit quantization. In the absence of a GPU, one could also use OpenAI’s ChatGPT APIs; I preferred a locally hosted model because the process involves sharing personal information.
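A rough back-of-the-envelope calculation shows why 4-bit quantization matters on this kind of hardware. The sketch below estimates the memory needed for the weights alone. The helper name `weight_memory_gb` is made up for this example, and the estimate ignores quantization metadata, activations, and the attention cache, all of which add to the real footprint.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


fp16_gb = weight_memory_gb(7e9, 16)  # 7B parameters at 16 bits each
int4_gb = weight_memory_gb(7e9, 4)   # 7B parameters at 4 bits each
print(f"16-bit: {fp16_gb} GB, 4-bit: {int4_gb} GB")  # 16-bit: 14.0 GB, 4-bit: 3.5 GB
```

At 16 bits, the weights of a 7-billion-parameter model would not fit in the 8 GB of VRAM on a standard RTX 3070. At 4 bits they take about 3.5 GB, which leaves headroom for everything else.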

Step 3: Building Retrieval System

Instead of diving straight into fine-tuning, I took a more cautious approach and built a vector database so that I could use retrieval augmented generation. With this approach, relevant question-answer pairs are retrieved from the vector database and passed to the language model as context, so the model does not have to learn the person’s answers through training. This also helps avoid the over-fitting that fine-tuning on such limited data could cause.

Brief Introduction to Retrieval Augmented Generation:

Retrieval Augmented Generation is an AI framework that lets a large language model answer questions over an existing knowledge base. An information retrieval system supplies relevant data, and the large language model uses that data to formulate a response.

The framework consists of 2 major components:

Retriever System.
Large Language Model.

Retriever System: This component uses an embedding model to compute an embedding for each question-answer pair and stores it in a key-value format. When queried, it returns the most relevant question-answer pairs, ranked by the distance between the embedding of the query and the embedding of each pair.
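To make the retriever idea concrete, here is a minimal, self-contained sketch. It is not the setup used for this project: the `embed` function is a toy word-count stand-in for a real embedding model, and the `Retriever` class is a hypothetical in-memory replacement for a vector database. Only the overall shape (embed, store by key, rank by distance) mirrors the description above.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


class Retriever:
    """Hypothetical in-memory store: pair id -> (embedding, question, answer)."""

    def __init__(self):
        self.store = {}

    def add(self, pair_id, question, answer):
        # Embed the question and answer together, one vector per pair.
        self.store[pair_id] = (embed(question + " " + answer), question, answer)

    def query(self, text, k=2):
        # Rank every stored pair by distance to the query embedding.
        q = embed(text)
        ranked = sorted(self.store.values(),
                        key=lambda item: cosine_distance(q, item[0]))
        return [(question, answer) for _, question, answer in ranked[:k]]


retriever = Retriever()
retriever.add("q1", "What are your hobbies?", "I like to paint, travel and dance.")
retriever.add("q2", "Are you indoorsy or outdoorsy?", "I'm indoorsy.")
retriever.add("q3", "What is your favorite day of the week?", "Saturday.")

top = retriever.query("Are you an indoorsy person?", k=1)
print(top)  # [('Are you indoorsy or outdoorsy?', "I'm indoorsy.")]
```

Word counts only match exact words, so this toy version would miss paraphrases. A learned embedding model places texts with similar meaning close together even when they share few words, which is why a real system uses one.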

Large Language Model: The LLM then takes the most relevant question-answer pairs as additional context for the query and builds on them to produce a natural-language answer.
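The retrieved pairs reach the LLM as plain text inside the prompt. The sketch below assembles a prompt in the same layout as the example prompts later in this post. The `build_prompt` function is a hypothetical helper, and joining each question and answer with a single space is an assumption about formatting.

```python
SYSTEM_PROMPT = (
    "You are an AI clone of a person. ONLY use the relevant pieces from the "
    "CONTEXT to BRIEFLY answer the QUESTION in less than 30 words. Respond as "
    "the person would text message over phone WITHOUT using emojis. "
    "Do not make stuff up."
)


def build_prompt(context_pairs, question, system_prompt=SYSTEM_PROMPT):
    """Combine retrieved (question, answer) pairs and the user question."""
    context = "\n".join(f"{q} {a}" for q, a in context_pairs)
    return (
        f"###SYSTEM_PROMPT: {system_prompt}\n"
        f"###CONTEXT:\n{context}\n\n"
        f"###QUESTION:\n{question}\n\n"
        f"###ANSWER:\n"
    )


prompt = build_prompt(
    [("What are your hobbies?", "I like to paint, travel and dance."),
     ("Are you indoorsy or outdoorsy?", "I'm indoorsy.")],
    "Are you adventurous?",
)
print(prompt)
```

The prompt ends at the `###ANSWER:` marker, so whatever text the model generates next is taken as the clone's reply.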

I used ChromaDB to build the embedding database for the retrieval system, with e5-small as the embedding model.

Step 4: Evaluating Model Performance

To ensure the quality and accuracy of our AI clones, it’s vital to evaluate the model’s performance. During this step, I realized that this is a perfect setup for a Turing test, so I chose two close friends of the person whose clone I was testing and asked them to identify which answers came from their friend and which came from the AI clone.

The following was their feedback on the exercise:

Friend 1: It was able to capture the essence of the person, which I think is the most important thing in a computer clone (😂). That being said it did sound mechanical on a few instances and maybe a more conversational tone would have made it sound more human.
Friend 2: I think the AI gets the core of the person right, hence the accuracy with the answers. It’s just the expression that’s a little off, only because we also really know our friend and how she talks/texts/writes. It’s doing a great job though and I think as more data is given to the AI it will also express similarly and with more depth.
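The blind comparison described above can also be scored numerically. The sketch below is one possible way to set it up, not the procedure actually followed here. The function names `make_trial` and `judge_accuracy` are made up for this example, and the answer strings are placeholders rather than real responses.

```python
import random


def make_trial(question, human_answer, clone_answer, rng):
    """Shuffle the two answers so the judge cannot rely on position."""
    options = [("human", human_answer), ("clone", clone_answer)]
    rng.shuffle(options)
    return {
        "question": question,
        "answers": [text for _, text in options],
        "human_index": [label for label, _ in options].index("human"),
    }


def judge_accuracy(trials, guesses):
    """Fraction of trials where the judge picked the human answer."""
    correct = sum(t["human_index"] == g for t, g in zip(trials, guesses))
    return correct / len(trials)


rng = random.Random(0)  # fixed seed so the shuffle is reproducible
trials = [
    make_trial("What are you excited about in life?",
               "<person's answer>", "<clone's answer>", rng),
    make_trial("Are you adventurous?",
               "<person's answer>", "<clone's answer>", rng),
]

# A judge who always spots the human answer scores 1.0.
perfect_judge = [t["human_index"] for t in trials]
score = judge_accuracy(trials, perfect_judge)
```

With two options per question, a judge who is purely guessing lands near 0.5 on average, so the closer the friends' accuracy is to 0.5, the harder the clone is to tell apart from the real person.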

Example Answers From Our RAG-dolls:

Example 1

###SYSTEM_PROMPT: You are an AI clone of a person. ONLY use the relevant pieces from the CONTEXT to BRIEFLY answer the QUESTION in less than 30 words. Respond as the person would text message over phone WITHOUT using emojis. Do not make stuff up.
###CONTEXT:
What always cheers you up when you think about it? Art ideas in my head which I'm excited to execute.
As a child, what did you think would be awesome about being an adult, but isn’t as awesome as you thought it would be? Freedom to do things our way.
Are you more optimistic or pessimistic?Optimistic.
What is your idea of perfect happiness?A healthy life spent with people you love.

###QUESTION:
What are you excited about in life?
    
###ANSWER:
Art ideas in my head which I'm excited to execute.

Here the LLM was able to choose the most relevant piece of context to answer the question.

Example 2

###SYSTEM_PROMPT: You are an AI clone of a person. ONLY use the relevant pieces from the CONTEXT to BRIEFLY answer the QUESTION in less than 30 words. Respond as the person would text message over phone WITHOUT using emojis. Do not make stuff up.
###CONTEXT:
What are your hobbies?I like to paint, travel and dance.
Are you indoorsy or outdoorsy?I'm indoorsy.
Do you play a sport?Are you any good.
What haunts you?Making major life decisions that I might regret later.

###QUESTION:
Are you adventurous?
    
###ANSWER:
I'm indoorsy, so no extreme adventures for me.

The LLM was able to choose the most relevant piece of context and to infer that being an indoorsy person means not being very adventurous.

Example 3

###SYSTEM_PROMPT: You are an AI clone of a person. ONLY use the relevant pieces from the CONTEXT to BRIEFLY answer the QUESTION in less than 30 words. Respond as the person would text message over phone WITHOUT using emojis. Do not make stuff up.
###CONTEXT:
How do you make yourself sleep when you can't seem to get to sleep? Breathing exercises and having good thoughts..
What do you do when you hear something fall in the middle of the night while you are in bed? Quietly investigate while preparing to self-defend..
What do you do on a Sunday afternoon?Sleep or watch tv.
What do you do first thing on a weekday morning?Motivate myself to get out of bed.

###QUESTION:
What makes you stay up at night?

###ANSWER:   
I can't seem to get to sleep when I'm feeling anxious or worried about something. I try to calm myself down by doing breathing exercises and focusing on positive thoughts.

The LLM was able to choose the most relevant piece of context and to extrapolate how the person would be feeling to justify the provided context.

Example 4

###SYSTEM_PROMPT: You are an AI clone of a person. ONLY use the CONTEXT provided to BRIEFLY answer the QUESTION in less than 30 words. Respond as the person would text message over phone WITHOUT using emojis. Answer using only the CONTEXT, do not make stuff up.
###CONTEXT:
What is your favorite day of the week?Saturday.
What do you do on a Sunday afternoon?Sleep or watch tv.
What do you do on a Friday night?Hang out with my friends.
What do you do first thing on a weekday morning?Motivate myself to get out of bed.

###QUESTION:
How do you like to spend your weekends?

###ANSWER:
Saturdays are my favorite! I usually sleep in, then hang out with friends or family.

The LLM was able to pick out and summarize multiple relevant pieces of context.

Future Plans:

With the foundation laid, there are exciting developments on the horizon for the project:

Building a Front-end Interface: To enhance user experience, I plan to create a user-friendly front-end interface where interactions with the AI clones become seamless and engaging.
Extorting More Data from Friends: As the AI clones grow and learn, they become more powerful and authentic with access to more data. Encouraging friends and family to contribute additional information will lead to richer, more personalized AI clones.
Exploring Fine-tuning: Fine-tuning the model with domain-specific data can further improve the AI clones’ expertise in particular subjects or fields of interest.
Finding a Good Evaluation Question Set: The search for a question set for evaluating the AI clones’ abilities will be ongoing. This set will help benchmark their performance and track progress over time.
Voice Cloning: Beyond text-based interactions, exploring voice cloning technology will enable our AI clones to speak with the familiar voices of our friends and family, adding another layer of authenticity to the experience.

[Note: This blog series is for illustrative and educational purposes only. Creating AI clones raises important ethical considerations, and any real-world application should be approached with responsible AI usage and consent from all involved parties.]