Getting a Complete Grasp of LLMs and Pinecone
Pinecone and OpenAI can be combined to create a powerful and efficient search and recommendation system that serves users’ queries by leveraging the power of language models and vector similarity search. To achieve this, we can utilize OpenAI’s GPT-based models to generate embeddings for text data, such as documents, articles, or user-generated content. These embeddings can then be stored and indexed in Pinecone, a vector database optimized for high-dimensional similarity search. In today’s article we shall see how we can combine the two to make some exceptional AI-based use cases work.
What is Pinecone?
Pinecone is a vector database and search infrastructure designed to enable developers and data scientists to build large-scale, real-time machine learning applications. It allows users to manage, store, and search high-dimensional vectors efficiently. This helps in a wide range of applications, such as similarity search, recommendations, personalization, and clustering, among others.
Setting up Pinecone
After creating an account on Pinecone, you can create an index where you will be able to store vectors.
You will find your API key and environment in the API Keys section on the left; retrieve both values.
Now you can initialize the client in Python like so. Replace the key and environment with your own values, and we will be ready to save vectors to Pinecone.
import pinecone

def pinecone_api():
    # Initialize the Pinecone client with your API key and environment
    api_key = 'YOUR VALUE HERE'
    pinecone.init(
        api_key=api_key,
        environment="YOUR ENVIRONMENT HERE"
    )
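If you have not created an index from the Pinecone console, you can also create one from Python after initializing the client. This is a minimal sketch; the index name is a placeholder, and the dimension of 1536 assumes the text-embedding-ada-002 model we use later:

# Hypothetical index name; the dimension must match your embedding model
# (text-embedding-ada-002 produces 1536-dimensional vectors)
if 'orders-index' not in pinecone.list_indexes():
    pinecone.create_index('orders-index', dimension=1536, metric='cosine')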
What are Embeddings?
Embeddings (vectors) are created to represent data such as words, sentences, or images in a high-dimensional vector space, where each dimension of the vector corresponds to a feature of the data.
Embeddings are widely used in machine learning and natural language processing applications to improve the accuracy of models and reduce computational complexity.
Have you ever noticed that when you finish watching a movie you get recommendations for what you might like to watch next? This is done with embeddings in recommendation systems, where they are used to represent items and users in a high-dimensional vector space. By analyzing the distance and similarity between the embeddings of different items and users, the recommendation system can provide personalized recommendations.
We will be using an OpenAI pre-trained model that converts text to vectors. Let’s take a look at the following example:
import openai

def gpt3_embedding(content, engine='text-embedding-ada-002'):
    # Request an embedding for the given text and extract the vector
    response = openai.Embedding.create(input=content, engine=engine)
    vector = response['data'][0]['embedding']
    return vector
We use the text-embedding-ada-002 model to convert the given content to a vector, then extract only the embedding from the response.
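As a quick sanity check, here is a hypothetical call (the input text is made up, and we assume openai.api_key has already been set):

vector = gpt3_embedding("Customer 1 ordered three items.")
print(len(vector))  # text-embedding-ada-002 returns 1536-dimensional vectors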
How to save embeddings in the Pinecone database?
To store vectors in the Pinecone database, we need to create the following:
- ID
- Embedding
- Metadata
Metadata:
Metadata is additional information associated with each embedding that provides context about the data and helps facilitate efficient search and retrieval. We store metadata in Pinecone along with the embedding. Let’s take a look at how this is done.
Here we have data of orders from different customers in JSON. We assign an ID to each order, create its embedding, and pass the text of the order, which includes the customer’s name, address, items they bought, price, etc., as the metadata.
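For illustration, the orders list might look like this. The fields and values here are hypothetical; only order_id is assumed by the function below:

orders = [
    {
        "order_id": "order-1",
        "customer": "Customer 1",
        "address": "221B Baker Street",
        "items": ["keyboard", "mouse", "monitor"],
        "price": 240.00
    },
    # ... more orders
]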
import json

def meta_data_creation(orders):
    orders_embeddings = []
    orders_ids = []
    orders_metadata = []
    for order in orders:
        orders_ids.append(order['order_id'])
        # Serialize the order dictionary to text before embedding it
        order_text = json.dumps(order)
        embedding = gpt3_embedding(order_text)
        orders_embeddings.append(embedding)
        # Pinecone metadata values must be strings, numbers, booleans,
        # or lists of strings, so we store the serialized order text
        metadata = {
            "ordertext": order_text
        }
        orders_metadata.append(metadata)
    return orders_ids, orders_embeddings, orders_metadata
Upsert to Pinecone
Create an object of Pinecone’s Index class, then upsert the IDs, embeddings, and metadata as a list of tuples.
pinecone_index = pinecone.Index(PINECONE_INDEX_NAME)
orders_ids, orders_embeddings, orders_metadata = meta_data_creation(orders)
upsert_vectors = list(zip(orders_ids, orders_embeddings, orders_metadata))
pinecone_index.upsert(vectors=upsert_vectors)
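To confirm the upsert worked, you can inspect the index stats; the vector count should match the number of orders:

stats = pinecone_index.describe_index_stats()
print(stats)  # total_vector_count should equal len(orders)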
What is an LLM?
LLM stands for Large Language Model, which is a type of AI model designed to understand and generate human-like text.
These models are trained on massive amounts of text data from the internet or other sources, and they learn to recognize patterns, relationships, and context in the data. LLMs have demonstrated remarkable capabilities in various tasks, including:
- Text generation
- Text completion
- Translation
- Sentiment analysis
- Question answering
Some popular examples of LLMs include OpenAI’s GPT-3 (Generative Pre-trained Transformer 3), Google’s BERT (Bidirectional Encoder Representations from Transformers), and T5 (Text-to-Text Transfer Transformer).
User’s query
Let’s assume the user wants to know: “How many items did customer 1 buy?”
Create an embedding of the query
Take the user’s prompt, create its embedding using the gpt3_embedding function shown earlier, and send the query to Pinecone like so. Make sure to include the metadata in the results; Pinecone also returns a similarity score with each match by default.
user_query = "How many items did customer 1 buy?"
results = pinecone_index.query(vector=gpt3_embedding(user_query), top_k=5, include_metadata=True)
top_k refers to the top-k most similar results. In the above code, top_k=5, so we are requesting the 5 most similar items to the input query vector.
The similarity scores are based on the distance metric we chose when creating the Pinecone index. The scores help us understand the level of similarity between the query vector and the returned results. By setting a similarity threshold, as shown below, we can filter out results that are not similar enough to the query vector. This can improve the relevance and quality of the results and help reduce the number of false positives.
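A minimal sketch of such a filter; the threshold value here is an assumption and should be tuned for your data and distance metric:

SIMILARITY_THRESHOLD = 0.75  # hypothetical cutoff
relevant_matches = [m for m in results['matches'] if m['score'] >= SIMILARITY_THRESHOLD]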
Response from Pinecone
The query method returns a dictionary that contains a matches key. The matches value is a list of dictionaries, each containing the match’s id, its similarity score, and its metadata.
We can now extract the required content or text from the metadata and use it as context, along with the user’s query, to pass to an LLM, as sketched below.
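For example, using the ordertext metadata field we stored earlier and the filtered matches from above:

# Concatenate the stored order texts into one context string for the LLM
context = "\n".join(match['metadata']['ordertext'] for match in relevant_matches)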
Constructing a prompt for completion with prompt engineering
Construct a prompt, with prompt engineering, that includes the query along with the results obtained from Pinecone as context. Prompt engineering can make a significant difference in the results obtained from a language model, as it can help you obtain more precise or specific answers and even control the verbosity of the response.
GPTPrompt = ("Based on the context below, answer the following query: " + user_query +
             ". Answer as accurately as possible. You can find the answer in the following context: " + context)
Pass this prompt to an OpenAI completion model such as text-davinci-003. (gpt-3.5-turbo is a chat model and is called through the ChatCompletion endpoint instead; see the sketch after the code below.)
res = openai.Completion.create(
    engine="text-davinci-003",
    prompt=GPTPrompt,
    temperature=0,    # deterministic output
    max_tokens=1000,  # cap on the length of the completion
)
response = res['choices'][0]['text']
Get the response from the above OpenAI API call, which uses text-davinci-003 to answer the user’s query based on the nearest results retrieved from Pinecone.
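If you prefer gpt-3.5-turbo, a minimal sketch of the equivalent chat call looks like this, reusing the same GPTPrompt as the user message:

res = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": GPTPrompt}],
    temperature=0,
    max_tokens=1000,
)
response = res['choices'][0]['message']['content']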
Conclusion
In conclusion, using Pinecone and OpenAI GPT to create completions and answer user queries is a powerful tool that can enhance user experience and improve the efficiency of various tasks. With Pinecone’s fast and efficient search capabilities and OpenAI GPT’s advanced language processing abilities, users can create powerful applications that can understand natural language and provide accurate and relevant responses to user queries. By following the steps outlined in this article, users can start building their own custom AI applications and take advantage of the power of Pinecone and OpenAI GPT. So why wait? Start exploring the possibilities today and see how these tools can transform your application development process!