Building a Query Engine with Pinecone and LangChain: A Comprehensive Guide
In the vast realm of text-based applications, the ability to efficiently search and retrieve relevant information is crucial. Imagine a scenario where you have a vast collection of documents and you need to find the most relevant ones based on a user query. This is where query engines come into play, enabling us to navigate through large volumes of text data with remarkable speed and accuracy.
In this comprehensive guide, we will embark on an exciting journey to build our very own query engine using two cutting-edge technologies: Pinecone and LangChain. Pinecone empowers us with its state-of-the-art similarity search capabilities, allowing us to effortlessly find documents that closely match a given query. On the other hand, LangChain equips us with advanced text generation techniques, enabling our query engine to generate meaningful and context-aware responses.
Introducing Pinecone: Understanding the Concept of Similarity Search
In the world of data-driven applications, the ability to quickly and accurately search for similar items is crucial. This is where Pinecone, a managed vector database built for similarity search, comes in. Similarity search is all about finding items that share commonalities with a given query. Rather than relying on traditional methods like keyword matching, Pinecone stores data as high-dimensional vectors (embeddings). These vectors capture the essence of each item's characteristics, making it efficient to compare and retrieve items based on how similar their vectors are. Understanding this concept lets us harness Pinecone to enhance search capabilities across a variety of applications.
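To make this concrete, here is a minimal sketch (plain NumPy, with tiny made-up vectors standing in for real embeddings) of how cosine similarity scores the closeness of two items:

import numpy as np

# Toy 3-dimensional "embeddings"; real embedding models produce far more
# dimensions (e.g. 1536 for OpenAI's text-embedding-ada-002).
query_vec = np.array([0.9, 0.1, 0.3])
doc_vec = np.array([0.8, 0.2, 0.25])

# Cosine similarity: closer to 1.0 means the vectors point in a similar direction.
similarity = np.dot(query_vec, doc_vec) / (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec))
print(f"Cosine similarity: {similarity:.3f}")

Pinecone performs this kind of comparison at scale, across large collections of stored vectors, and returns the closest matches to a query vector.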
Introducing LangChain: Bridging the Gap between Data Retrieval and Natural Language Generation
LangChain significantly enhances the query engine's capabilities by adding advanced text generation and language processing features. While Pinecone excels at similarity search, retrieving relevant results based on vectorized data, LangChain takes things a step further. With LangChain, we can generate meaningful and contextually relevant text responses that go beyond simple keyword matching. By utilizing models like ChatOpenAI, LangChain enables natural language understanding and generation, allowing for more interactive and informative search experiences. It bridges the gap between data retrieval and human-like responses, resulting in a more comprehensive and engaging user experience.
Setting Up the Foundations: Importing Packages and Configuring Environment Variables
To utilize Pinecone’s capabilities, we need to install and configure the Pinecone Python SDK. This involves installing the necessary packages and libraries, setting up the API key and environment variables, and ensuring compatibility with the desired programming environment.
!pip install openai langchain llama_index==0.6.23
!pip install pinecone-client
!pip install transformers
Now we can import the necessary packages and set up the API key and environment variables needed for Pinecone and OpenAI:
import os
import requests
import openai
import pinecone
from pathlib import Path
# Set your Pinecone API key and environment as environment variables.
api_key = "your-api-key-here"
environment = "your-environment-here"
os.environ['PINECONE_API_KEY'] = api_key
# Set your OpenAI API key as an environment variable.
os.environ['OPENAI_API_KEY'] = "your-api-key-here"
openai.organization = "your-org-here"
openai.api_key = os.getenv("OPENAI_API_KEY")
Initializing and Creating a Pinecone Index for Efficient Data Retrieval:
First, we must initialize Pinecone. Once initialized, we create a new index with specific parameters. In this case, we set the dimension of the index to 1536, which matches the output size of OpenAI's text-embedding-ada-002 embedding model, and use cosine similarity as the metric for comparing vectors. Configuring the index this way keeps our data retrieval operations fast and accurate.
from llama_index.vector_stores import PineconeVectorStore
from llama_index.storage.storage_context import StorageContext
from llama_index import (
    LLMPredictor,
    ServiceContext,
    PromptHelper,
    load_index_from_storage,
    download_loader,
    VectorStoreIndex,
    GPTVectorStoreIndex,
    SimpleDirectoryReader
)

api_key = "your-api-key-here"
environment = "your-environment-here"
index_name = "pinecone-tutorial"

pinecone.init(api_key=api_key, environment=environment)

if index_name not in pinecone.list_indexes():
    pinecone.create_index(name=index_name, dimension=1536, metric="cosine")

index = pinecone.Index(index_name)
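As an optional sanity check (not part of the core flow), you can list your indexes and ask the new index for its stats to confirm it was created and see how many vectors it currently holds:

# Optional sanity check: confirm the index exists and inspect its contents.
print(pinecone.list_indexes())
print(index.describe_index_stats())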
Now that we have created our index, we can vectorize our documents into it using the GPTVectorStoreIndex:
file_names = os.listdir('/content/pdf')

# Dictionary to store the indices
indices_dict = {}

# In this particular case we load data from a directory of PDF files,
# so we use the PDFReader loader (downloaded once, outside the loop).
PDFReader = download_loader("PDFReader")
loader = PDFReader()

for file_name in file_names:
    # Get the document ID by removing the file extension
    document_id = os.path.splitext(file_name)[0]
    # Use document_id as the Pinecone title
    pinecone_title = document_id

    metadata_filters = {"name": document_id}  # Replace with appropriate metadata filters
    vector_store = PineconeVectorStore(
        index_name=index_name,
        environment=environment,
        metadata_filters=metadata_filters
    )
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    pdf_path = Path('/content/pdf') / file_name
    try:
        aa_docs = loader.load_data(file=pdf_path)
        print(f"Loaded document from {pdf_path}")
    except Exception as e:
        # Skip any file that raises an error while loading.
        print(f"Error reading PDF file: {pdf_path}. Skipping... Error: {e}")
        continue

    # Create the GPTVectorStoreIndex from the documents
    indices_dict[document_id] = GPTVectorStoreIndex.from_documents(aa_docs, storage_context=storage_context)
    indices_dict[document_id].index_struct.index_id = pinecone_title
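With the indices built, you can already query any one of them directly through llama_index, even before wiring in LangChain. Here is a minimal sketch, where the document ID and the question are placeholders for your own data:

# Minimal sketch: query one of the newly built indices directly.
# "some-document" and the question below are placeholders.
query_engine = indices_dict["some-document"].as_query_engine()
response = query_engine.query("What is this document about?")
print(response)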
Enhancing Query Engine Capabilities with Advanced Text Embeddings and Intelligent Query Generation Using LangChain:
By leveraging LangChain’s advanced language models, such as ChatOpenAI, the query engine can generate natural and coherent text that effectively addresses user queries. With LangChain, the possibilities for enhancing the query engine’s capabilities are virtually limitless, enabling more meaningful interactions and improved user satisfaction.
Below we define a data-querying function that takes the input text as its parameter:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# This will allow us to query a response without having to load files repeatedly.
def data_querying(input_text):
    # We must reinitialize Pinecone in order to load our previously created index.
    api_key = "your-api-key-here"
    environment = "your-environment-here"
    os.environ['PINECONE_API_KEY'] = api_key
    index_name = "pinecone-tutorial"
    pinecone.init(api_key=api_key, environment=environment)
Once we have reinitialized Pinecone and loaded our index, we create a model to build our embeddings. In this case we use OpenAI's "text-embedding-ada-002" model. There are free alternatives as well; "all-MiniLM-L6-v2" is a good one (a sketch of that option appears after the function below).
    model_name = 'text-embedding-ada-002'
    embed = OpenAIEmbeddings(
        model=model_name,
        openai_api_key=os.getenv("OPENAI_API_KEY")
    )

    # This text field is the metadata field that the text contents of your documents are stored in.
    text_field = "text"

    # Load the Pinecone index for LangChain.
    index = pinecone.Index(index_name)
    vectorstore = Pinecone(
        index, embed.embed_query, text_field
    )

    # Quick check: query the vectorized data directly for the most relevant documents.
    vectorstore.similarity_search(
        input_text,  # our search query
        k=3  # return the 3 most relevant docs
    )

    # Using LangChain, we pass in our model for text generation.
    llm = ChatOpenAI(temperature=0.5, model_name="gpt-3.5-turbo", max_tokens=512)
    qa = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=vectorstore.as_retriever()
    )

    # Finally, we return the result of the search query.
    result = qa.run(input_text)
    print(result)
    return result
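As mentioned above, "all-MiniLM-L6-v2" is a free alternative to OpenAI's embedding API. A rough sketch of how it could be swapped in via LangChain's HuggingFaceEmbeddings wrapper is shown below; note that this model produces 384-dimensional vectors, so the Pinecone index would need to be created with dimension=384 instead of 1536, and the sentence-transformers package must be installed:

from langchain.embeddings import HuggingFaceEmbeddings

# Free, local alternative to text-embedding-ada-002 (requires sentence-transformers).
# all-MiniLM-L6-v2 outputs 384-dimensional vectors, so the Pinecone index
# would need dimension=384 rather than 1536.
embed = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector = embed.embed_query("example query")
print(len(vector))  # 384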
After creating the query function, you can create a basic interface to interact with it using Gradio. See example below.
import gradio as gr

# Create your Gradio interface.
iface = gr.Interface(fn=data_querying,
                     inputs=gr.inputs.Textbox(lines=7, label="Enter your text"),
                     outputs="text",
                     title="Test Model RENE")

# Launch the Gradio app.
iface.launch(share=True)
Once you launch your Gradio app, you have a fully working query engine that can search and retrieve relevant information quickly. Congratulations!
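If you want to exercise the pipeline without the UI, you can also call the function directly; the question below is just an illustrative placeholder:

# Quick test without the Gradio UI; replace the question with your own.
print(data_querying("What topics do the indexed documents cover?"))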
Conclusion
In closing, this guide provides a thorough blueprint for constructing a sophisticated query engine utilizing Pinecone and LangChain. By harnessing the power of Pinecone’s vector-based methodology and LangChain’s advanced language models, we can engineer a system that navigates extensive text data with efficiency and generates contextually relevant responses. The combination of data retrieval and natural language generation technologies opens up a myriad of possibilities, ranging from document retrieval systems to intelligent chatbots, emphasizing the transformative potential of these cutting-edge tools.