Building your own chat interface to your data WITHOUT the OpenAI API

3 min read · May 4, 2023

What you can do with OpenAI’s models is fascinating. Tools such as LangChain and llama-index also make it really easy to get a basic ChatGPT-like system up and running in a few lines of code. However, most of the examples build on OpenAI’s APIs, which is not practical in all cases, e.g., because you cannot send your data to the cloud or cannot spend the money.

Query your system in a ChatGPT-like way on your own data and without OpenAI.

I recently tried to get a basic system running using Databricks’ Dolly, and it took a little bit of trial and error. So here is my quick tutorial on how to use a custom, open LLM to build a chat interface to your own data!

Step 1: Collect your data and install dependencies

For this step, just collect all the data you want to use and place it into a directory on your local machine. In my case, these were a bunch of markdown files I pulled from the docs of our data curation tool Spotlight (check it out too ;-)).
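If your files are spread over nested subfolders, it can help to copy them into one flat directory first, because the indexing code in Step 2 lists the input folder with a non-recursive glob. Here is a minimal sketch of that idea using only the standard library. The function name `collect_markdown` and the example paths are my own placeholders, and it assumes the target folder lies outside the source tree.

```python
import shutil
from pathlib import Path


def collect_markdown(source_root, target_dir):
    """Copy every .md file under source_root into one flat target_dir."""
    source_root, target_dir = Path(source_root), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source_root.rglob("*.md")):
        # Encode the relative path in the file name so that files with the
        # same name in different subfolders do not overwrite each other.
        flat_name = "__".join(path.relative_to(source_root).parts)
        destination = target_dir / flat_name
        shutil.copy2(path, destination)
        copied.append(destination)
    return copied


# Hypothetical usage (both paths are placeholders):
# collect_markdown("path/to/docs/checkout", "path/to/your/data/folder")
```

The same pattern works for other file types by changing the glob pattern.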

Next, install everything you need:

pip install torch transformers accelerate sentence-transformers langchain llama-index==0.6.0.alpha3

Step 2: Define your system and build index

Copy the following code and adjust the path to your input folder. It uses a Hugging Face embedding model (through LangChain’s HuggingFaceEmbeddings wrapper) to generate embeddings for retrieval and Databricks’ Dolly to generate the final output.

from pathlib import Path
import torch
from transformers import pipeline
from langchain.llms.base import LLM
from llama_index import (
    GPTVectorStoreIndex,
    LangchainEmbedding,
    LLMPredictor,
    PromptHelper,
    ServiceContext,
    SimpleDirectoryReader,
)
from llama_index.langchain_helpers.text_splitter import TokenTextSplitter
from llama_index.node_parser.simple import SimpleNodeParser
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

INPUT_FOLDER = "path/to/your/data/folder"

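# Only index regular files; subdirectories returned by glob() are skipped
index_files = [path for path in Path(INPUT_FOLDER).glob("*") if path.is_file()]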

# Prompt budget: total context size, tokens reserved for the answer, and chunk overlap
max_input_size = 2048
num_output = 256
max_chunk_overlap = 20
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

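# Dolly generates the answers; trust_remote_code loads the model's own instruction pipeline
pipe = pipeline("text-generation", model="databricks/dolly-v2-3b", trust_remote_code=True, torch_dtype=torch.bfloat16, device_map="auto")
# Embedding model used to retrieve the chunks that are relevant to a question
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())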

class CustomLLM(LLM):
    model_name = "databricks/dolly-v2-3b"

    def _call(self, prompt, stop = None):
        response = pipe(prompt, max_new_tokens=num_output)[0]["generated_text"]
        return response

    @property
    def _identifying_params(self):
        return {"name_of_model": self.model_name}

    @property
    def _llm_type(self):
        return "custom"

# define our LLM
llm_predictor = LLMPredictor(llm=CustomLLM())

node_parser = SimpleNodeParser(text_splitter=TokenTextSplitter(chunk_size=512, chunk_overlap=max_chunk_overlap))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, embed_model=embed_model, prompt_helper=prompt_helper, node_parser=node_parser, chunk_size_limit=512)

# Load your data
documents = SimpleDirectoryReader(input_files=index_files).load_data()

index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)

query_engine = index.as_query_engine()

Run the code to build your document index.
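To get a feeling for what chunk_size=512 and chunk_overlap=20 mean, here is a simplified sketch of overlapping chunking. It is not the TokenTextSplitter implementation: the function `chunk_tokens` is my own illustration, and whitespace-separated words stand in for real tokenizer output.

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=20):
    """Split a token list into windows that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks


# Whitespace "tokens" stand in for real tokenizer output (an assumption).
words = ("word " * 1200).split()
chunks = chunk_tokens(words)
print([len(c) for c in chunks])  # [512, 512, 216]
```

With max_input_size = 2048 and num_output = 256, roughly 2048 - 256 = 1792 tokens remain for the prompt template, the question, and the retrieved chunks, so only a few 512-token chunks fit into a single prompt.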

Step 3: Use your system

We are already done! You can now use the query engine object to ask questions about your data!
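Conceptually, a query does three things: embed the question, look up the most similar chunks in the index, and hand those chunks to the LLM together with the question. The toy sketch below illustrates that flow with word counts instead of a real embedding model. The functions `embed`, `cosine`, `retrieve`, and `build_prompt` as well as the prompt wording are my own placeholders, not what llama-index uses internally.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lower-cased word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks that are most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]


def build_prompt(question, context_chunks):
    """Combine retrieved context and question into one prompt (hypothetical template)."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


docs = [
    "the similarity map groups similar designs",
    "install the package with pip",
    "the histogram shows value counts",
]
question = "what does the similarity map do"
print(build_prompt(question, retrieve(question, docs, top_k=1)))
```

In the real system, the embedding model replaces `embed`, the vector index replaces the sorted list, and Dolly receives the final prompt.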

An example prompt I tried on our docs is something like this:

print(query_engine.query("Summarize typical use cases of the similarity map in few sentences."))

The response here is:

The Similarity Map helps to explore simulation similarities and find explanations for observable phenomens.
It can be used to group designs by similarity, find outliers and detect correlations between features and target values.

This is fairly accurate, although the answer is a little too specific to one concrete use case. Note that selecting the right data to feed into the system remains a real challenge. If you want to learn more about this topic, feel free to get in touch, e.g., via our solutions page, or just check out our free data curation tool Spotlight.
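If you want something closer to a chat session than single print calls, a small loop around the query function is enough for a first version. This is a minimal sketch under my own assumptions: the function `chat` and its parameters are hypothetical, and every question is answered independently, so there is no conversation memory.

```python
def chat(ask, read_input=input, write=print):
    """Read questions until the user types 'exit' or 'quit'; return the transcript."""
    transcript = []
    while True:
        try:
            question = read_input("You: ").strip()
        except EOFError:
            break  # end of input, e.g. Ctrl-D in a terminal
        if question.lower() in {"exit", "quit"}:
            break
        if not question:
            continue  # ignore empty lines
        answer = str(ask(question))
        write(f"Bot: {answer}")
        transcript.append((question, answer))
    return transcript


# Usage with the query engine built in Step 2:
# chat(query_engine.query)
```

Passing the query function in as an argument keeps the loop independent of the model, so you can try it with any callable that maps a question to an answer.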

Written by Daniel Klitzke