DATA STORIES | ARTIFICIAL INTELLIGENCE | KNIME ANALYTICS PLATFORM

Creating a Local LLM Vector Store from PDFs with KNIME and GPT4All

A fully no-code solution!

Markus Lauber
Low Code for Data Science


Photo by Christopher Burns on Unsplash.

KNIME is constantly adapting and integrating AI and Large Language Models in its software. Starting with KNIME 5.2 it is possible to use local GPT4All LLMs to create your own vector store from your own documents (like PDFs) and interact with them on your local machine. I will talk about some findings at the end of the article.

You can also read my initial article: “KNIME, AI Extension and local Large Language Models (LLM)”.

I adapted a workflow from a KNIME AI Learnathon (which used ChatGPT) to demonstrate the possibilities of local LLMs and vector stores:
“Create your own LLM Vector Store (from your documents) with GPT4All local models”:

Overview of a KNIME workflow to use local LLMs
KNIME Workflow to create and use a GPT4All LLM and a local Vector Store from your own Document (PDF) (https://forum.knime.com/t/gpt4all-embeddings/75594/5?u=mlauber71).

You will have to install GPT4All and the KNIME AI Extension. If you experience problems, please also refer to “GPT4All Installation behind a Firewall” of this article. Also, as of Q1/2024 there is a bug when using GPT4All offline or behind a firewall (which is what this is all about): you will have to make some adaptations in some code.

A manual for a Coffee Machine

The document to explore is the user manual for a coffee machine, a 24-page PDF. The file will be converted into a KNIME document type for text analytics and ‘fed’ to a (local) GPT4All model.

A) Create Knowledge Base

You convert your PDF after some cleaning into a local Vector Store with the help of the FAISS Vector Store Creator node and a GPT4All embedding model:

KNIME Workflow to create a GPT4All LLM Vector Store
Create a Vector Store (https://hub.knime.com/-/spaces/-/~RgLTaML-8RjQVBfi/current-state/).

Step 1: Download the GPT4All models you want to use

Place your GPT4All models (.gguf extension) in the ../gpt4all_models/ folder. [Download them at: https://gpt4all.io/index.html]. Please note which ones are OK for commercial use!

Step 2: Select your GPT4All model

Select the model (.gguf) you want to use in the component. The name will automatically be added to the text.

KNIME component to select a GPT4All model
Select your GPT4All model in the component.

Step 3: Divide PDF text into sentences

Search for the Sentence Extractor node, drag and drop it into the workflow, and execute it on the column “Document” from the PDF Parser node. This will split the document cell into multiple rows: one row for each sentence. Then use a Row Filter node to remove all sentences with fewer than 5 terms. You can experiment with additional text preparations.
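
For readers who like to see the idea outside of KNIME, here is a minimal Python sketch of this step. It is not what the Sentence Extractor or Row Filter nodes do internally; the regular expression, the function names and the sample text are my own assumptions for illustration.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def keep_long_sentences(sentences: list[str], min_terms: int = 5) -> list[str]:
    """Drop sentences with fewer than `min_terms` whitespace-separated terms."""
    return [s for s in sentences if len(s.split()) >= min_terms]


# Made-up sample text, not taken from the actual manual
page_text = (
    "Warning! Fill the water tank with fresh cold water before first use. "
    "See page 12. Descale the machine when the orange light starts to blink."
)

sentences = split_sentences(page_text)
kept = keep_long_sentences(sentences)

for sentence in kept:
    print(sentence)
```

Short fragments such as “Warning!” or “See page 12.” carry little meaning on their own, which is why filtering them out before embedding tends to be a reasonable default.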

Step 4: Create the vector store

Search for the FAISS Vector Store Creator node, drag and drop it into the workflow, and connect it to the Embeddings4All Connector and to the output that holds your sentence strings. Execute the node on the column with the strings to create the vector store. You can either download a suitable embedding model yourself (currently: all-MiniLM-L6-v2-f16.gguf) or let the node do the work.
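
Conceptually, a vector store is just a list of texts, each paired with a numeric vector. The sketch below illustrates that idea in plain Python. Please note the assumptions: the `embed` function is a toy hashed bag-of-words and only a stand-in for a real embedding model such as all-MiniLM-L6-v2, and the list of dictionaries is a stand-in for a FAISS index. None of this is the code behind the KNIME nodes.

```python
import math
import re
import zlib

DIM = 64  # toy dimensionality (assumption); real embedding models use several hundred


def embed(text: str) -> list[float]:
    """Toy embedding: hashed bag-of-words, scaled to unit length."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = zlib.crc32(token.encode("utf-8")) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_store(sentences: list[str]) -> list[dict]:
    """One entry per sentence: the original text plus its vector."""
    return [{"text": s, "vector": embed(s)} for s in sentences]


# Made-up sentences for illustration
store = build_store([
    "Fill the water tank with fresh cold water before first use.",
    "Descale the machine when the orange light starts to blink.",
])
print(len(store), "entries with", len(store[0]["vector"]), "dimensions each")
```

A real embedding model places sentences with similar meaning close together, which a word-hashing toy like this cannot do. The structure of the store, however, is the same.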

Step 5: Save the vector store

Save the vector store by adding a Model Writer node. To save it properly, you can use a relative path and specify the name of the vector store such as “vector_store.model”.

You have now successfully created a local Vector Store.

B) Use an LLM for a Completion Task via the Knowledge Base

Now you can ask questions in a ‘batch’ mode, that is you can send a lot of them to the model using the same basic prompt:

KNIME workflow to query a local LLM model
Place a set of questions to the model and your vector store (https://hub.knime.com/-/spaces/-/~RgLTaML-8RjQVBfi/current-state/).

Step 1: Load the selected GPT4All model

Load the model into the GPT4All Chat Model Connector. Here you can use the Flow Variable from the left side.

Step 2: Adopt the Knowledge Base from the Vector Store

The Model Reader node reads the vector store you created in part A) from the workflow data area. The Vector Store Retriever will then try to find 15 relevant documents to add to the prompt you pose to the Large Language Model:

Tell the Vector Store Retriever how many documents to find for your question.
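
What “finding the k most relevant documents” means can be sketched in a few lines of Python. This is an illustration under assumptions: I use cosine similarity as the measure (the retriever node may well use a different distance), and the three-dimensional vectors are hand-made instead of coming from an embedding model.

```python
import heapq
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(store: list[dict], query_vector: list[float], k: int = 15) -> list[str]:
    """Return up to k texts from the store, most similar first."""
    ranked = heapq.nlargest(k, store, key=lambda e: cosine(e["vector"], query_vector))
    return [entry["text"] for entry in ranked]


# Hand-made vectors and made-up sentences, purely for illustration
store = [
    {"text": "Descale the machine every three months.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Use only fresh cold water in the tank.", "vector": [0.1, 0.9, 0.0]},
    {"text": "Clean the milk frother after each use.", "vector": [0.0, 0.2, 0.9]},
]
query_vector = [1.0, 0.2, 0.0]  # pretend embedding of a question about descaling

top = retrieve(store, query_vector, k=2)
print(top)
```

The parameter `k` plays the role of the number of documents you set in the node: a larger value gives the model more context but also more noise.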

Step 3: Check and, if necessary, edit the prompt

You might want to edit the information and instructions wrapped around the question you want to ask. This might be a good place to test the effects of different prompts.

Prompt engineering in a KNIME workflow
Do some prompt engineering and also add the additional information (adapted from: https://hub.knime.com/-/spaces/-/~WNe6bb2w2bemYBWE/current-state/).
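
The ‘wrapper’ around the question is essentially string concatenation: a role, the retrieved text snippets and the question itself. Here is a hedged Python sketch of that idea. The wording of the template and the function name `build_prompt` are my own assumptions, not the exact prompt used in the workflow.

```python
def build_prompt(role: str, documents: list[str], question: str) -> str:
    """Combine a role, retrieved snippets and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        f"{role}\n\n"
        f"Use only the following excerpts from the manual:\n{context}\n\n"
        f"Question: {question}\n"
        "If the excerpts do not contain the answer, say so.\n"
        "Answer:"
    )


# Made-up snippets for illustration
prompt = build_prompt(
    role="You are a helpful assistant answering questions about a coffee machine.",
    documents=["Descale the machine every three months.", "Use only fresh cold water."],
    question="How often should I descale?",
)
print(prompt)
```

Changing only the template while keeping questions and documents fixed is a simple way to compare the effect of different prompts.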

Step 4: Connect to the Vector Store Retriever and the LLM Prompter

Run the questions through the model and the Vector Store.

Drag in the Vector Store Retriever node and the LLM Prompter node, and add a String Manipulation node in between for prompt engineering.

Step 5: Export Results to an Excel File

You can save the table via an Excel Writer node. Optionally, you can use a Table View node to compare the answers from the LLM with the reference answers we imported along with the questions.
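
As a rough Python equivalent of this export step, the sketch below writes questions, reference answers and model answers side by side. The assumptions: a CSV written with the standard library stands in for the Excel file, `ask_model` is a hypothetical placeholder for the actual LLM call, and the questions and reference answers are made up.

```python
import csv
import io


def ask_model(question: str) -> str:
    """Hypothetical stand-in for the LLM call; returns a canned reply."""
    return f"(model answer to: {question})"


# Made-up questions and reference answers for illustration
questions = [
    {"question": "How often should I descale?", "reference": "Every three months."},
    {"question": "Can I use hot water in the tank?", "reference": "No, cold water only."},
]

# In-memory buffer; swap for open("answers.csv", "w", newline="") to write a file
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["question", "reference", "answer"])
writer.writeheader()
for row in questions:
    writer.writerow({**row, "answer": ask_model(row["question"])})

# Read the table back to compare reference and model answers row by row
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
for row in rows:
    print(row["question"], "|", row["reference"], "|", row["answer"])
```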

Step 6: Inspect and compare the results.

C) ‘Live’ Chat with your GPT4All model using your Vector Store

In addition to the batch mode, I created a KNIME Component in which you can chat ‘live’ with the model and the information you provided from the coffee machine manual. The component will take your question, select suitable documents from the vector store and then give you the answer:

A chat window from a KNIME component asking ‘live’ questions based on an LLM and information from a local Vector Store (based on a PDF)
A live chat with the model based on your PDF’s data (https://hub.knime.com/-/spaces/-/~RgLTaML-8RjQVBfi/current-state/).

The initial task/role is only provided once, though you can change it. You can also edit the prompt within the component if you want.

Step 1: Load Model and Vector Store like before

Load the model into the GPT4All Chat Model Connector. Here you can use the Flow Variable from the left side where you selected the model.

Step 2: Select the initial Role for the prompt and the number of documents to be searched

Besides your precise question, a role should be defined, along with some additional instructions.

Inside the component, the answers are stored in a KNIME table and reloaded, so your conversation is saved and shown in your chat window.

Currently the chat does *not* refer to items that have already been discussed (as it would with a live connection to ChatGPT). But the upside is that the conversation happens entirely on your own machine, without the data being sent to the internet.
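
The behaviour described here (the conversation is stored and displayed, but not fed back to the model) can be sketched in Python as follows. This is an illustration only: `fake_llm` is a hypothetical stand-in for the local model, the retrieval step is left out for brevity, and a list of dictionaries takes the place of the KNIME table.

```python
def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the local model."""
    return f"[answer based on a prompt of {len(prompt)} characters]"


def chat_turn(history: list[dict], role: str, question: str, llm=fake_llm) -> list[dict]:
    """Answer one question and return the extended conversation table."""
    # The prompt is built from the role and the current question only;
    # earlier turns in `history` are deliberately not included.
    prompt = f"{role}\n\nQuestion: {question}\nAnswer:"
    answer = llm(prompt)
    return history + [{"question": question, "answer": answer}]


role = "You answer questions about a coffee machine manual."
history: list[dict] = []
history = chat_turn(history, role, "How do I descale the machine?")
history = chat_turn(history, role, "And how long does that take?")

# The stored table is what the chat window displays
for turn in history:
    print("You:  ", turn["question"])
    print("Model:", turn["answer"])
```

In this setup a follow-up such as “And how long does that take?” reaches the model without the earlier question, so it is better to phrase every question so that it stands on its own.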

Even if you cannot process very large amounts of data (depending on the power of your machine), you might be able to test prompts and see whether your vector store works.

Just start playing with LLMs and KNIME :-)

Some Observations on using the local Chat

I would also like to share some initial observations while talking to the coffee machine manual. They might (or might not) be relevant to other such approaches:

  • Domain knowledge still is a thing. It seems to work best if you already know about coffee and coffee machines and ask specific questions
  • The setup can still sometimes give false answers or make things up. It seems to help to refer to the coffee machine explicitly (although the ‘wrapper’ around the question already does this). It might be good to experiment with the prompts
  • What does not work well are ‘negative’ questions or questions about things that are not in the manual. I asked the setup whether the machine can also make tea, and the answer was somewhat evasive: you can boil water, but it did not say that this might not be the best idea. The model seems reluctant to say no. This may be something to add to the prompt
  • You should focus on what you expect to be in the document. I am not sure how well it works to combine it with the general content of the model
  • You should experiment with limiting or expanding the number of (additional) documents provided. Sometimes a smaller number might be better. Also the documents it can find and link will depend on the question you ask (if it can find texts/documents linked to your input)
  • When creating KNIME document types (which you use to build the vector store), you can add metadata such as author, page number, title and so on. It might be useful to add that to the answers to have some reference
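
To make the last observation about metadata a bit more concrete, here is a small Python sketch of how page references could be attached to an answer. Everything in it is an assumption for illustration: the `Chunk` class, the `with_sources` helper and the sample data are not part of the workflow.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A retrieved text snippet together with its (hypothetical) metadata."""
    text: str
    title: str
    page: int


def with_sources(answer: str, chunks: list[Chunk]) -> str:
    """Append a de-duplicated, sorted source list to an answer."""
    pages = sorted({(c.title, c.page) for c in chunks})
    refs = "; ".join(f"{title}, p. {page}" for title, page in pages)
    return f"{answer}\n\nSources: {refs}" if refs else answer


# Made-up snippets and page numbers
retrieved = [
    Chunk("Descale the machine every three months.", "User Manual", 14),
    Chunk("Use the descaling agent supplied.", "User Manual", 14),
    Chunk("Rinse the tank after descaling.", "User Manual", 15),
]
result = with_sources("Descale roughly every three months.", retrieved)
print(result)
```

With references like these, a reader can check an answer against the original page instead of having to trust the model.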

So (again), please be aware that this is a language model trained to sum things up and, to some extent, to please the user. It is not some sort of general intelligence.

Note: this article has been edited to describe more precisely the use of embedding models when creating vector stores.

If you enjoyed this article you can follow me on Medium (https://medium.com/@mlxl) and on the KNIME forum (https://forum.knime.com/u/mlauber71/summary) and hub (https://hub.knime.com/mlauber71).

Markus Lauber
Low Code for Data Science

Senior Data Scientist working with KNIME, Python, R and Big Data Systems in the telco industry