Ollama and Llama3 — A Streamlit App to convert your files into local Vector Stores and chat with them using the latest LLMs
Chatting with the content of your files like PDFs, CSVs or plain text is one of the latest hypes of the LLM wave (just check out all the articles on Medium). To demonstrate how to do this locally with the latest models like Llama3 or Mistral, I put together a Streamlit app in Python that uses Ollama to convert PDFs, CSVs and plain text documents into vector stores, which are then fed to the model. I tried both Chroma and Meta's FAISS as vector stores.
An Overview of what the Streamlit App does
The interface should be pretty self-explanatory: you provide an instruction, choose your embedding model (to process the files) and then the LLM itself.
You can choose between three types of files
You can select one or more documents of one type. Keep in mind they will be processed locally and the app is only there as a demonstration, so if you intend to crash it, you might well succeed :-)
- PDFs — the text will be extracted. You can opt to chunk the text into portions and define an overlap, so the context around a match has a chance to be included in both directions
- CSVs — by default they will be imported in a structured way, one row per document. From experience, the models can detect, for example, that you have several databases and make connections between them. You might have to increase the number of documents to be retrieved
- Text — these files can be pretty unstructured and each file becomes one document. This can help when working with log files (a short loading sketch follows this list)
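To give an idea of the loading and chunking step, here is a minimal sketch assuming the LangChain community loaders; the file names, chunk size and overlap are only illustrative and the app's exact code may differ:

```python
# Minimal sketch, not the app's exact code: load the three file types and
# chunk the PDF text with an overlap. File names and sizes are just examples.
from langchain_community.document_loaders import PyPDFLoader, CSVLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# PDFs: extract the text and split it into overlapping chunks
pdf_docs = PyPDFLoader("report.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(pdf_docs)

# CSVs: one document per row, column names are kept in the text
csv_docs = CSVLoader(file_path="sales.csv").load()

# Plain text or log files: one document per file
txt_docs = TextLoader("server.log").load()
```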
Not everything has to go into a vector store. For CSV files or databases you might just use SQL or let a model write the SQL code. For log files there might be dedicated log parsers, or you can use regular expressions (or let the LLM write the regex).
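For the SQL route, a quick way to try it without any vector store is to load the CSV into an in-memory SQLite database; the file name and the query below are just examples:

```python
# Minimal sketch of the SQL alternative for CSV data, using pandas and the
# standard library only; the file name and the query are just examples.
import sqlite3
import pandas as pd

con = sqlite3.connect(":memory:")
pd.read_csv("sales.csv").to_sql("sales", con, index=False)

# Write the query yourself or let an LLM generate it from a natural-language question
result = pd.read_sql_query("SELECT region, SUM(amount) AS total FROM sales GROUP BY region", con)
print(result)
```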
Obviously, since this is running on a local machine, the power will be limited. Apple Silicon (M1, M2, … processors) seems to do quite well. But then, this whole setup is there to keep your data private and not send it to the cloud. If you have access to a powerful LLM like ChatGPT, it will most likely beat this little machine.
But the initial results with Mistral or Llama3 are quite encouraging, considering you are not using a gigantic fleet of GPUs but maybe just your laptop's CPU.
Save the Vector Stores locally
You will only have to process the files once. The results will be stored in a local SQLite database or other local files and can be reused later just by providing the path; the app will then pick up the stored information and use it. You can choose how many documents to retrieve at once: more documents mean more information, but also more overhead and processing.
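With FAISS, for example, saving a store and later reloading it by path can look like this minimal sketch (reusing the `chunks` from the loading sketch above; the path, embedding model and `k` are only illustrative):

```python
# Minimal sketch: build a FAISS store once, save it locally and reload it later.
# Path, embedding model and k are illustrative, not the app's fixed settings.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OllamaEmbeddings(model="mxbai-embed-large")

# First run: build the store from the chunked documents and write it to disk
store = FAISS.from_documents(chunks, embeddings)
store.save_local("vector_stores/my_pdfs")

# Later runs: just point to the saved path and reuse it
# (the deserialization flag is required by newer langchain-community versions)
store = FAISS.load_local("vector_stores/my_pdfs", embeddings,
                         allow_dangerous_deserialization=True)
retriever = store.as_retriever(search_kwargs={"k": 4})  # how many documents to retrieve
```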
Ask questions in the chat
You can ask questions in the chat just like you are used to. In the demo there is a strict instruction to only use information from the documents provided, but you can experiment with that. You also have all the knowledge of the LLM itself ready to be used (all the internet compiled into one file …). The chat is also aware of what has happened before; one will have to see how far back this 'memory' lasts.
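A minimal sketch of what such a restricted, history-aware chat turn could look like with the Ollama Python package (the system prompt and the function are illustrative, not the app's exact code):

```python
# Minimal sketch of one chat turn: restrict the answer to retrieved documents
# and pass the earlier turns along as history. Not the app's exact code.
import ollama  # the official Ollama Python client


def answer(question, retrieved_docs, history):
    # retrieved_docs: LangChain documents coming from the retriever
    context = "\n\n".join(doc.page_content for doc in retrieved_docs)
    messages = [
        {"role": "system",
         "content": "Answer only with information from the documents below. "
                    "If the answer is not in them, say so.\n\n" + context},
        *history,  # earlier {"role": "user"/"assistant", "content": ...} turns
        {"role": "user", "content": question},
    ]
    reply = ollama.chat(model="llama3", messages=messages)
    return reply["message"]["content"]
```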
Results will be stored in a JSON file
The results will be stored in a JSON file in the vector store's folder, along with additional information like a timestamp, which model was used and some metadata about your documents, such as the name, page or line that was used, so you can reference it later or extract information from the JSON file (again):
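Such a result record could look roughly like the following minimal sketch; the app's exact field names may differ:

```python
# Minimal sketch of one result record; the app's exact field names may differ.
import json
from datetime import datetime

answer_text = "..."  # the LLM's reply would go here

result = {
    "timestamp": datetime.now().isoformat(),
    "model": "llama3",
    "question": "What is the total on page 3?",
    "answer": answer_text,
    "sources": [{"file": "report.pdf", "page": 3}],  # metadata of the retrieved chunks
}

with open("vector_stores/my_pdfs/chat_results.json", "a", encoding="utf-8") as f:
    f.write(json.dumps(result) + "\n")
```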
You can, for example, instruct the LLM to give back structured results (from a log file) and then extract them later into a regular table:
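Reading such a file back and turning JSON-formatted answers into a table could look like this minimal sketch with pandas, assuming the one-record-per-line layout from the sketch above:

```python
# Minimal sketch: read the stored results back and turn JSON answers into a table.
# Assumes one JSON record per line and that the answers are JSON lists of dicts.
import json
import pandas as pd

rows = []
with open("vector_stores/my_pdfs/chat_results.json", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        try:
            rows.extend(json.loads(record["answer"]))  # structured answer, e.g. log entries
        except json.JSONDecodeError:
            pass  # skip answers that are plain text

table = pd.DataFrame(rows)
print(table.head())
```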
So you can just start exploring once you have installed all the necessary things:
How to set up the Streamlit App
You will find the code and descriptions on GitHub:
https://github.com/ml-score/ollama
- You will need Python installed; ideally create a conda environment with the packages you need (py_ollama.yml)
- Then install Ollama and make sure the model you want to use is downloaded locally. Currently the app offers Llama3 and Mistral (instruct), but you could modify the code to add more
- You will also need an embedding model (like mxbai-embed-large). You could use the LLMs themselves for the embeddings, but this will take more time. A short sketch of wiring this up in Python follows below
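Once the models are pulled, connecting to them from Python can look like this minimal sketch (assuming the LangChain Ollama integrations; the model names are just the ones mentioned above):

```python
# Minimal sketch: connect to the locally pulled Ollama models.
# Assumes the LangChain Ollama integrations; model names are examples.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.chat_models import ChatOllama

embeddings = OllamaEmbeddings(model="mxbai-embed-large")  # encodes the documents
llm = ChatOllama(model="llama3", temperature=0)           # answers the questions

print(llm.invoke("Reply with one short sentence to confirm you are running.").content)
```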
More details on how to do all this and start the app here:
As an example, you can check out how to extract information from a bank statement into a JSON file. You could build a loop around this and bring the information back into KNIME (or any other system):
If you want more AI and LLM content, I can offer these articles dealing with the low-code analytics platform KNIME and (local) LLMs. KNIME also supports the use of platforms like ChatGPT.
- Creating a Local LLM Vector Store from PDFs with KNIME and GPT4All (https://medium.com/p/311bf61dd20e)
- KNIME, AI Extension and local Large Language Models (LLM) (https://medium.com/p/cef650fc142b)
- Llama3 and KNIME — Build your local Vector Store from PDFs and other Documents (also runs on KNIME 4.8) (https://medium.com/p/237eda761c1c)
- Chat with local Llama 3 Model via Ollama in KNIME Analytics Platform — Also extract Logs into structured JSON Files (also runs on KNIME 4.8) (https://medium.com/p/aca61e4a690a)
You can find the examples for all these use cases on my KNIME Hub:
If you want to continue with LLMs and KNIME you can check out the latest KNIME Space for Generative AI:
If you enjoyed this story, you can follow me on Medium (https://medium.com/@mlxl), on the KNIME Hub (https://hub.knime.com/mlauber71) or in the KNIME Forum (https://forum.knime.com/u/mlauber71/summary).
This app has been inspired by the writings of Paras Madan here on Medium; make sure to check out his articles.