DATA STORIES | LLMs | KNIME ANALYTICS PLATFORM

KNIME, AI Extension and local Large Language Models (LLM)

Leverage the power of LLMs without data privacy concerns

Markus Lauber
Low Code for Data Science
7 min read · Oct 11, 2023


Photo by Michael Dziedzic on Unsplash.

Editor’s Note. GPT4All discontinued support for models in .bin format starting with GPT4All v2.5.0 (Oct 19, 2023) (read more). The newly supported models are in GGUF format (.gguf). Starting with KNIME 5.2, the GPT4All Chat Model Connector will support the new model format. The workflow used has been adapted accordingly (ML), Dec 1, 2023.

Starting with version 5.1, KNIME utilizes artificial intelligence (AI) and Large Language Models (LLMs) to boost the productivity of its platform. I want to give you an overview of what is currently out there and also introduce a small Data App that lets you chat with a local GPT4All LLM in KNIME without sending data to the internet.

Edit 05/2024: You can do something similar with just KNIME 4+ and a REST API: “Chat with local Llama3 Model via Ollama in KNIME Analytics Platform — Also extract Logs into structured JSON Files”

I recommend following Ángel Molina Laguna, who inspired many of the themes and examples in this article.

  • KNIME AI Assistant. You can now chat with an AI Assistant in KNIME and also let it build workflows directly. As of Q3/2023 this is in beta, but it offers some insight into what will be possible.
  • Coming with version 5.2, there will be an advanced Python editor that should also connect to AI-based code support. (You can download a nightly build; the Python AI assistant is not active yet.)
  • Then there are the “KNIME AI Extension” nodes, where you can connect to systems like ChatGPT (you will need an API key) and integrate them into your KNIME workflows. Here your data will be sent to OpenAI or another third party.
  • You can provide your own vector store, which (again) will be sent to the third-party system.
  • You can also access local LLM models through GPT4All (like variants of Llama and others) and prompt them with your own data/questions. This only uses local resources and does not send your data to the internet. The performance will very much depend on your machine, obviously.
  • I have built a small local chat system in KNIME that you can use as a demonstration (see below).
  • Starting with KNIME 5.2 you can use your own (local) vector store for your documents, like PDFs (see: “Creating a Local LLM Vector Store from PDFs with KNIME and GPT4All”).

If you want to see what you can do with ChatGPT, KNIME, and Python in general, you can check out my article: “KNIME, ChatGPT and Python”

A local GPT4All Chat App with KNIME

In order to use local LLM models you will need KNIME 5.1 or later and:

  • Install GPT4All on your machine. There were some problems with the Windows installer, but they should be fixed by now.
  • Install the “KNIME AI Extension”. If you run into problems behind a firewall, check the remarks at the end of this article.
  • Download at least one GPT4All model (the <…>.bin, new: <…>.gguf files) from the collection and place it in a local folder. Please note for which (commercial) use cases these models are licensed! Also check the FAQ.

In the KNIME workflow “GPT4All — Chat DataApp” you can now browse the collection and select a model you want (best to download the whole workflow group):

A KNIME workflow to browse your local GPT4All LLM directory and select the model you want
Browse your local GPT4All LLM directory and select the model you want (https://hub.knime.com/-/spaces/-/latest/~KAqrLVTKv7dCUG1T/).
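Outside of KNIME, the model-browsing step can be sketched in a few lines of Python. This is only an illustration of what the workflow does; the folder name below is an assumption about where your GGUF files live, so adjust it to your setup:

```python
from pathlib import Path

def list_gguf_models(folder):
    """Return the names of all GGUF model files in a local folder,
    sorted so a selection widget can show them in a stable order."""
    return sorted(p.name for p in Path(folder).glob("*.gguf"))

# Hypothetical model folder - point this at wherever you store
# the models downloaded from the GPT4All collection.
for name in list_gguf_models(Path.home() / "gpt4all-models"):
    print(name)
```

If the folder does not exist, `glob` simply yields nothing, so the sketch is safe to run before you have downloaded any model.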

The LLM Prompter can now be used to answer (“Response”) a list of questions (“Prompt”) and store the results in a table or as an HTML file. Each response is generated independently, without any knowledge of the previous questions.
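The batch-prompting idea can be sketched in plain Python: each question goes to the model on its own, and the (Prompt, Response) pairs form the result table. The helper below is a hypothetical stand-in; the commented-out lines show how the real gpt4all package would plug in (the model file name there is an assumption):

```python
def prompt_all(questions, generate):
    """Send each question independently to `generate` and collect
    (Prompt, Response) rows - each answer is produced without any
    knowledge of the earlier questions."""
    return [{"Prompt": q, "Response": generate(q)} for q in questions]

# With the gpt4all package you would plug in a real model, e.g.:
#   from gpt4all import GPT4All
#   model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf")  # hypothetical file name
#   rows = prompt_all(questions, lambda q: model.generate(q, max_tokens=200))

# Stand-in generator so the sketch runs without a model:
rows = prompt_all(["What is KNIME?", "What is GGUF?"], lambda q: f"(answer to: {q})")
for row in rows:
    print(row["Prompt"], "->", row["Response"])
```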

A KNIME workflow where you can ask the selected models blocks of questions
You can ask the selected models blocks of questions (https://hub.knime.com/-/spaces/-/latest/~KAqrLVTKv7dCUG1T/).

You can now decide what to do with the results: process them further with topic detection or use them in other KNIME nodes. You will have to think about what makes a good prompt and gain experience with which models yield good results. Please note that a system like ChatGPT has additional features and greater power than your local machine (or server).

Then I created a small Data App that utilizes these functions so you have a little chat window in KNIME. You can query the model, and your previous answers will be stored in a KNIME table.

A KNIME Data App to chat with a GPT4All model
In the KNIME Data App there is a simple chat function. It only uses a local LLM without sending out your data and questions (https://hub.knime.com/-/spaces/-/latest/~KAqrLVTKv7dCUG1T/).

After some initial (very basic) tries, I based the Chat App on this example from the hub, so kudos to the KNIME team!

Please note: although this app looks nice, the chat currently does not store the session and cannot refer to previous parts of the conversation. Also, as of now you cannot insert your own vector store. For these things you will have to use the generic GPT4All app or Python code if you want to stay local, or you can revert to something like OpenAI with an API key (edit: starting with KNIME version 5.2 you can have your own local vector store).
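If you do want the model to refer back to earlier turns, one simple (hypothetical) approach is to fold the previous questions and answers into each new prompt. A minimal sketch, with a stand-in generator in place of a real model call:

```python
def build_prompt(history, question):
    """Fold previous turns into the next prompt so the model can refer
    back to the conversation - the piece the simple chat app lacks."""
    lines = [f"User: {q}\nAssistant: {a}" for q, a in history]
    lines.append(f"User: {question}\nAssistant:")
    return "\n".join(lines)

history = []

def chat(question, generate):
    """Ask a question with the full history prepended, then remember the turn."""
    answer = generate(build_prompt(history, question))
    history.append((question, answer))
    return answer

# Stand-in generate; with gpt4all you would pass something like
# lambda p: model.generate(p, max_tokens=200) instead.
print(chat("Hello", lambda p: "Hi there"))
```

Note that prompts grow with every turn, so in practice you would cap or summarize the history once it approaches the model's context window.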

If you are interested in KNIME Data Apps in general, you can check out this article: “Create an Interactive Dashboard with KNIME Components and Python”

GPT4All generic App

You can also use the GPT4All local app to create a chat or let it write code. The performance will depend on the power of your machine; you can see how many tokens per second you get. This also depends on the (size of the) model you choose.

Screenshot of the generic GPT4All chat app
The GPT4All app can write formatted Python code and answer questions, without sending the content to the internet.

You can try to specify a GPU engine if your system has one, although I have experienced that my laptop’s GPU will not always be used (no idea why). On an Apple Silicon M1 with GPU support activated in the advanced settings, I have seen speeds of up to 60 tokens per second, which is not bad for a local system.

GPT4All in Python and as an API

In addition, you can also use GPT4All from your Python environment through the local app, which can function as an API server. There is a native package, and you can also use the OpenAI package pointed at the local server.
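As a sketch, assuming the app’s OpenAI-compatible server is enabled in its settings and listening locally (the port below is an assumption, so check your app’s settings), a request could be built with only the standard library:

```python
import json
import urllib.request

def build_chat_payload(model, messages, max_tokens=200):
    """JSON body for an OpenAI-style /v1/chat/completions request."""
    return {"model": model, "messages": messages, "max_tokens": max_tokens}

# Assumed local endpoint of the GPT4All app's API server - verify the
# port in the app's settings before using this.
URL = "http://localhost:4891/v1/chat/completions"

def ask_local(question, model="mistral-7b-instruct-v0.1.Q4_0.gguf"):
    """Send one question to the local server and return the answer text.
    The model file name here is a hypothetical example."""
    payload = build_chat_payload(model, [{"role": "user", "content": question}])
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint mimics the OpenAI API shape, the same payload works with the OpenAI Python client if you point its base URL at the local server.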

GPT4All Installation behind a Firewall

If you are behind a firewall, you might have to take additional steps in order to load KNIME Python-based extensions like the “LLM Prompter”:

  • Install your extensions on another system that has full internet access (it can be a different operating system from your target environment). Make sure you have exactly the same version of KNIME.
  • Collect all your Python packages through the KNIME settings (“Download required packages for offline installation to”).
    This will download all the Python packages for all environments (Windows, Mac, Apple Silicon, and Linux) in the correct versions.
  • Bring the folder to your machine behind the firewall (zip it, maybe).
  • Set up a system environment variable “KNIME_PYTHON_PACKAGE_REPO_URL” on the target machine with the path of this folder.
  • Install the extensions on the firewalled machine (it will use the local repository from the environment variable).
  • As of Q1/2024 there is a bug when using GPT4All offline or behind a firewall (which is what this is all about); you will have to make some adaptations in the code.

If you encounter problems, see the LOG file and find the folder with the generated list of pip-installed extensions (pip_pkg_urls.txt). Remove all references to external sites (like “https://files.pythonhosted.org/packages …”) that might be there and try again.
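That cleanup of pip_pkg_urls.txt can also be scripted, for example with a small (hypothetical) filter like this:

```python
def strip_external_urls(lines):
    """Keep only local package references; drop any line that points to an
    external site such as files.pythonhosted.org."""
    return [ln for ln in lines if "https://" not in ln]

# Demonstration with sample lines; to apply it for real, read your
# pip_pkg_urls.txt, filter it, and write the result back.
sample = [
    "local-repo/numpy-1.26.0-py3-none-any.whl\n",
    "https://files.pythonhosted.org/packages/some-wheel.whl\n",
]
print(strip_external_urls(sample))
```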

Another LLM project to explore is from H2O.ai (https://gpt.h2o.ai/, https://github.com/h2oai/h2o-llmstudio).

If you are interested in why ChatGPT is just so good, you can take a look at this article (spoiler alert: computing power and tons of human feedback, not just piles of internet data).


Markus Lauber
Low Code for Data Science

Senior Data Scientist working with KNIME, Python, R and Big Data Systems in the telco industry