DATA STORIES | LLMs | KNIME ANALYTICS PLATFORM

KNIME, AI Extension and local Large Language Models (LLM)

Leverage the power of LLMs without data privacy concerns

Markus Lauber
Low Code for Data Science
7 min read · Oct 11, 2023


Photo by Michael Dziedzic on Unsplash.

Editor’s Note. GPT4All discontinued support for models in .bin format starting with GPT4All v2.5.0 (Oct 19, 2023) (read more). The newly supported models are in GGUF format (.gguf). Starting with KNIME 5.2, the GPT4All Chat Model Connector will support the new model format. The workflow used has been adapted accordingly (ML), Dec 1, 2023.

Starting with version 5.1, KNIME utilizes artificial intelligence (AI) and Large Language Models (LLMs) to boost the productivity of its platform. I want to give you an overview of what is currently out there and also introduce a small Data App that lets you chat with a local GPT4All LLM in KNIME without sending data to the internet.

Edit 05/2024: You can do something similar with just KNIME 4+ and a REST API: “Chat with local Llama3 Model via Ollama in KNIME Analytics Platform — Also extract Logs into structured JSON Files”

I recommend following Ángel Molina Laguna, who inspired many of the themes and examples in this article.

  • KNIME AI Assistant. You can now chat with an AI Assistant in KNIME and also let it build workflows directly. As of Q3/2023 this is in beta, but it offers some insight into what will be possible.
  • Coming with version 5.2, there will be an advanced Python editor that should also connect to AI-based code support. (You can download a nightly build; the Python AI assistant is not active yet.)
  • Then there are the “KNIME AI Extension” nodes, where you can connect to systems like ChatGPT (you will need an API key) and integrate them into your KNIME workflows. Here your data will be sent to OpenAI or another third party.
  • You can provide your own vector store, which (again) will be sent to the third-party system.
  • You can also access local LLM models through GPT4All (like variants of Llama and others) and prompt them with your own data/questions. This only uses local resources and does not send your data to the internet. The performance will very much depend on your machine, obviously.
  • I have built a small local chat system in KNIME that you can use as a demonstration (see below).
  • Starting with KNIME 5.2 you can use your own (local) vector store for your documents, like PDFs (see: “Creating a Local LLM Vector Store from PDFs with KNIME and GPT4All”).

If you want to see what you can do with ChatGPT, KNIME, and Python in general, you can check out my article: “KNIME, ChatGPT and Python”

A local GPT4All Chat App with KNIME

In order to use local LLM models you will need KNIME 5.1 or later and:

  • Install GPT4All on your machine. There were some problems with the Windows installer, but they should be fixed by now.
  • Install the “KNIME AI Extension”. If you run into problems behind a firewall, check the remarks at the end of this article.
  • Download at least one GPT4All model (the <…>.bin, new: <…>.gguf files) from the collection and place it in a local folder. Please note for which (commercial) use cases these models are licensed! Also check the FAQ.

In the KNIME workflow “GPT4All — Chat DataApp” you can now browse the collection and select a model you want (best to download the whole workflow group):

A KNIME workflow to browse your local GPT4All LLM directory and select the model you want
Browse your local GPT4All LLM directory and select the model you want (https://hub.knime.com/-/spaces/-/latest/~KAqrLVTKv7dCUG1T/).
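Outside of KNIME, the model-browsing step can be sketched in a few lines of Python. This is only an illustration of what the workflow does; the folder name below is an assumption about where your GGUF files live, so adjust it to your setup:

```python
from pathlib import Path

def list_gguf_models(folder):
    """Return the names of all GGUF model files in a local folder,
    sorted so a selection widget can show them in a stable order."""
    return sorted(p.name for p in Path(folder).glob("*.gguf"))

# Hypothetical model folder - point this at wherever you store
# the models downloaded from the GPT4All collection.
for name in list_gguf_models(Path.home() / "gpt4all-models"):
    print(name)
```

If the folder does not exist, `glob` simply yields nothing, so the sketch is safe to run before you have downloaded any model.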

The LLM Prompter can now be used to answer (“Response”) a list of questions (“Prompt”) and store the results in a table or as an HTML file. Each response is generated independently, without any knowledge of the previous questions.
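The batch-prompting idea can be sketched in plain Python: each question goes to the model on its own, and the (Prompt, Response) pairs form the result table. The helper below is a hypothetical stand-in; the commented-out lines show how the real gpt4all package would plug in (the model file name there is an assumption):

```python
def prompt_all(questions, generate):
    """Send each question independently to `generate` and collect
    (Prompt, Response) rows - each answer is produced without any
    knowledge of the earlier questions."""
    return [{"Prompt": q, "Response": generate(q)} for q in questions]

# With the gpt4all package you would plug in a real model, e.g.:
#   from gpt4all import GPT4All
#   model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf")  # hypothetical file name
#   rows = prompt_all(questions, lambda q: model.generate(q, max_tokens=200))

# Stand-in generator so the sketch runs without a model:
rows = prompt_all(["What is KNIME?", "What is GGUF?"], lambda q: f"(answer to: {q})")
for row in rows:
    print(row["Prompt"], "->", row["Response"])
```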

A KNIME workflow where you can ask the selected models blocks of questions
You can ask the selected models blocks of questions (https://hub.knime.com/-/spaces/-/latest/~KAqrLVTKv7dCUG1T/).

You can now decide what to do with the results: process them further with topic detection or use them in other KNIME nodes. You will have to think about what makes a good prompt and gain experience with which models yield good results. Please note that a system like ChatGPT has additional features and greater power than your local machine (or server).

Then I created a small Data App that utilizes these functions so you have a little chat window in KNIME. You can query the model, and your previous answers will be stored in a KNIME table.

A KNIME Data App to chat with a GPT4All model
In the KNIME Data App there is a simple chat function. It only uses a local LLM without sending out your data and questions (https://hub.knime.com/-/spaces/-/latest/~KAqrLVTKv7dCUG1T/).

After some initial (very basic) tries, I based the Chat App on this example from the hub, so kudos to the KNIME team!

Please note: although this app looks nice, the chat currently does not store the session and cannot refer to previous parts of the conversation. Also, as of now you cannot insert your own vector store. For these things you will have to use the generic GPT4All app or Python code if you want to stay local, or you can revert to something like OpenAI with an API key (edit: starting with KNIME version 5.2 you can have your own local vector store).
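If you do want the model to refer back to earlier turns, one simple (hypothetical) approach is to fold the previous questions and answers into each new prompt. A minimal sketch, with a stand-in generator in place of a real model call:

```python
def build_prompt(history, question):
    """Fold previous turns into the next prompt so the model can refer
    back to the conversation - the piece the simple chat app lacks."""
    lines = [f"User: {q}\nAssistant: {a}" for q, a in history]
    lines.append(f"User: {question}\nAssistant:")
    return "\n".join(lines)

history = []

def chat(question, generate):
    """Ask a question with the full history prepended, then remember the turn."""
    answer = generate(build_prompt(history, question))
    history.append((question, answer))
    return answer

# Stand-in generate; with gpt4all you would pass something like
# lambda p: model.generate(p, max_tokens=200) instead.
print(chat("Hello", lambda p: "Hi there"))
```

Note that prompts grow with every turn, so in practice you would cap or summarize the history once it approaches the model's context window.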

If you are interested in KNIME Data Apps in general, you can check out this article: “Create an Interactive Dashboard with KNIME Components and Python”

GPT4All generic App

You can also use the GPT4All local app to create a chat or let it write code. The performance will depend on the power of your machine; you can see how many tokens per second you get. This also depends on the (size of the) model you choose.

Screenshot of the generic GPT4All chat app
The GPT4All app can write formatted Python code and answer questions, without sending the content to the internet.

You can try to specify a GPU engine if your system has one, although I have experienced that my laptop’s GPU will not always be used (no idea why). On an Apple Silicon M1 with GPU support activated in the advanced settings, I have seen speeds of up to 60 tokens per second, which is not bad for a local system.

GPT4All in Python and as an API

In addition, you can also use GPT4All from your Python environment through the local app, which can function as an API server. There is a native package, and you can also use the OpenAI package pointed at the local server.
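As a sketch, assuming the app’s OpenAI-compatible server is enabled in its settings and listening locally (the port below is an assumption, so check your app’s settings), a request could be built with only the standard library:

```python
import json
import urllib.request

def build_chat_payload(model, messages, max_tokens=200):
    """JSON body for an OpenAI-style /v1/chat/completions request."""
    return {"model": model, "messages": messages, "max_tokens": max_tokens}

# Assumed local endpoint of the GPT4All app's API server - verify the
# port in the app's settings before using this.
URL = "http://localhost:4891/v1/chat/completions"

def ask_local(question, model="mistral-7b-instruct-v0.1.Q4_0.gguf"):
    """Send one question to the local server and return the answer text.
    The model file name here is a hypothetical example."""
    payload = build_chat_payload(model, [{"role": "user", "content": question}])
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint mimics the OpenAI API shape, the same payload works with the OpenAI Python client if you point its base URL at the local server.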

GPT4All Installation behind a Firewall

If you are behind a firewall, you might have to take additional steps in order to load KNIME Python-based extensions like the “LLM Prompter”:

  • Install your extensions on another system that has full internet access (it can be a different operating system from your target environment). Make sure you have exactly the same version of KNIME.
  • Collect all your Python packages through the KNIME settings (“Download required packages for offline installation to”).
    This will download all the Python packages for all environments (Windows, Mac, Apple Silicon, and Linux) in the correct versions.
  • Bring the folder to your machine behind the firewall (zip it, maybe).
  • Set up a system environment variable “KNIME_PYTHON_PACKAGE_REPO_URL” on the target machine with the path of this folder.
  • Install the extensions on the firewalled machine (it will use the local repository from the environment variable).
  • As of Q1/2024 there is a bug when using GPT4All offline or behind a firewall (which is what this is all about); you will have to make some adaptations in the code.

If you encounter problems, see the LOG file and find the folder with the generated list of pip-installed extensions (pip_pkg_urls.txt). Remove all references to external sites (like “https://files.pythonhosted.org/packages …”) that might be there and try again.
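That cleanup of pip_pkg_urls.txt can also be scripted, for example with a small (hypothetical) filter like this:

```python
def strip_external_urls(lines):
    """Keep only local package references; drop any line that points to an
    external site such as files.pythonhosted.org."""
    return [ln for ln in lines if "https://" not in ln]

# Demonstration with sample lines; to apply it for real, read your
# pip_pkg_urls.txt, filter it, and write the result back.
sample = [
    "local-repo/numpy-1.26.0-py3-none-any.whl\n",
    "https://files.pythonhosted.org/packages/some-wheel.whl\n",
]
print(strip_external_urls(sample))
```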

Another LLM project to explore is from H2O.ai (https://gpt.h2o.ai/, https://github.com/h2oai/h2o-llmstudio).

If you are interested in why ChatGPT is just so good, you can take a look at this article (spoiler alert: computing power and tons of human feedback, not just piles of internet data).


Markus Lauber
Low Code for Data Science

Senior Data Scientist working with KNIME, Python, R and Big Data Systems in the telco industry