PrivateGPT v0.4.0 for Mac: LM Studio & Ollama
Run PrivateGPT Locally with LM Studio and Ollama — updated for v0.4.0
Welcome to the updated version of my guides on running PrivateGPT v0.4.0 locally with LM Studio and Ollama. In response to growing interest & recent updates to the code of PrivateGPT, this article builds upon the foundation laid in previous (now out-of-date) articles (1 and 2) to provide you with the latest insights and instructions for setting up PrivateGPT on your Apple Silicon Mac.
Disclaimer: Before diving in, it’s important to note that the procedures outlined here have been specifically tested on a Mac (M1, 32GB). I have not conducted tests on Linux or Windows environments. Should you encounter any issues during setup or execution, I highly recommend you refer to the PrivateGPT Documentation.
Let’s Get Started!
This article takes you through setting up conda, installing PrivateGPT, and running it with Ollama (which is recommended by PrivateGPT) and with LM Studio for even more model flexibility. Enjoy:
Create a Conda Environment
It's cleaner to install PrivateGPT in a conda environment. The environment needs Python ≥ 3.11.
conda create -n pri-gpt python=3.11
conda activate pri-gpt
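(Optional) To confirm the new environment is active and on the right interpreter:
python --version  # should report Python 3.11.x inside the pri-gpt environment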
Git Clone PrivateGPT
I navigate to my preferred development directory:
cd Developer
and then clone the repo:
git clone https://github.com/zylon-ai/private-gpt.git
and then change directory to private-gpt:
cd private-gpt
Enable PrivateGPT to Use Ollama and LM Studio
To run everything locally, I'll show how to set this up for both Ollama and LM Studio. Ollama is enabled with the llms-ollama extra and LM Studio with llms-openai-like. Embeddings for Ollama come from embeddings-ollama, and embeddings for LM Studio come from embeddings-huggingface. I've also added vector-stores-qdrant for storing the embeddings, and ui so we have a user interface.
poetry install --extras "ui llms-ollama embeddings-ollama embeddings-huggingface llms-openai-like vector-stores-qdrant"
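This step assumes Poetry is already on your PATH. If it isn't, pipx is one common way to install it (any Poetry install method works):
pipx install poetry  # assumes pipx itself is installed, e.g. via Homebrew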
Scenario 1: Using Ollama
Ollama is the most straightforward way to get PrivateGPT running locally. For this, you will need to download Ollama.
Set Ollama Env Variables
Set the PGPT_PROFILES environment variable:
export PGPT_PROFILES=ollama
(check it with echo $PGPT_PROFILES)
Download the LLM & Embedding Model
In a new terminal window (we’ll call it the ollama terminal window):
ollama pull nomic-embed-text
(274 MB)
ollama pull mistral
(4.1 GB)
Start Ollama Running
In the ollama terminal window, run:
ollama serve
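ollama serve keeps running in that window, so from another terminal you can sanity-check that the server is up (Ollama listens on port 11434 by default):
curl http://localhost:11434  # should respond with "Ollama is running"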
Run PrivateGPT
In the same terminal window where you set PGPT_PROFILES earlier, run:
make run
Navigate to the UI & Test it Out
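By default, PrivateGPT serves its Gradio UI on port 8001, so once make run reports the server is up, you can open it straight from the terminal:
open http://localhost:8001  # PrivateGPT's default UI port
Upload a document, ask it a few questions, and confirm everything is wired up.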
Scenario 2: Using LM Studio
LM Studio is more flexible than Ollama, since you can choose from many more models. For this, you will need to install LM Studio.
Set the vllm Environment Variable
Set the PGPT_PROFILES environment variable:
export PGPT_PROFILES=vllm
(check it with echo $PGPT_PROFILES)
Edit the vllm YAML file
To use LM Studio to serve the model, we will edit the settings-vllm.yaml file. On line 15, change api_base: http://localhost:8000/v1 to http://localhost:1234/v1 so it matches the port where LM Studio serves its openai-like inference responses:
openai:
  api_base: http://localhost:1234/v1
LLM & Embedding
Note: The model you select needs to match the embedding model in terms of dimensions. For this article, I'm using a 768-dimension embedding model & LLM.
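If you're not sure what dimension an embedding model produces, here's one quick way to check (a sketch assuming the sentence-transformers package is installed; the model downloads on first use):
python -c "from sentence_transformers import SentenceTransformer; print(SentenceTransformer('BAAI/bge-base-en').get_sentence_embedding_dimension())"  # prints 768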
Start your LLM Inference Server in LM Studio
In LM Studio, start the inference server using one of the 2-bit SOTA GGUF models. You can find them on HuggingFace here: https://huggingface.co/ikawrakow/various-2bit-sota-gguf or search for “sota” in LM Studio.
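Once the server is running, you can confirm it's reachable on the expected port (LM Studio's local server exposes an OpenAI-compatible API, so this endpoint should list the loaded model):
curl http://localhost:1234/v1/models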
Select & Set Your Embedding Model from HuggingFace
On line 12 of settings-vllm.yaml, I've changed embedding_hf_model_name: BAAI/bge-small-en-v1.5 to BAAI/bge-base-en in order for PrivateGPT to work (the embedding dimensions need to match the model):
huggingface:
  embedding_hf_model_name: BAAI/bge-base-en
Feel free to use a different model/embedding pair; just make sure the dimensions match, otherwise you will get a dimension-mismatch error.
Run PrivateGPT
In the same terminal window where you set PGPT_PROFILES earlier, run:
make run
Navigate to the UI & Test it Out
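As with the Ollama setup, open http://localhost:8001 in your browser and test it out against your documents.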
As always, let me know if you have any comments or feedback. I look forward to hearing your thoughts.