Step-by-Step Guide to Set Up LocalGPT on Your Windows PC

Arunkl · Published in TheSecMaster
13 min read · Sep 21, 2023

The field of artificial intelligence (AI) has seen monumental advances in recent years, largely driven by the emergence of large language models (LLMs). Trained on vast datasets, LLMs can generate remarkably human-like text and images, perform calculations, and handle many other tasks, in some cases better than humans. In essence, these LLMs are the brains of today’s AI applications. However, the broad deployment of public LLMs has also raised valid concerns about data privacy, security, reliability, and cost.

As AI permeates critical domains like healthcare, finance and more, transmitting sensitive data to public cloud APIs can expose users to unprecedented risks. Dependency on external services also increases vulnerabilities to outages, while usage-based pricing limits widespread adoption. This underscores the need for AI solutions that run entirely on the user’s local device.

Several open-source initiatives have recently emerged to make LLMs accessible privately on local machines. One such initiative is LocalGPT — an open-source project enabling fully offline execution of LLMs on the user’s computer without relying on any external APIs or internet connectivity.

LocalGPT overcomes the key limitations of public cloud LLMs by keeping all processing self-contained on the local device. Users can leverage advanced NLP capabilities for information retrieval, summarization, translation, dialogue, and more without worrying about privacy, reliability, or cost. Documents never leave the device at any point.

In this comprehensive guide, we will walk through the step-by-step process of setting up LocalGPT on a Windows PC from scratch. We cover the essential prerequisites, installation of dependencies like Anaconda and Visual Studio, cloning the LocalGPT repository, ingesting sample documents, querying the LLM via the command line interface, and testing the end-to-end workflow on a local machine.

Follow this guide to harness the power of large language models locally on your Windows device for a private, high-performance LLM solution.

Table of contents

· Introduction to LocalGPT
· What Benefits Does LocalGPT Offer Over the privateGPT Project?
· Prerequisites to Run LocalGPT on a Windows PC
· How to Set Up LocalGPT on Your Windows PC?
· Bottom Line

Introduction to LocalGPT

LocalGPT is an open-source project inspired by privateGPT that enables running large language models locally on a user’s device for private use. The original privateGPT project proposed the idea of executing the entire LLM pipeline natively, without relying on external APIs. However, it was limited to CPU execution, which constrained performance and throughput.

LocalGPT builds on this idea but makes key improvements by using more efficient models and adding support for hardware acceleration via GPUs and other co-processors. Instead of the GPT4All model used in privateGPT, LocalGPT adopts the smaller yet highly performant Vicuna-7B LLM. For generating semantic document embeddings, it uses InstructorEmbeddings rather than LlamaEmbeddings. Unlike privateGPT, which only leveraged the CPU, LocalGPT can take advantage of installed GPUs to significantly improve throughput and response latency, both when ingesting documents and when querying the model. The project readme highlights Blenderbot, Guanaco-7B, and WizardLM-7B as some of the compatible LLMs that can be swapped in.

The default setup uses Vicuna-7B for text generation and InstructorEmbeddings for encoding document context vectors, which are indexed locally using Chroma. However, a key advantage is that these models can be readily swapped based on specific use cases and hardware constraints.
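
To make this concrete, here is a minimal sketch of the embed-and-index idea: document text is converted into embedding vectors with InstructorEmbeddings and stored in a local Chroma index. This is illustrative only, not LocalGPT’s exact code; it assumes the langchain, InstructorEmbedding, and chromadb packages are installed, and the persist_directory value is a made-up example.

from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.vectorstores import Chroma

# Encode text chunks into vectors and index them on disk
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large")
db = Chroma.from_texts(
    ["Privilege escalation means gaining higher access rights...", "another chunk"],
    embeddings,
    persist_directory="DB",  # hypothetical folder for the local index
)
print(db.similarity_search("What is privilege escalation?"))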

By keeping the entire pipeline limited to the local device while enabling acceleration using available hardware like GPUs, LocalGPT unlocks more efficient privatization of large language models for offline NLP tasks. Users get access to advanced natural language capabilities without compromising on privacy, reliability, or cost.

According to the moderators of LocalGPT, the project is still experimental. However, we believe it shows promising potential for building fully private AI applications across diverse domains like healthcare and finance, where data privacy and compliance are paramount.

What Benefits Does LocalGPT Offer Over the privateGPT Project?

One of the biggest advantages LocalGPT has over the original privateGPT is support for diverse hardware platforms including multi-core CPUs, GPUs, IPUs, and TPUs.

By contrast, privateGPT was designed to only leverage the CPU for all its processing. This limited execution speed and throughput especially for larger models.

LocalGPT’s ability to offload compute-intensive operations like embedding generation and neural inference to available co-processors provides significant performance benefits:

  • Faster response times — GPUs can process vector lookups and run neural net inferences much faster than CPUs. This reduces query latencies.
  • Higher throughput — Multi-core CPUs and accelerators can ingest documents in parallel. This increases overall throughput.
  • More efficient scaling — Larger models can be handled by adding more GPUs without hitting a CPU bottleneck.
  • Lower costs — Accelerators are more cost-efficient for massively parallel workloads compared to high core-count CPUs.
  • Flexibility — Different models and workflows can be mapped to suitable processors like IPUs for inference and TPUs for training.
  • Portability — Can leverage hardware from all major vendors like Nvidia, Intel, AMD, etc.
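
All of these gains rest on one simple runtime decision: use an accelerator when one is visible and fall back to the CPU otherwise. A minimal PyTorch sketch of that pattern (illustrative, not LocalGPT’s source code):

import torch

# Pick the best available device at runtime; the CPU is the safe fallback
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on: {device}")
tensor = torch.randn(1, 8).to(device)  # tensors move to the chosen device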

So while privateGPT was limited to single-threaded CPU execution, LocalGPT unlocks more performance, flexibility, and scalability by taking advantage of modern heterogeneous computing. Even on laptops with integrated GPUs, LocalGPT can provide significantly snappier response times and support larger models not possible on privateGPT.

For users with access to desktop GPUs or enterprise accelerators, LocalGPT makes local privatization of LLMs much more practical across diverse settings — from individual users to large organizations dealing with confidential data.

By decoupling model execution from the underlying hardware, LocalGPT makes local LLM privatization faster, more affordable, and accessible to a much wider audience. This aligns well with its open-source ethos of AI privacy and security for all.

Prerequisites to Run LocalGPT on a Windows PC

To install and run LocalGPT on your Windows PC, there are some minimum system requirements that need to be met. Please make sure these requirements are satisfied before you get started.

Operating System — You need Windows 10 or higher, 64-bit edition. Older Windows versions are not supported.

RAM — LocalGPT requires at least 16GB RAM, while 32GB is recommended for optimal performance, especially with larger models.

GPU — For leveraging GPU acceleration, an Nvidia GPU with a CUDA compute capability of 3.5 or higher is necessary. CUDA-enabled GPUs provide significant speedups versus just CPU.

Storage — 250GB of free disk space is required as LocalGPT databases can grow large depending on the documents ingested. SSD storage is preferred.

Software dependencies:

  • Anaconda or Miniconda for Python environment management. Python 3.10 or later is required.
  • Visual Studio 2022 provides the necessary C++ build tools and compilers. Ensure the desktop development workload with C++ is selected during installation.
  • Git is required for cloning the LocalGPT repository from GitHub.
  • MinGW provides the gcc compiler needed to compile certain Python packages.
  • Docker Desktop (optional) — Provides a containerized environment to simplify managing LocalGPT dependencies.
  • Nvidia Container Toolkit to enable GPU support when running LocalGPT via Docker.

Additionally, an internet connection is required for the initial installation to download the required packages and models.

Meeting these prerequisites before starting the LocalGPT installation will ensure a smooth setup process and avoid frustrating errors down the line. Pay particular attention to GPU driver versions, CUDA versions, and Visual Studio workloads during installation.
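
Before starting, you can quickly confirm the core tools are reachable from a terminal. Each command below prints version information if the corresponding tool is installed; nvidia-smi also reports the GPU driver and the highest CUDA version it supports:

git --version
conda --version
nvidia-smi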

How to Set Up LocalGPT on Your Windows PC?

Now that you have enough background on LocalGPT, let’s go ahead and see how to set it up on your Windows PC.

Time needed: 2 hours.

  1. Install Visual Studio 2022

Visual Studio 2022 is an integrated development environment (IDE) that we’ll use to run commands and edit code.

Go to visualstudio.microsoft.com and download the free Community version of Visual Studio 2022. Run through the Visual Studio Installer and make sure to select the following components:

  1. Universal Windows Platform development
  2. Desktop Development with C++

2. Download the LocalGPT Source Code or Clone the Repository

Now we need to download the source code for LocalGPT itself. There are a couple of ways to do this:

Option 1 — Clone with Git

If you’re familiar with Git, you can clone the LocalGPT repository from the command line:
1. Choose a local path to clone it to, like C:\LocalGPT
2. Change to that directory in the CLI and run this command: git clone https://github.com/PromtEngineer/localGPT.git
This will download all the code to your chosen folder.

Option 2 — Download as ZIP

If you aren’t familiar with Git, you can download the source as a ZIP file:
1. Go to https://github.com/PromtEngineer/localGPT in your browser
2. Click on the green “<> Code” button and choose “Download ZIP”
3. Extract the ZIP somewhere on your computer, like C:\LocalGPT
Either cloning or downloading the ZIP will work!

We downloaded the source code, unzipped it into the ‘LocalGPT’ folder, and kept it in G:\LocalGPT on our PC.

3. Import LocalGPT into an IDE

The next step is to import the unzipped ‘LocalGPT’ folder into an IDE. We used the PyCharm IDE in this demo. You can use Visual Studio 2022 instead, or it’s also fine to run everything directly from the CLI.

If you want to set up PyCharm on your Windows, follow this guide: https://thesecmaster.com/step-by-step-procedure-to-install-pycharm-on-windows/

To import LocalGPT as a project in PyCharm, click the ‘four lines’ menu button in the top left corner, click ‘Open,’ and browse to the LocalGPT folder.

4. Install Anaconda

We will use Anaconda to set up and manage the Python environment for LocalGPT.
1. Download the latest Anaconda installer for Windows from https://www.anaconda.com/products/distribution
2. Choose Python 3.10 or higher during installation.
3. Complete the installation process and restart your terminal.
4. Open the Anaconda Prompt which will have the Conda environment activated by default.

To verify the installation is successful, fire up the ‘Anaconda Prompt’ and enter this command: conda --version.

Refer to these online documents for installation, setting up environment variables, and troubleshooting:
https://docs.anaconda.com/free/anaconda/install/windows/

5. Create and Activate LocalGPT Environment

It’s best practice to install LocalGPT in a dedicated Conda environment instead of the base env. This keeps the dependencies isolated.
Run the following commands in Anaconda Prompt:

conda create -n localgpt python=3.10
conda activate localgpt
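
To confirm the environment was created and activated, you can list your Conda environments; the active one is marked with an asterisk:

conda env list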

6. Change to the Anaconda Python Interpreter in PyCharm

Your PC could have multiple Python interpreters: one bundled with PyCharm, another installed by Anaconda, and possibly a third installed from python.org. Make sure you use the Anaconda Python interpreter in PyCharm. To do so, click the Settings gear icon in the top right corner of your project in PyCharm, go to ‘Settings,’ and select Project > Python Interpreter. You should see all the interpreters listed in the drop-down. Select the one that comes with Anaconda. If you don’t see it, click ‘Add Interpreter’ and select the location of ‘python.exe.’

If you are not sure where ‘python.exe’ lives, open your ‘Anaconda Prompt’ and run the command: where python.

7. Install Required Python Packages

Now we need to install the Python package requirements so LocalGPT can run properly. Run this command in the terminal to install all the packages listed in the ‘requirements.txt’ file:

pip install -r .\requirements.txt

This will install all of the required Python packages using pip. Depending on your internet speed, this may take a few minutes.

If you run into any errors during this step, you may need to install a C++ compiler. See the LocalGPT README on GitHub for help troubleshooting compiler issues on Windows.

8. Install Packages Required to Run on GPU (Optional)

LocalGPT needs some additional packages installed if you want to run the LLM on your GPU. This is an optional step for those who have an NVIDIA GPU card in their machine.

Run the following to install Conda packages:

conda install cmake pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

This installs PyTorch, the CUDA toolkit, and other Conda dependencies.
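
After the installation completes, a quick two-line sanity check (ours, not part of LocalGPT) tells you whether PyTorch can actually see the GPU. Run it in a Python shell inside the localgpt environment; if it prints False, the installed CUDA toolkit and your GPU driver versions most likely do not match:

import torch
print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True means GPU acceleration will work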

9. Install MinGW Compiler

MinGW provides gcc, a compiler that some Python packages need in order to build from source on Windows.
1. Download the latest MinGW installer from https://sourceforge.net/projects/mingw/
2. Run the exe and select the mingw32-gcc-g++-bin package under Basic Setup.
3. Leave other options as default and complete the MinGW installation.
4. Finally, add MinGW to your PATH environment variable so it’s accessible from the command line.
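
For example, assuming MinGW was installed to its default C:\MinGW location, you can append its bin folder to your user PATH from a Command Prompt (editing PATH via the Environment Variables dialog in System Properties works just as well). Open a new terminal afterward for the change to take effect:

setx PATH "%PATH%;C:\MinGW\bin"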

10. Install Docker (Optional)

Docker allows running LocalGPT in isolated containers for managing dependencies easily.
1. Download and install Docker Desktop from https://www.docker.com/products/docker-desktop
2. Install the latest Nvidia Container Toolkit from https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html to enable GPU support
3. In Settings, enable Docker’s WSL 2 backend and install the Ubuntu distro.
4. Allocate adequate resources to the Docker VM.
Docker is now ready to build and run LocalGPT containers.

11. Build LocalGPT Docker Image (Optional)

If you have Docker installed, build a Docker image to run LocalGPT in isolated containers:

docker build . -t localgpt
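
Once the image builds, you can start an interactive container from it. The exact flags depend on your setup; a minimal GPU-enabled invocation (assuming the Nvidia Container Toolkit from the previous step is installed) might look like this:

docker run -it --gpus=all localgpt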

12. Ingest Documents

Now we’re ready to ingest documents into the local vector database. This preprocesses your files so LocalGPT can search and query them.
In the PyCharm terminal, run:

python .\ingest.py

This will look for files in the source_documents folder, parse and encode the document contents into vector embeddings, and store them in an indexed local database.

You can add .pdf, .docx, .txt, and other files to the ‘source_documents’ folder. The initial process may take some time depending on how large your files are and how much computational power your PC has. If you run this on the CPU, ingestion will take longer than on a GPU.

LocalGPT is designed to run the ingest.py file on the GPU as the default device type. However, if your PC doesn’t have a CUDA-supported GPU, it runs on the CPU instead.

LocalGPT also provides an option to choose the device type explicitly, whether or not your device has a GPU. You can select it by adding the --device_type flag to the command.

Ex:

python ingest.py --device_type cpu
python ingest.py --device_type cuda
python ingest.py --device_type ipu

To see the list of supported device types, run with the --help flag: python ingest.py --help

Once it finishes, your documents are ready to query!

13. Query Your Documents

With documents ingested, we can ask LocalGPT questions relevant to them:

In the terminal, run:

python .\run_localGPT.py

It will prompt you to enter a question. Ask something relevant to the sample documents like:

What is Privilege Escalation?

LocalGPT will provide an appropriate answer by searching through the ingested document contents.

You can keep entering new questions, or type exit to quit.
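
Under the hood, this is retrieval-augmented question answering: the question is embedded, the closest document chunks are fetched from the local Chroma index, and the LLM answers using those chunks as context. A heavily simplified sketch (illustrative only; llm stands for the locally loaded model and db for the index built at ingest time):

from langchain.chains import RetrievalQA

# llm: a locally loaded language model; db: the Chroma index from ingestion
qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever())
print(qa.run("What is Privilege Escalation?"))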

Note: LocalGPT provides an option to choose the device type, whether or not your device has a GPU. You can select it by adding the --device_type flag to the command.

Ex:

python run_localGPT.py --device_type cpu
python run_localGPT.py --device_type cuda
python run_localGPT.py --device_type ipu

To see the list of supported device types, run with the --help flag: python run_localGPT.py --help

14. Use a Different LLM

By default, LocalGPT uses the Vicuna-7B model. But you can replace it with any compatible HuggingFace model:
1. Open constants.py in an editor.
2. Modify MODEL_ID and MODEL_BASENAME as per the instructions in the LocalGPT readme.
3. Comment out other redundant model variables.
4. Restart LocalGPT services for changes to take effect.
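
For illustration, the edit in constants.py might look like the lines below. Both values are placeholders (a HuggingFace repo id and a weights filename), not recommendations; follow the readme’s guidance for the model you actually pick:

MODEL_ID = "TheBloke/WizardLM-7B-uncensored-GPTQ"  # hypothetical HuggingFace repo id
MODEL_BASENAME = "model.safetensors"               # hypothetical quantized weights file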

And that’s it! This is how you can set up LocalGPT on your Windows machine. You can ingest your own document collections, customize models, and build private AI apps leveraging its local LLM capabilities.

Note: If you use the CPU to run the LLM, you may need to wait a long time for responses. We recommend running it on a GPU.

FYI: We tried this on one of our Windows PCs, which has an Intel i7-7700 processor, 32 GB of RAM, and a 4 GB GTX 1050 GPU. We got an average response time of 60 to 90 seconds on the CPU. Unfortunately, we couldn’t run this on the GPU due to version compatibility issues between PyTorch and the CUDA Toolkit. We will keep trying and let you know once we succeed. If you are one of those who successfully ran this on a local GPU, please leave a comment.

Bottom Line

Being able to leverage the power of large language models locally on your device provides tremendous opportunities to build intelligent applications privately. However, installing and configuring complex deep-learning software can seem daunting for many Windows users.

In this comprehensive, step-by-step guide, we simplified the process by detailing the exact prerequisites, dependencies, environment setup, installation steps, and configurations required to get LocalGPT up and running on a Windows PC.

By closely following the instructions outlined and checking the system requirements beforehand, you should be able to successfully install LocalGPT on your Windows 10 or 11 machine without major issues. We also covered how to ingest sample documents, query the model, and customize the underlying LLM as per your application needs.

While still experimental, LocalGPT enables you to unlock the myriad capabilities of large language models to create personalized AI solutions that keep your data completely secure and private. No documents or information is ever transmitted outside your computer.

We hope this guide served as a helpful reference for setting up LocalGPT on your Windows device. Let us know if you have any other questions in the comments! Thank you for reading this blog post. Visit our website, thesecmaster.com, and our social media pages on Facebook, LinkedIn, Twitter, Telegram, Tumblr, and Medium, and subscribe to receive updates like this.

This post is originally published at thesecmaster.com

We thank everybody who has been supporting our work and invite you to check out thesecmaster.com for more such articles.
