How to run LocalGPT on Amazon EC2

David Min
5 min read · Jun 12, 2023



Chat with your documents on your local device using GPT models. Ask questions of your documents without an internet connection, using the power of LLMs: 100% private, no data leaves your execution environment at any point, and you can ingest documents and query them fully offline.

Built with LangChain, Vicuna-7B (and a lot more), and InstructorEmbeddings.

Source: https://github.com/PromtEngineer/localGPT


Step 1 — Launch Amazon EC2 Instance

Create an Amazon EC2 instance using AWS CloudFormation with the following settings:

  • Region: us-east-1
  • AMI: "ami-0649417d1ede3c91a" # Deep Learning AMI
  • Instance: g5.2xlarge
  • EBS volume: 500GB

LocalGPT.yaml

AWSTemplateFormatVersion: '2010-09-09'
Description: EC2 Instance
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: AWS::EC2::KeyPair::KeyName
    ConstraintDescription: must be the name of an existing EC2 KeyPair.

Mappings:
  RegionToAmiId:
    us-east-1:
      AMI: ami-0649417d1ede3c91a

Resources:
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub ${AWS::StackName}-sg
      GroupDescription: Security group for EC2 instance
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: g5.2xlarge
      ImageId: !FindInMap [RegionToAmiId, !Ref AWS::Region, AMI]
      KeyName: !Ref KeyName
      BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
            VolumeSize: 500
            VolumeType: gp2
      Tags:
        - Key: Name
          Value: LocalGPT
      SecurityGroups:
        - !Ref SecurityGroup

Outputs:
  PublicDNS:
    Description: Public DNS name of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicDnsName]
  PublicIP:
    Description: Public IP address of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicIp]

AWS CloudFormation > Create stack


AWS CloudFormation — Step 1 Create stack

Upload the LocalGPT.yaml template file and click Next.


AWS CloudFormation — Step 2 Specify stack details

Specify the Stack name and KeyName, then click Next.


AWS CloudFormation — Step 3 Configure stack options

Use the default settings and click Next.


AWS CloudFormation — Step 4 Review and Submit.
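
If you prefer the command line, you can create the same stack with the AWS CLI instead of the console. A minimal sketch, assuming your AWS credentials are configured, the template is saved locally as LocalGPT.yaml, and your key pair is named us-east-1-key:

# Create the stack and wait for it to finish
aws cloudformation create-stack \
  --stack-name LocalGPT \
  --region us-east-1 \
  --template-body file://LocalGPT.yaml \
  --parameters ParameterKey=KeyName,ParameterValue=us-east-1-key
aws cloudformation wait stack-create-complete --stack-name LocalGPT --region us-east-1

# Print the stack outputs (public DNS name and IP)
aws cloudformation describe-stacks --stack-name LocalGPT --region us-east-1 \
  --query "Stacks[0].Outputs"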

Step 2 — Install LocalGPT

SSH into the Amazon EC2 instance and start JupyterLab.

# Start JupyterLab
cd /home/ubuntu
jupyter lab --notebook-dir=/home/ubuntu
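
If you want JupyterLab to survive SSH disconnects, one option (a sketch; tmux or screen work just as well) is to start it with nohup:

# Optional: keep JupyterLab running after the SSH session ends
nohup jupyter lab --notebook-dir=/home/ubuntu > jupyter.log 2>&1 &
# The URL with the login token is written to jupyter.log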

Set up SSH tunnel using local port forwarding to JupyterLab

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 8888:localhost:8888 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com

Open JupyterLab in your local browser

To access JupyterLab after setting up the SSH tunnel, open http://127.0.0.1:8888/lab or http://localhost:8888/lab in your local browser, appending the token printed in the JupyterLab startup output.
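
If you missed the token in the startup output, you can list running servers together with their tokenized URLs (on recent Jupyter versions; older ones use jupyter notebook list):

# Print running Jupyter servers and their token URLs
jupyter server list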


LocalGPT Installation

Install LocalGPT from the JupyterLab terminal.

# Install Conda
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh -b -u -p miniconda
source /home/ubuntu/miniconda/bin/activate

# Create a new conda environment
conda create -n localGPT python=3.10.9 -y
conda activate localGPT
pip install --quiet ipykernel
python -m ipykernel install --user --name localGPT --display-name localGPT

# Install PyTorch
pip install torch torchvision torchaudio

# Install LocalGPT
git clone https://github.com/PromtEngineer/localGPT
cd localGPT
pip install -r requirements.txt

# Install xformers
pip install xformers

# Install AutoGPTQ
pip install auto-gptq
# If you have problems installing AutoGPTQ, please build from source instead:
# git clone https://github.com/PanQiWei/AutoGPTQ
# cd AutoGPTQ
# pip3 install .
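
Before moving on, it is worth confirming that PyTorch can actually see the instance's GPU (a g5.2xlarge has a single NVIDIA A10G):

# Sanity check: CUDA availability and GPU name
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
nvidia-smi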

Step 3 — Run LocalGPT

LLM Model

Choose your MODEL_ID and MODEL_BASENAME in constants.py.

# for GPTQ (quantized) models
MODEL_ID = "TheBloke/Llama-2-7b-Chat-GPTQ"
MODEL_BASENAME = "model.safetensors"

GPU VRAM required by model size (billions of parameters):

(B Model)   (float32)   (float16)   (GPTQ 8bit)   (GPTQ 4bit)
7b          28 GB       14 GB       7 GB          3.5 GB
13b         52 GB       26 GB       13 GB         6.5 GB
32b         130 GB      65 GB       32.5 GB       16.25 GB
65b         260.8 GB    130.4 GB    65.2 GB       32.6 GB
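
These figures follow from a simple rule of thumb: required VRAM ≈ parameter count × bytes per parameter. A 7B model at float32 (4 bytes per parameter) needs about 28 GB, float16 halves that to 14 GB, and 8-bit and 4-bit GPTQ quantization cut it to roughly 7 GB and 3.5 GB, plus some headroom for activations and the KV cache.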

Upload your documents

Upload your documents to the SOURCE_DOCUMENTS folder using the JupyterLab File Browser.


Ingest documents

# defaults to cuda
python ingest.py
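
If you need to override the device, ingest.py accepts a device-type flag (flag name as of the June 2023 repo; run python ingest.py --help to confirm):

# Run ingestion on the CPU instead of the default CUDA device
python ingest.py --device_type cpu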

Run LocalGPT

source /home/ubuntu/miniconda/bin/activate
conda activate localGPT

# defaults to cuda
python run_localGPT.py

Wait for the script to prompt for input.

> Enter a query:

Type a query and press Enter.

Type exit to finish the script.

Run LocalGPT UI

LLM Model

As before, choose your MODEL_ID and MODEL_BASENAME in constants.py.

Run LocalGPT API

# Terminal 1
source /home/ubuntu/miniconda/bin/activate
conda activate localGPT
python run_localGPT_API.py

# Running on http://127.0.0.1:5110

Run LocalGPT UI

# Terminal 2
source /home/ubuntu/miniconda/bin/activate
conda activate localGPT
streamlit run localGPT_UI.py --browser.serverAddress localhost

Set up SSH tunnel using local port forwarding to LocalGPT UI

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 8501:localhost:8501 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com

Open LocalGPT UI in your local browser

Open http://localhost:8501/ in your local browser after setting up the SSH tunnel.


