How to run LocalGPT on Amazon EC2

David Min
5 min read · Jun 12, 2023



Chat with your documents on your local device using GPT models. Ask questions of your documents without an internet connection, using the power of LLMs: 100% private, no data leaves your execution environment at any point, and you can ingest documents and query them fully offline.

Built with LangChain, Vicuna-7B (and a lot more), and InstructorEmbeddings.

Source: https://github.com/PromtEngineer/localGPT


Step 1 — Launch Amazon EC2 Instance

Create an Amazon EC2 instance using AWS CloudFormation with the following settings:

  • Region: us-east-1
  • AMI: "ami-0649417d1ede3c91a" # Deep Learning AMI
  • Instance: g5.2xlarge
  • EBS volume: 500GB

LocalGPT.yaml

AWSTemplateFormatVersion: '2010-09-09'
Description: EC2 Instance
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: AWS::EC2::KeyPair::KeyName
    ConstraintDescription: must be the name of an existing EC2 KeyPair.

Mappings:
  RegionToAmiId:
    us-east-1:
      AMI: ami-0649417d1ede3c91a

Resources:
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub ${AWS::StackName}-sg
      GroupDescription: Security group for EC2 instance
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: g5.2xlarge
      ImageId: !FindInMap [RegionToAmiId, !Ref AWS::Region, AMI]
      KeyName: !Ref KeyName
      BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
            VolumeSize: 500
            VolumeType: gp2
      Tags:
        - Key: Name
          Value: LocalGPT
      SecurityGroups:
        - !Ref SecurityGroup

Outputs:
  PublicDNS:
    Description: Public DNS name of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicDnsName]
  PublicIP:
    Description: Public IP address of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicIp]

AWS CloudFormation > Create stack


AWS CloudFormation — Step 1 Create stack

Upload the LocalGPT.yaml template file and click Next.


AWS CloudFormation — Step 2 Specify stack details

Specify the Stack name and KeyName, then click Next.


AWS CloudFormation — Step 3 Configure stack options

Use the default settings and click Next.


AWS CloudFormation — Step 4 Review and Submit.
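
If you prefer the command line, you can create the same stack with the AWS CLI instead of the console. A minimal sketch, assuming your AWS credentials are configured, the template is saved locally as LocalGPT.yaml, and your key pair is named us-east-1-key:

# Create the stack and wait for it to finish
aws cloudformation create-stack \
  --stack-name LocalGPT \
  --region us-east-1 \
  --template-body file://LocalGPT.yaml \
  --parameters ParameterKey=KeyName,ParameterValue=us-east-1-key
aws cloudformation wait stack-create-complete --stack-name LocalGPT --region us-east-1

# Print the stack outputs (public DNS name and IP)
aws cloudformation describe-stacks --stack-name LocalGPT --region us-east-1 \
  --query "Stacks[0].Outputs"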

Step 2 — Install LocalGPT

SSH into the Amazon EC2 instance and start JupyterLab.

# Start JupyterLab
cd /home/ubuntu
jupyter lab --notebook-dir=/home/ubuntu
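
If you want JupyterLab to survive SSH disconnects, one option (a sketch; tmux or screen work just as well) is to start it with nohup:

# Optional: keep JupyterLab running after the SSH session ends
nohup jupyter lab --notebook-dir=/home/ubuntu > jupyter.log 2>&1 &
# The URL with the login token is written to jupyter.log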

Set up SSH tunnel using local port forwarding to JupyterLab

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 8888:localhost:8888 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com

Open JupyterLab in your local browser

To access JupyterLab after setting up the SSH tunnel, open http://127.0.0.1:8888/lab or http://localhost:8888/lab in your local browser, appending the token printed in the JupyterLab startup output.
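
If you missed the token in the startup output, you can list running servers together with their tokenized URLs (on recent Jupyter versions; older ones use jupyter notebook list):

# Print running Jupyter servers and their token URLs
jupyter server list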


LocalGPT Installation

Install LocalGPT from the JupyterLab terminal.

# Install Conda
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh -b -u -p miniconda
source /home/ubuntu/miniconda/bin/activate

# Create a new conda environment
conda create -n localGPT python=3.10.9 -y
conda activate localGPT
pip install --quiet ipykernel
python -m ipykernel install --user --name localGPT --display-name localGPT

# Install PyTorch
pip install torch torchvision torchaudio

# Install LocalGPT
git clone https://github.com/PromtEngineer/localGPT
cd localGPT
pip install -r requirements.txt

# Install xformers
pip install xformers

# Install AutoGPTQ
pip install auto-gptq
# If you have problems installing AutoGPTQ, please build from source instead:
# git clone https://github.com/PanQiWei/AutoGPTQ
# cd AutoGPTQ
# pip3 install .
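
Before moving on, it is worth confirming that PyTorch can actually see the instance's GPU (a g5.2xlarge has a single NVIDIA A10G):

# Sanity check: CUDA availability and GPU name
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
nvidia-smi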

Step 3 — Run LocalGPT

LLM Model

Choose your MODEL_ID and MODEL_BASENAME in constants.py.

# for GPTQ (quantized) models
MODEL_ID = "TheBloke/Llama-2-7b-Chat-GPTQ"
MODEL_BASENAME = "model.safetensors"

GPU VRAM required by model size (billions of parameters):

(B Model)   (float32)   (float16)   (GPTQ 8bit)   (GPTQ 4bit)
7b          28 GB       14 GB       7 GB          3.5 GB
13b         52 GB       26 GB       13 GB         6.5 GB
32b         130 GB      65 GB       32.5 GB       16.25 GB
65b         260.8 GB    130.4 GB    65.2 GB       32.6 GB
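
These figures follow from a simple rule of thumb: required VRAM ≈ parameter count × bytes per parameter. A 7B model at float32 (4 bytes per parameter) needs about 28 GB, float16 halves that to 14 GB, and 8-bit and 4-bit GPTQ quantization cut it to roughly 7 GB and 3.5 GB, plus some headroom for activations and the KV cache.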

Upload your documents

Upload your documents to the SOURCE_DOCUMENTS folder using the JupyterLab File Browser.


Ingest documents

# defaults to cuda
python ingest.py
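
If you need to override the device, ingest.py accepts a device-type flag (flag name as of the June 2023 repo; run python ingest.py --help to confirm):

# Run ingestion on the CPU instead of the default CUDA device
python ingest.py --device_type cpu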

Run LocalGPT

source /home/ubuntu/miniconda/bin/activate
conda activate localGPT

# defaults to cuda
python run_localGPT.py

Wait for the script to prompt for input.

> Enter a query:

Type a query and press Enter.

Type exit to finish the script.

Run LocalGPT UI

LLM Model

As before, choose your MODEL_ID and MODEL_BASENAME in constants.py.

Run LocalGPT API

# Terminal 1
source /home/ubuntu/miniconda/bin/activate
conda activate localGPT
python run_localGPT_API.py

# Running on http://127.0.0.1:5110

Run LocalGPT UI

# Terminal 2
source /home/ubuntu/miniconda/bin/activate
conda activate localGPT
streamlit run localGPT_UI.py --browser.serverAddress localhost

Set up SSH tunnel using local port forwarding to LocalGPT UI

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 8501:localhost:8501 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com

Open LocalGPT UI in your local browser

Open http://localhost:8501/ in your local browser after setting up the SSH tunnel.


