How to run h2oGPT on Amazon EC2

David Min
Jul 6, 2023

h2oGPT is a framework for fine-tuning large language models (LLMs) and a chatbot UI with document question-answering capabilities. Documents help ground LLMs against hallucinations by giving them context relevant to the instruction. h2oGPT is a fully permissive, Apache 2.0-licensed open-source project for 100% private and secure use of LLMs and document embeddings for document question answering.

Reference: https://github.com/h2oai/h2ogpt

Step 1 — Launch Amazon EC2 instance

Create an Amazon EC2 instance using AWS CloudFormation with the following settings:

  • Region: us-east-1
  • AMI: ami-0649417d1ede3c91a (Deep Learning AMI)
  • Instance: g5.2xlarge
  • EBS volume: 500GB

h2oGPT.yaml

AWSTemplateFormatVersion: '2010-09-09'
Description: EC2 Instance
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: AWS::EC2::KeyPair::KeyName
    ConstraintDescription: must be the name of an existing EC2 KeyPair.

Mappings:
  RegionToAmiId:
    us-east-1:
      AMI: ami-0649417d1ede3c91a

Resources:
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub ${AWS::StackName}-sg
      GroupDescription: Security group for EC2 instance
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: g5.2xlarge
      ImageId: !FindInMap [RegionToAmiId, !Ref AWS::Region, AMI]
      KeyName: !Ref KeyName
      BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
            VolumeSize: 500
            VolumeType: gp2
      Tags:
        - Key: Name
          Value: h2oGPT
      SecurityGroups:
        - Ref: SecurityGroup

Outputs:
  PublicDNS:
    Description: Public DNSName of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicDnsName]
  PublicIP:
    Description: Public IP address of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicIp]
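
The template opens port 22 to 0.0.0.0/0, which means any address can attempt an SSH login. If that is broader than you want, the rule can be narrowed to a single address before the template is uploaded. The following is only a sketch: restrict_ssh_cidr is a hypothetical helper name, the template is assumed to be saved as h2oGPT.yaml, and the commented-out lookup relies on the AWS-operated checkip.amazonaws.com endpoint to report your public IP.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the SSH ingress rule of a template so that
# only one IPv4 address (as a /32 CIDR) can reach port 22.
# Usage: restrict_ssh_cidr <template-file> <ipv4-address>
restrict_ssh_cidr() {
  template="$1"
  ip="$2"
  # Reject empty input or anything containing characters other than digits and dots.
  case "$ip" in
    *[!0-9.]*|"") echo "not an IPv4 address: $ip" >&2; return 1 ;;
  esac
  # Write the narrowed template to stdout; the original file is left untouched.
  sed "s#CidrIp: 0\.0\.0\.0/0#CidrIp: ${ip}/32#" "$template"
}

# Example (commented out because it needs network access):
# my_ip="$(curl -s https://checkip.amazonaws.com)"
# restrict_ssh_cidr h2oGPT.yaml "$my_ip" > h2oGPT-restricted.yaml
```

If your public IP changes later, the security group rule has to be updated before SSH will work again.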

AWS CloudFormation > Create stack

AWS CloudFormation — Step 1 Create stack

Upload the h2oGPT.yaml template file and click Next.

AWS CloudFormation — Step 2 Specify stack details

Specify the stack name and KeyName, then click Next.

AWS CloudFormation — Step 3 Configure stack options

Keep the default settings and click Next.

AWS CloudFormation — Step 4 Review and Submit.
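
The same four console steps can also be driven from the AWS CLI. The sketch below assumes the AWS CLI is installed and configured with credentials, and that the template is saved as h2oGPT.yaml in the current directory. STACK_NAME, KEY_NAME, DRY_RUN, run, and deploy_stack are illustrative names rather than anything the template requires, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumed values: adjust to your account. STACK_NAME and KEY_NAME are placeholders.
STACK_NAME="${STACK_NAME:-h2ogpt}"
KEY_NAME="${KEY_NAME:-us-east-1-key}"
REGION="${REGION:-us-east-1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

deploy_stack() {
  # Create the stack from the local template file.
  run aws cloudformation create-stack \
    --region "$REGION" \
    --stack-name "$STACK_NAME" \
    --template-body file://h2oGPT.yaml \
    --parameters "ParameterKey=KeyName,ParameterValue=$KEY_NAME"
  # Block until CloudFormation reports CREATE_COMPLETE (or fails).
  run aws cloudformation wait stack-create-complete \
    --region "$REGION" --stack-name "$STACK_NAME"
  # Show the PublicDNS and PublicIP outputs defined in the template.
  run aws cloudformation describe-stacks \
    --region "$REGION" --stack-name "$STACK_NAME" \
    --query "Stacks[0].Outputs" --output table
}

deploy_stack
```

Once you are done with the instance, `aws cloudformation delete-stack --stack-name <name>` removes the stack and stops the g5.2xlarge charges.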

Step 2 — Install h2oGPT

SSH into the Amazon EC2 instance and start JupyterLab

# Start JupyterLab
cd /home/ubuntu
jupyter lab --notebook-dir=/home/ubuntu

Set up an SSH tunnel using local port forwarding to JupyterLab

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 8888:localhost:8888 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com
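
Since the same tunnel pattern is needed again later for a different port, it can help to generate the command from its three variable parts. This is a sketch only: tunnel_cmd and EC2_HOST are hypothetical names, and the helper just prints the command so you can inspect it before running it. The added ExitOnForwardFailure option is a standard OpenSSH setting that makes ssh exit instead of silently continuing when the local port cannot be bound.

```shell
#!/bin/sh
# Hypothetical helper: print an ssh command that forwards one local port to
# the same port on the instance. -N skips running a remote command and
# ExitOnForwardFailure makes ssh exit if the port forwarding cannot be set up.
# Usage: tunnel_cmd <key-file> <host> <port>
tunnel_cmd() {
  printf 'ssh -i "%s" -N -o ExitOnForwardFailure=yes -L %s:localhost:%s ubuntu@%s\n' \
    "$1" "$3" "$3" "$2"
}

# EC2_HOST is a placeholder; use the PublicDNS value from the stack outputs.
EC2_HOST="${EC2_HOST:-ec2-host-placeholder.compute-1.amazonaws.com}"
tunnel_cmd us-east-1-key.pem "$EC2_HOST" 8888
```

Run the printed command in a separate local terminal and leave it open for as long as you need the tunnel.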

Open JupyterLab in your local browser

To access JupyterLab, set up the SSH tunnel first, then open http://127.0.0.1:8888/lab or http://localhost:8888/lab in your local browser and enter the token that JupyterLab printed when it started.
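
If the startup output with the token has scrolled away, the token can usually be recovered from a second SSH session on the instance. The sketch below assumes `jupyter server list` is available (older installations may need `jupyter notebook list` instead), and extract_token is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read server-list lines on stdin and print the token
# from the first URL that carries one (format: http://host:port/?token=... :: dir).
extract_token() {
  sed -n 's/.*[?&]token=\([0-9A-Za-z]*\).*/\1/p' | head -n 1
}

# On the EC2 instance (commented out here because it needs a running server):
# jupyter server list | extract_token

# Self-contained demonstration with a made-up line:
echo "http://localhost:8888/?token=0123abcd :: /home/ubuntu" | extract_token
```

The demonstration line prints `0123abcd`; on the instance the output is the real token to paste into the login page.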

h2oGPT Installation

Install h2oGPT from a JupyterLab terminal.

# Install Conda
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh -b -u -p miniconda
source /home/ubuntu/miniconda/bin/activate

# Create a new conda environment
conda create -n h2ogpt python=3.10.9 -y
conda activate h2ogpt
pip install --quiet ipykernel
python -m ipykernel install --user --name h2oGPT --display-name h2oGPT
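
Before moving on, it is worth confirming that the new environment is the one actually in use, since later pip installs go into whichever Python is active. A minimal sketch, with is_python_310 as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a "Python X.Y.Z" version string
# is a 3.10 release, matching the python=3.10.9 pin used above.
is_python_310() {
  case "$1" in
    "Python 3.10."*) return 0 ;;
    *) return 1 ;;
  esac
}

# On the instance, inside the activated environment (commented out here):
# is_python_310 "$(python --version 2>&1)" && echo "h2ogpt env is active"
# jupyter kernelspec list   # should list the kernel registered above
```

If the check fails, run `conda activate h2ogpt` again in that terminal.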

Clone the GitHub repo

# Clone the GitHub repo
git clone https://github.com/h2oai/h2ogpt.git
cd h2ogpt

Install CUDA ToolKit

# Install CUDA ToolKit
conda install cudatoolkit-dev -c conda-forge -y
export CUDA_HOME=$CONDA_PREFIX

# Install dependencies
# GPU only:
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu117
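
After the GPU dependencies are installed, a quick check that both the driver and PyTorch can see the GPU can save a confusing failure at model load time. This is a sketch: have and gpu_check are hypothetical helper names, and the function only reports what it finds.

```shell
#!/bin/sh
# Return success if a command is on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# Hypothetical helper: report GPU visibility at the driver level and from PyTorch.
gpu_check() {
  if have nvidia-smi; then
    # Driver view: GPU model and total memory.
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
  else
    echo "nvidia-smi not found"
  fi
  if have python && python -c "import torch" >/dev/null 2>&1; then
    # PyTorch view: prints True when the CUDA build can see the GPU.
    python -c "import torch; print(torch.cuda.is_available())"
  else
    echo "torch not importable in this environment"
  fi
}

gpu_check
```

On the instance, the second line should read `True`; `False` usually points to a CPU-only PyTorch build or an inactive conda environment.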

Install document question-answer dependencies

# May be required for jq package:
sudo apt-get install -y autoconf libtool
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
# Required for CPU: LLaMa/GPT4All:
pip install -r reqs_optional/requirements_optional_gpt4all.txt
# Optional: PyMuPDF/ArXiv:
pip install -r reqs_optional/requirements_optional_langchain.gpllike.txt
# Optional: Selenium/PlayWright:
pip install -r reqs_optional/requirements_optional_langchain.urls.txt
# Optional: support docx, pptx, ArXiv, etc. required by some python packages
sudo apt-get install -y libmagic-dev poppler-utils tesseract-ocr libtesseract-dev libreoffice
# Optional: for supporting unstructured package
python -m nltk.downloader all
# Optional but required for PlayWright
playwright install --with-deps

Step 3 — Start h2oGPT

python generate.py \
--base_model=h2oai/h2ogpt-4096-llama2-13b-chat \
--load_8bit=True \
--score_model=None \
--langchain_mode='UserData' \
--user_path=user_path
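
The --load_8bit=True flag matters on this instance size. As a rough back-of-the-envelope estimate, the sketch below counts weight memory only (activations and the attention cache need extra room) and assumes the single 24 GB NVIDIA A10G GPU that a g5.2xlarge provides. weights_gb is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: approximate weight memory in GB for a model with
# <billions> parameters stored at <bits> bits each (weights only).
# Usage: weights_gb <billions-of-parameters> <bits-per-parameter>
weights_gb() {
  echo $(( $1 * $2 / 8 ))
}

GPU_GB=24   # assumption: one A10G, as on a g5.2xlarge

echo "13B at 16-bit: $(weights_gb 13 16) GB of weights (GPU has ${GPU_GB} GB)"
echo "13B at 8-bit:  $(weights_gb 13 8) GB of weights (GPU has ${GPU_GB} GB)"
```

At 16-bit precision the 13B weights alone (about 26 GB) exceed the GPU's memory, while 8-bit loading brings them down to about 13 GB and leaves headroom for inference.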

Set up an SSH tunnel using local port forwarding to the h2oGPT Web UI

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 7860:localhost:7860 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com

Open the h2oGPT Web UI in your local browser

To access the h2oGPT Web UI, set up the SSH tunnel first, then open http://127.0.0.1:7860/ in your local browser.
