How to run h2oGPT on Amazon EC2

4 min read · Jul 6, 2023

h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document question-answering capabilities. Documents help ground LLMs against hallucinations by providing context relevant to the instruction. h2oGPT is a fully permissive, Apache 2.0-licensed open-source project for 100% private and secure use of LLMs and document embeddings for document question answering.
Reference: https://github.com/h2oai/h2ogpt

Step 1 — Launch Amazon EC2 instance

Create an Amazon EC2 instance using AWS CloudFormation

Region: us-east-1
AMI: ami-0649417d1ede3c91a (Deep Learning AMI)
Instance: g5.2xlarge
EBS volume: 500GB

h2oGPT.yaml

AWSTemplateFormatVersion: '2010-09-09'
Description: EC2 Instance
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: AWS::EC2::KeyPair::KeyName
    ConstraintDescription: must be the name of an existing EC2 KeyPair.

Mappings:
  RegionToAmiId:
    us-east-1:
      AMI: ami-0649417d1ede3c91a

Resources:
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub ${AWS::StackName}-sg
      GroupDescription: Security group for EC2 instance
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: g5.2xlarge
      ImageId: !FindInMap [RegionToAmiId, !Ref AWS::Region, AMI]
      KeyName: !Ref KeyName
      BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
            VolumeSize: 500
            VolumeType: gp2
      Tags:
        - Key: Name
          Value: h2oGPT
      SecurityGroups:
        - Ref: SecurityGroup

Outputs:
  PublicDNS:
    Description: Public DNSName of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicDnsName]
  PublicIP:
    Description: Public IP address of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicIp]

AWS CloudFormation > Create stack

AWS CloudFormation — Step 1 Create stack

Upload the h2oGPT.yaml template file and click Next.

AWS CloudFormation — Step 2 Specify stack details

Specify the Stack name and KeyName, then click Next.

AWS CloudFormation — Step 3 Configure stack options

Keep the default settings and click Next.

AWS CloudFormation — Step 4 Review and Submit.
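The same four console steps can also be scripted. The sketch below assumes the AWS CLI is installed and configured with credentials, and that the template above is saved as h2oGPT.yaml in the current directory. The function name launch_stack and the stack and key names in the usage line are placeholders for illustration, not part of the original walkthrough.

```shell
# Sketch: create the stack from h2oGPT.yaml with the AWS CLI instead of the console.
# Assumptions: AWS CLI installed and configured; launch_stack is a hypothetical helper name.
launch_stack() (
  stack_name="$1"               # CloudFormation stack name (console step 2)
  key_name="$2"                 # existing EC2 key pair name (the KeyName parameter)
  template="${3:-h2oGPT.yaml}"  # template file (console step 1)
  region="us-east-1"            # the template only maps an AMI for this region

  # Steps 1-4: upload the template, pass KeyName, keep default options, submit.
  aws cloudformation create-stack \
    --region "$region" \
    --stack-name "$stack_name" \
    --template-body "file://$template" \
    --parameters "ParameterKey=KeyName,ParameterValue=$key_name" || exit 1

  # Block until the stack reaches CREATE_COMPLETE (fails if creation rolls back).
  aws cloudformation wait stack-create-complete \
    --region "$region" --stack-name "$stack_name" || exit 1

  # Print the PublicDNS and PublicIP outputs defined in the template.
  aws cloudformation describe-stacks \
    --region "$region" --stack-name "$stack_name" \
    --query "Stacks[0].Outputs" --output table
)

# Usage (placeholder names):
#   launch_stack h2ogpt-stack us-east-1-key
```

The function body runs in a subshell, so a failed step stops the sequence without closing your terminal session.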

Step 2 — Install h2oGPT

SSH into the Amazon EC2 instance and start JupyterLab

# Start JupyterLab
cd /home/ubuntu
jupyter lab --notebook-dir=/home/ubuntu

Set up SSH tunnel using local port forwarding to JupyterLab

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 8888:localhost:8888 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com
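
Since the same kind of tunnel is needed again later for the web UI, it may help to see how the pieces of the syntax map onto the command. This is only a sketch: tunnel_cmd is a made-up helper name, and it prints the command rather than running it so you can review it first.

```shell
# Sketch: build the local port-forwarding command for a given port.
# tunnel_cmd is a hypothetical helper; it only prints the ssh command.
tunnel_cmd() (
  key_file="$1"   # path to the private key (.pem) of the EC2 key pair
  host="$2"       # public DNS name of the instance (the PublicDNS stack output)
  port="$3"       # same port on both ends, e.g. 8888 for JupyterLab
  # -N: do not run a remote command
  # -L: forward local $port to localhost:$port as seen from the instance
  printf 'ssh -i "%s" -N -L %s:localhost:%s ubuntu@%s\n' \
    "$key_file" "$port" "$port" "$host"
)

# Usage (host name is a placeholder):
#   tunnel_cmd us-east-1-key.pem ec2-host.example 8888
```

Here localhost is resolved on the instance, which is why the tunnel reaches a JupyterLab server that listens only on the instance's loopback interface.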

Open JupyterLab in your local browser

To access JupyterLab, set up the SSH tunnel first, then open http://127.0.0.1:8888/lab or http://localhost:8888/lab in your local browser and enter the token.
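JupyterLab prints the token in its startup log. If that log has scrolled away, the token can usually be recovered from the list of running servers. The sketch below assumes the list command prints lines that contain a URL with ?token=... in it; extract_token is a made-up helper name, and the exact list command depends on your Jupyter version (jupyter server list is one that exists in recent installs).

```shell
# Sketch: pull the login token out of a running-server listing read from stdin.
# extract_token is a hypothetical helper name.
extract_token() {
  # Keep only the token value from the first URL that carries one.
  sed -n 's/.*[?&]token=\([A-Za-z0-9]*\).*/\1/p' | head -n 1
}

# Usage on the EC2 instance (assumes the command prints URLs with ?token=...):
#   jupyter server list | extract_token
```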

h2oGPT Installation

Install h2oGPT from a JupyterLab terminal.

# Install Conda
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh -b -u -p miniconda
source /home/ubuntu/miniconda/bin/activate

# Create a new conda environment
conda create -n h2ogpt python=3.10.9 -y
conda activate h2ogpt
pip install --quiet ipykernel
python -m ipykernel install --user --name h2oGPT --display-name h2oGPT

Clone GitHub Repo

# Clone GitHub Repo
git clone https://github.com/h2oai/h2ogpt.git
cd h2ogpt

Install CUDA ToolKit

# Install CUDA ToolKit
conda install cudatoolkit-dev -c conda-forge -y
export CUDA_HOME=$CONDA_PREFIX

# Install dependencies
# GPU only:
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu117
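
Before moving on, it can be worth checking that the GPU is visible both to the driver and to the PyTorch build that was just installed. The following is a sketch, not part of the h2oGPT instructions: check_gpu is a made-up helper name, and it assumes the h2ogpt conda environment is active so that python resolves to the right interpreter.

```shell
# Sketch: confirm the GPU is visible before loading a model.
# check_gpu is a hypothetical helper; it relies only on nvidia-smi and PyTorch.
check_gpu() {
  # The NVIDIA driver must be present for any CUDA build of PyTorch to work.
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: is the NVIDIA driver installed?" >&2
    return 1
  fi
  # List each GPU with its total memory (a g5.2xlarge has one NVIDIA A10G).
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader || return 1
  # Ask the PyTorch in the active environment whether it can use CUDA.
  python -c 'import torch; print("torch", torch.__version__, "cuda available:", torch.cuda.is_available())'
}

# Usage:
#   check_gpu
```

If the last line reports that CUDA is not available, the CPU-only PyTorch wheel was probably installed, and rerunning the pip command above with the extra index is the first thing to try.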

Install document question-answer dependencies

# May be required for jq package:
sudo apt-get install autoconf libtool
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
# Required for CPU: LLaMa/GPT4All:
pip install -r reqs_optional/requirements_optional_gpt4all.txt
# Optional: PyMuPDF/ArXiv:
pip install -r reqs_optional/requirements_optional_langchain.gpllike.txt
# Optional: Selenium/PlayWright:
pip install -r reqs_optional/requirements_optional_langchain.urls.txt
# Optional: support docx, pptx, ArXiv, etc. required by some python packages
sudo apt-get install -y libmagic-dev poppler-utils tesseract-ocr libtesseract-dev libreoffice
# Optional: for supporting unstructured package
python -m nltk.downloader all
# Optional but required for PlayWright
playwright install --with-deps

Step 3 — Start h2oGPT

python generate.py \
--base_model=h2oai/h2ogpt-4096-llama2-13b-chat \
--load_8bit=True \
--score_model=None \
--langchain_mode='UserData' \
--user_path=user_path
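
The --user_path flag points at a folder of documents for the UserData collection. A minimal sketch for filling that folder before launch follows. It assumes, as the flag name suggests, that h2oGPT ingests the files it finds there at startup; stage_docs is a made-up helper name and the source folder in the usage line is a placeholder.

```shell
# Sketch: copy local documents into the folder passed as --user_path.
# stage_docs is a hypothetical helper; run it from the h2ogpt directory before generate.py.
stage_docs() (
  src_dir="$1"                  # folder holding your PDFs, text files, etc.
  user_path="${2:-user_path}"   # must match the --user_path value
  mkdir -p "$user_path" || exit 1
  # Copy regular files only; subfolders are skipped to keep the sketch simple.
  find "$src_dir" -maxdepth 1 -type f -exec cp {} "$user_path"/ \; || exit 1
  # Report how many files are now staged.
  find "$user_path" -type f | wc -l | tr -d ' '
)

# Usage (placeholder source folder):
#   stage_docs /home/ubuntu/my_docs user_path
```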

Set up SSH tunnel using local port forwarding to h2oGPT Web UI

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 7860:localhost:7860 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com

Open h2oGPT Web UI in your local browser

To access the h2oGPT Web UI, set up the SSH tunnel first, then open http://127.0.0.1:7860/ in your local browser.
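On the first start h2oGPT has to download the model weights, so the page may not respond right away. One way to avoid refreshing the browser by hand is to poll the forwarded port from your local machine. This is a sketch under the assumption that curl is installed locally; wait_for_ui is a made-up helper name and the retry counts are arbitrary defaults.

```shell
# Sketch: poll the forwarded port until the web UI answers.
# wait_for_ui is a hypothetical helper; 7860 is the port used in this guide.
wait_for_ui() (
  url="${1:-http://127.0.0.1:7860/}"
  tries="${2:-30}"   # number of attempts
  delay="${3:-10}"   # seconds to wait after a failed attempt
  i=1
  while [ "$i" -le "$tries" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o /dev/null: discard the page body
    if curl -sf -o /dev/null "$url"; then
      echo "UI is up at $url"
      exit 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no response from $url after $tries attempts" >&2
  exit 1
)

# Usage (with the SSH tunnel already running):
#   wait_for_ui
```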

Useful Links:

h2oGPT GitHub: https://github.com/h2oai/h2ogpt
Live Demo: https://gpt.h2o.ai/, https://gpt-gm.h2o.ai/
Technical Paper: https://arxiv.org/pdf/2306.08161.pdf

Written by David Min