How to run Text Generation Web UI (TGWUI/Oobabooga) on Amazon EC2
Text Generation Web UI (TGWUI/Oobabooga)
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, llama.cpp (ggml), Llama models.
Step 1 — Launch Amazon EC2 Instance
Create an Amazon EC2 instance using AWS CloudFormation with the following settings:
- Region: us-east-1
- AMI: ami-0649417d1ede3c91a (Deep Learning AMI)
- Instance type: g5.2xlarge
- EBS volume: 500 GB
TextGen-WebUI.yaml

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: EC2 Instance
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: AWS::EC2::KeyPair::KeyName
    ConstraintDescription: must be the name of an existing EC2 KeyPair.
Mappings:
  RegionToAmiId:
    us-east-1:
      AMI: ami-0649417d1ede3c91a
Resources:
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub ${AWS::StackName}-sg
      GroupDescription: Security group for EC2 instance
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: g5.2xlarge
      ImageId: !FindInMap [RegionToAmiId, !Ref AWS::Region, AMI]
      KeyName: !Ref KeyName
      BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
            VolumeSize: 500
            VolumeType: gp2
      Tags:
        - Key: Name
          Value: textgen-webui
      SecurityGroups:
        - !Ref SecurityGroup
Outputs:
  PublicDNS:
    Description: Public DNS name of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicDnsName]
  PublicIP:
    Description: Public IP address of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicIp]
```
AWS CloudFormation > Create stack
AWS CloudFormation — Step 1 Create stack
Upload the TextGen-WebUI.yaml template file and choose Next.
AWS CloudFormation — Step 2 Specify stack details
Specify the Stack name and KeyName, then choose Next.
AWS CloudFormation — Step 3 Configure stack options
Keep the default settings and choose Next.
AWS CloudFormation — Step 4 Review and Submit.
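The same stack can also be created from the command line instead of the console. A sketch using the AWS CLI, assuming credentials are configured for us-east-1, the template file is in the current directory, and your key pair is named us-east-1-key:

```shell
# Create the stack from the template (key pair name is an example; use your own)
aws cloudformation create-stack \
  --stack-name textgen-webui \
  --region us-east-1 \
  --template-body file://TextGen-WebUI.yaml \
  --parameters ParameterKey=KeyName,ParameterValue=us-east-1-key

# Wait for creation to finish, then read the outputs (public DNS and IP)
aws cloudformation wait stack-create-complete \
  --stack-name textgen-webui --region us-east-1
aws cloudformation describe-stacks \
  --stack-name textgen-webui --region us-east-1 \
  --query "Stacks[0].Outputs"
```

The final `describe-stacks` call returns the PublicDNS and PublicIP values defined in the template's Outputs section, which you will need for the SSH steps below.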
Step 2 — Install TextGen Web UI
SSH to the Amazon EC2 instance and start JupyterLab.

```shell
# Start JupyterLab
cd /home/ubuntu
jupyter lab --notebook-dir=/home/ubuntu
```
Set up an SSH tunnel using local port forwarding to JupyterLab.

```shell
# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 8888:localhost:8888 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com
```
Open JupyterLab in your local browser
To access JupyterLab, open http://127.0.0.1:8888/lab or http://localhost:8888/lab (with the token) in your local browser after setting up the SSH tunnel.
TextGen Web UI Installation
Install TextGen Web UI from JupyterLab terminal.
```shell
# Install Conda
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh -b -u -p miniconda
source /home/ubuntu/miniconda/bin/activate

# Create a new conda environment
conda create -n textgen python=3.10.9 -y
conda activate textgen
pip install --quiet ipykernel
python -m ipykernel install --user --name TextGen --display-name TextGen

# Install PyTorch
pip3 install torch torchvision torchaudio

# Install the web UI
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
```
Step 3 — Start TextGen Web UI
Start TextGen Web UI
```shell
# Start TextGen Web UI
source /home/ubuntu/miniconda/bin/activate
conda activate textgen
cd text-generation-webui
python server.py
```

```
(textgen) ubuntu@ip-172-31-68-63:~/text-generation-webui$ python server.py
2023-08-21 04:53:17 INFO:Loading the extension "gallery"...
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
```
Set up an SSH tunnel using local port forwarding to the TextGen Web UI.

```shell
# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 7860:localhost:7860 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com
```
Open TextGen Web UI in your local browser
To access the TextGen Web UI, open http://127.0.0.1:7860/ in your local browser after setting up the SSH tunnel.
Download Open Source LLM Model
Model > Download custom model or LoRA: enter TheBloke/Llama-2-7b-Chat-GPTQ and click Download.
Model > click the refresh button and select the downloaded model.
Model > Model loader: select ExLlama, then click Load.
Prompt template: Llama-2-Chat

```
[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

{prompt} [/INST]
```
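When calling the model outside the UI, the same template can be applied programmatically. A minimal sketch (the function name is ours, not part of text-generation-webui):

```python
def format_llama2_chat(user_prompt: str, system_prompt: str) -> str:
    """Wrap a user prompt in the Llama-2-Chat template shown above."""
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_prompt} [/INST]"

# Build the prompt used in the example below
prompt = format_llama2_chat(
    "ELI5 Generative AI with examples.",
    "You are a helpful, respectful and honest assistant.",
)
print(prompt)
```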
- Parameters > max_new_tokens: 2048
- Prompt: ELI5 Generative AI with examples.
Example output of “ELI5 Generative AI with examples.” prompt.
Of course! I’d be happy to help explain generative AI in a safe and respectful manner. 😊
Generative AI refers to a type of artificial intelligence that can create new and original content, such as images, videos, music, text, or even entire conversations. This is different from other types of AI that focus on recognizing patterns or making predictions based on existing data.
Here are some examples of how generative AI works:
1. Image generation: Imagine you have a picture of a cat. Generative AI can take this image and create new, unique pictures of cats that look like they were drawn by a talented artist. These new images could be used in various applications, such as creating realistic-looking avatars for video games or generating artwork for websites.
2. Text generation: Generative AI can write text that sounds like it was written by a human. This could be useful for tasks like writing articles, summarizing long documents, or even composing entire stories. For example, if you gave me a prompt about a fictional world, I could use generative AI to generate an entire story set in that world, complete with characters, plot twists, and descriptions of the setting. 📚
3. Music generation: Imagine you want to create a new song but don’t have any musical skills. Generative AI can help! By analyzing existing songs and understanding the structure of music, generative AI can create entirely new songs that sound like they were composed by a human musician. This could be used for various applications, such as creating background music for videos or generating soundtracks for video games. 🎵
4. Voice generation: Generative AI can also generate realistic-sounding voices, which could be useful in various applications such as voice assistants, language learning tools, or even film and television productions. Imagine being able to create a virtual assistant with a unique personality and tone — it’s possible with generative AI! 💬
These are just a few examples of what generative AI can do. The possibilities are endless, and I hope this explanation has helped you understand the concept better! If you have any questions or need further clarification, feel free to ask. 😊
Model source: https://huggingface.co/TheBloke/Llama-2-7b-Chat-GPTQ
TextGen API
GitHub: oobabooga/text-generation-webui, a Gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
Once text-generation-webui is installed and confirmed working via the web interface, please enable the api option either through the web model configuration tab, or by adding the run-time arg --api to your start command.
Source: https://python.langchain.com/docs/integrations/llms/textgen
Start TextGen Web UI API
```shell
# Start TextGen Web UI API
source /home/ubuntu/miniconda/bin/activate
conda activate textgen
cd text-generation-webui
python server.py --listen --api
```

```
(textgen) ubuntu@ip-172-31-68-63:~/text-generation-webui$ python server.py --listen --api
Starting streaming server at ws://0.0.0.0:5005/api/v1/stream
2023-08-21 05:28:25 INFO:Loading the extension "gallery"...
Starting API at http://0.0.0.0:5000/api
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
```
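With --api enabled, the blocking endpoint on port 5000 accepts JSON POST requests. A client sketch for this version of the API, using only the standard library (the helper names are ours; adjust the host to your SSH tunnel, and tunnel port 5000 the same way as 7860 above):

```python
import json
import urllib.request

def build_generate_request(prompt: str, max_new_tokens: int = 200) -> dict:
    """Build a request body for TextGen's /api/v1/generate endpoint."""
    return {"prompt": prompt, "max_new_tokens": max_new_tokens}

def generate(prompt: str, host: str = "http://localhost:5000") -> str:
    """POST the prompt to the blocking API and return the generated text."""
    body = json.dumps(build_generate_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/v1/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Response shape assumed: {"results": [{"text": "..."}]}
        return json.loads(resp.read())["results"][0]["text"]
```

Calling `generate("ELI5 Generative AI")` requires the server (or an SSH tunnel to port 5000) to be running.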
OpenAI Extension
An OpenedAI API (OpenAI-like): this extension creates an API that works much like OpenAI's (i.e. api.openai.com).
```shell
# Install the requirements
pip install flask_cloudflared==0.0.12 sentence-transformers tiktoken

# Enable the openai extension
python server.py --extensions openai
```

Sample .env file:

```
OPENAI_API_KEY=sk-111111111111111111111111111111111111111111111111
OPENAI_API_BASE=http://127.0.0.1:5001/v1
```
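Because the extension exposes an OpenAI-style endpoint (port 5001 by default), any OpenAI-compatible client can point at it. A standard-library sketch that builds a chat completions request from the sample .env values above (the helper name and the "local" model placeholder are ours):

```python
import json
import os
import urllib.request

def build_chat_request(base: str, key: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-style chat completions request for the local endpoint."""
    body = json.dumps({"model": "local", "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
    )

# Fall back to the sample .env values when the variables are not set
base = os.environ.get("OPENAI_API_BASE", "http://127.0.0.1:5001/v1")
key = os.environ.get("OPENAI_API_KEY", "sk-111111111111111111111111111111111111111111111111")
req = build_chat_request(base, key, [{"role": "user", "content": "Hello"}])
```

Sending the request with `urllib.request.urlopen(req)` requires the server to be running with the openai extension enabled.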
Models
- TheBloke/vicuna-13b-v1.3.0-GPTQ
- TheBloke/stable-vicuna-13B-GPTQ
- TheBloke/airoboros-13B-gpt4-1.3-GPTQ
Useful Links:
- https://github.com/oobabooga/text-generation-webui
- https://python.langchain.com/docs/integrations/llms/textgen
- https://github.com/flurb18/AgentOoba
- Extensions
- An OpenedAI API Extension (OpenAI-like): creates an API that works much like OpenAI's (i.e. api.openai.com).