How to run Text Generation Web UI (TGWUI/Oobabooga) on Amazon EC2

David Min
7 min read · Jun 9, 2023

Stable Diffusion AI Art (Stable Diffusion XL)

Text Generation Web UI (TGWUI/Oobabooga)

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, llama.cpp (ggml), Llama models.

Source: https://github.com/oobabooga/text-generation-webui

Step 1 — Launch Amazon EC2 Instance

Create an Amazon EC2 instance using AWS CloudFormation with the following settings:

  • Region: us-east-1
  • AMI: ami-0649417d1ede3c91a (Deep Learning AMI)
  • Instance: g5.2xlarge
  • EBS volume: 500GB

TextGen-WebUI.yaml

AWSTemplateFormatVersion: '2010-09-09'
Description: EC2 Instance
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: AWS::EC2::KeyPair::KeyName
    ConstraintDescription: must be the name of an existing EC2 KeyPair.

Mappings:
  RegionToAmiId:
    us-east-1:
      AMI: ami-0649417d1ede3c91a

Resources:
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub ${AWS::StackName}-sg
      GroupDescription: Security group for EC2 instance
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: g5.2xlarge
      ImageId: !FindInMap [RegionToAmiId, !Ref AWS::Region, AMI]
      KeyName: !Ref KeyName
      BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
            VolumeSize: 500
            VolumeType: gp2
      Tags:
        - Key: Name
          Value: textgen-webui
      SecurityGroups:
        - Ref: SecurityGroup

Outputs:
  PublicDNS:
    Description: Public DNSName of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicDnsName]
  PublicIP:
    Description: Public IP address of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicIp]

In the AWS console, go to AWS CloudFormation > Create stack and work through the four steps:

  • Step 1, Create stack: upload the TextGen-WebUI.yaml template file and choose Next.
  • Step 2, Specify stack details: enter a stack name, select the KeyName, and choose Next.
  • Step 3, Configure stack options: keep the default settings and choose Next.
  • Step 4, Review: check the settings and choose Submit.
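The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 library is installed and AWS credentials are already configured. The helper names, the stack name "textgen-webui", and the key pair name "us-east-1-key" are hypothetical placeholders, not part of the original walkthrough.

```python
from pathlib import Path


def build_create_stack_args(stack_name, template_body, key_name):
    """Build the keyword arguments for CloudFormation's create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # KeyName is the only parameter declared in TextGen-WebUI.yaml.
        "Parameters": [{"ParameterKey": "KeyName", "ParameterValue": key_name}],
    }


def create_stack(stack_name, template_path, key_name, region="us-east-1"):
    """Create the stack, wait for completion, and return its outputs as a dict."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("cloudformation", region_name=region)
    args = build_create_stack_args(
        stack_name, Path(template_path).read_text(), key_name
    )
    client.create_stack(**args)
    client.get_waiter("stack_create_complete").wait(StackName=stack_name)
    # The template's Outputs section exposes PublicDNS and PublicIP.
    stack = client.describe_stacks(StackName=stack_name)["Stacks"][0]
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}
```

Calling create_stack("textgen-webui", "TextGen-WebUI.yaml", "us-east-1-key") should return the PublicDNS and PublicIP values used for the SSH commands later in this article.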

Step 2 — Install TextGen Web UI

SSH into the Amazon EC2 instance and start JupyterLab

# Start JupyterLab
cd /home/ubuntu
jupyter lab --notebook-dir=/home/ubuntu

Set up an SSH tunnel to JupyterLab using local port forwarding

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 8888:localhost:8888 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com

Open JupyterLab in your local browser

Once the SSH tunnel is up, open http://127.0.0.1:8888/lab or http://localhost:8888/lab in your local browser and enter the token that JupyterLab printed when it started.
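The port-forwarding syntax above maps onto a short argument list. This is a minimal sketch, assuming the OpenSSH client is installed on the local machine. The function name is hypothetical and the gateway string is a placeholder for the PublicDNS value of your own instance.

```python
import shlex


def tunnel_command(key_file, local_port, remote_port, gateway, remote_host="localhost"):
    """Return the ssh argv for local port forwarding (-L) with no remote shell (-N)."""
    forward = f"{local_port}:{remote_host}:{remote_port}"
    return ["ssh", "-i", key_file, "-N", "-L", forward, gateway]


# Placeholder gateway: substitute the PublicDNS output of the CloudFormation stack.
cmd = tunnel_command("us-east-1-key.pem", 8888, 8888, "ubuntu@<EC2_PUBLIC_DNS>")
print(shlex.join(cmd))
```

The same list can be handed to subprocess.Popen to keep the tunnel open from a script. Changing both ports to 7860 gives the tunnel used for the web UI in Step 3.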

TextGen Web UI Installation

Install TextGen Web UI from a JupyterLab terminal.

# Install Conda
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh -b -u -p miniconda
source /home/ubuntu/miniconda/bin/activate

# Create a new conda environment
conda create -n textgen python=3.10.9 -y
conda activate textgen
pip install --quiet ipykernel
python -m ipykernel install --user --name TextGen --display-name TextGen

# Install PyTorch
pip3 install torch torchvision torchaudio

# Install the web UI
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
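
Before loading a model, it can help to confirm that PyTorch sees the instance's GPU. This is a minimal sketch, assuming it is run with the Python interpreter of the textgen environment; the function name is hypothetical.

```python
def describe_gpu():
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} is installed, but no CUDA device is visible"
    return f"PyTorch {torch.__version__} sees GPU: {torch.cuda.get_device_name(0)}"


print(describe_gpu())
```

If no CUDA device is reported, the likely places to look are the NVIDIA driver on the instance and the PyTorch build that pip selected.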

Step 3 — Start TextGen Web UI

Start TextGen Web UI

# Start TextGen Web UI
source /home/ubuntu/miniconda/bin/activate
conda activate textgen
cd text-generation-webui
python server.py

Example output:

(textgen) ubuntu@ip-172-31-68-63:~/text-generation-webui$ python server.py
2023-08-21 04:53:17 INFO:Loading the extension "gallery"...
Running on local URL: http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

Set up an SSH tunnel to TextGen Web UI using local port forwarding

# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 7860:localhost:7860 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com

Open TextGen Web UI in your local browser

Once the SSH tunnel is up, open http://127.0.0.1:7860/ in your local browser to access the TextGen Web UI.
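If the browser cannot reach the page, a quick first check is whether anything is listening on the forwarded local port. This is a minimal sketch using only the Python standard library, run on the local machine; the function name is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 7860 is the forwarded web UI port; 8888 is the forwarded JupyterLab port.
for port in (7860, 8888):
    print(port, "open" if port_is_open("127.0.0.1", port) else "closed")
```

A closed port usually means the ssh tunnel command is not running. An open port with an unresponsive page points at the server on the instance instead.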

Download Open Source LLM Model

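In the TextGen Web UI, open the Model tab:

  • Model > Download custom model or LoRA: enter the Hugging Face model name (for example TheBloke/Llama-2-7b-Chat-GPTQ, the model credited for the example output below) and start the download.
  • Model > click refresh, then select the downloaded model.
  • Model > Model loader: choose ExLlama and click Load.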

Prompt template: Llama-2-Chat

[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

{prompt} [/INST]
  • Parameters > max_new_tokens: 2048
  • Prompt: ELI5 Generative AI with examples.
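
The template above can be filled programmatically when prompts are sent from a script rather than typed into the UI. This is a minimal sketch for a single-turn request; the function name is hypothetical and the default system prompt is shortened to the first sentence of the one shown above.

```python
# Shortened system prompt; substitute the full text from the template if desired.
DEFAULT_SYSTEM_PROMPT = "You are a helpful, respectful and honest assistant."


def llama2_chat_prompt(prompt, system_prompt=DEFAULT_SYSTEM_PROMPT):
    """Fill the Llama-2-Chat template for one user message."""
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"


print(llama2_chat_prompt("ELI5 Generative AI with examples."))
```

The model's reply is expected to follow the closing [/INST] marker, so the returned string is sent as the complete prompt.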

Example output for the prompt “ELI5 Generative AI with examples.”:

Of course! I’d be happy to help explain generative AI in a safe and respectful manner. 😊

Generative AI refers to a type of artificial intelligence that can create new and original content, such as images, videos, music, text, or even entire conversations. This is different from other types of AI that focus on recognizing patterns or making predictions based on existing data.

Here are some examples of how generative AI works:

1. Image generation: Imagine you have a picture of a cat. Generative AI can take this image and create new, unique pictures of cats that look like they were drawn by a talented artist. These new images could be used in various applications, such as creating realistic-looking avatars for video games or generating artwork for websites.

2. Text generation: Generative AI can write text that sounds like it was written by a human. This could be useful for tasks like writing articles, summarizing long documents, or even composing entire stories. For example, if you gave me a prompt about a fictional world, I could use generative AI to generate an entire story set in that world, complete with characters, plot twists, and descriptions of the setting. 📚

3. Music generation: Imagine you want to create a new song but don’t have any musical skills. Generative AI can help! By analyzing existing songs and understanding the structure of music, generative AI can create entirely new songs that sound like they were composed by a human musician. This could be used for various applications, such as creating background music for videos or generating soundtracks for video games. 🎵

4. Voice generation: Generative AI can also generate realistic-sounding voices, which could be useful in various applications such as voice assistants, language learning tools, or even film and television productions. Imagine being able to create a virtual assistant with a unique personality and tone — it’s possible with generative AI! 💬

These are just a few examples of what generative AI can do. The possibilities are endless, and I hope this explanation has helped you understand the concept better! If you have any questions or need further clarification, feel free to ask. 😊

Source: TheBloke/Llama-2-7b-Chat-GPTQ model, https://huggingface.co/TheBloke/Llama-2-7b-Chat-GPTQ

TextGen API

GitHub: oobabooga/text-generation-webui. A Gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.

Once text-generation-webui is installed and confirmed working via the web interface, please enable the api option either through the web model configuration tab, or by adding the run-time arg --api to your start command.

Source: https://python.langchain.com/docs/integrations/llms/textgen

Start TextGen Web UI API

# Start TextGen Web UI API
source /home/ubuntu/miniconda/bin/activate
conda activate textgen
cd text-generation-webui
python server.py --listen --api

Example output:

(textgen) ubuntu@ip-172-31-68-63:~/text-generation-webui$ python server.py --listen --api
Starting streaming server at ws://0.0.0.0:5005/api/v1/stream
2023-08-21 05:28:25 INFO:Loading the extension "gallery"...
Starting API at http://0.0.0.0:5000/api
Running on local URL: http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
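
With the API running, a script can request a completion over HTTP. This is a minimal sketch using only the Python standard library. It assumes a blocking endpoint at /api/v1/generate under the API address shown in the log above, a JSON body with "prompt" and "max_new_tokens", and a response of the form {"results": [{"text": ...}]}. These details reflect the API at the time of writing and may differ in other versions of text-generation-webui. The function names are hypothetical. Because the security group only opens port 22, port 5000 would need to be forwarded over SSH in the same way as port 7860.

```python
import json
import urllib.request


def build_generate_request(prompt, base_url="http://127.0.0.1:5000", max_new_tokens=250):
    """Build a POST request for the blocking generate endpoint (assumed path)."""
    payload = {"prompt": prompt, "max_new_tokens": max_new_tokens}
    return urllib.request.Request(
        f"{base_url}/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text (assumed response shape)."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["results"][0]["text"]
```

For a Llama-2-Chat model, the prompt passed to generate() would be the filled-in template from the previous section rather than the bare question.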

OpenAI Extension

An OpenedAI API (OpenAI-like): this extension creates an API that works much like OpenAI's (i.e., api.openai.com).

# Install the requirements
pip install flask_cloudflared==0.0.12 sentence-transformers tiktoken

# Enable the openai extension
python server.py --extensions openai

OpenAI-compatible API

# Sample .env file:
OPENAI_API_KEY=sk-111111111111111111111111111111111111111111111111
OPENAI_API_BASE=http://127.0.0.1:5001/v1
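
A client can read the same two variables and talk to the extension as it would to OpenAI. This is a minimal sketch using only the Python standard library. It assumes the extension serves an OpenAI-style /chat/completions route under OPENAI_API_BASE and returns the usual "choices" structure. The function names and the model value are hypothetical placeholders, and how the server treats the model field is not covered here.

```python
import json
import os
import urllib.request


def build_chat_request(user_message, model="placeholder-model"):
    """Build an OpenAI-style chat completion request from the .env settings."""
    api_base = os.environ.get("OPENAI_API_BASE", "http://127.0.0.1:5001/v1")
    api_key = os.environ.get("OPENAI_API_KEY", "sk-dummy")  # dummy key, as in the sample
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{api_base}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def chat(user_message):
    """Send the request and return the assistant's reply (assumed response shape)."""
    with urllib.request.urlopen(build_chat_request(user_message), timeout=120) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]
```

Existing OpenAI client code can usually be pointed at the same base URL instead of hand-building requests. As with the other ports, 5001 has to be reachable from wherever the client runs, for example through another SSH tunnel.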

Models

  • TheBloke/vicuna-13b-v1.3.0-GPTQ
  • TheBloke/stable-vicuna-13B-GPTQ
  • TheBloke/airoboros-13B-gpt4-1.3-GPTQ
