How to run Text Generation Web UI (TGWUI/Oobabooga) on Amazon EC2
Text Generation Web UI (TGWUI/Oobabooga)
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, llama.cpp (ggml), Llama models.
Step 1 — Launch Amazon EC2 Instance
Create an Amazon EC2 instance using AWS CloudFormation with the following settings:
- Region: us-east-1
- AMI: ami-0649417d1ede3c91a (Deep Learning AMI)
- Instance type: g5.2xlarge
- EBS volume: 500 GB
TextGen-WebUI.yaml

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: EC2 Instance
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: AWS::EC2::KeyPair::KeyName
    ConstraintDescription: must be the name of an existing EC2 KeyPair.
Mappings:
  RegionToAmiId:
    us-east-1:
      AMI: ami-0649417d1ede3c91a
Resources:
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: !Sub ${AWS::StackName}-sg
      GroupDescription: Security group for EC2 instance
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: g5.2xlarge
      ImageId: !FindInMap [RegionToAmiId, !Ref AWS::Region, AMI]
      KeyName: !Ref KeyName
      BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
            VolumeSize: 500
            VolumeType: gp2
      Tags:
        - Key: Name
          Value: textgen-webui
      SecurityGroups:
        - !Ref SecurityGroup
Outputs:
  PublicDNS:
    Description: Public DNS name of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicDnsName]
  PublicIP:
    Description: Public IP address of the newly created EC2 instance
    Value: !GetAtt [EC2Instance, PublicIp]
```
AWS CloudFormation > Create stack
AWS CloudFormation — Step 1 Create stack
Upload the TextGen-WebUI.yaml template file and choose Next.
AWS CloudFormation — Step 2 Specify stack details
Specify the Stack name and KeyName, then choose Next.
AWS CloudFormation — Step 3 Configure stack options
Keep the default settings and choose Next.
AWS CloudFormation — Step 4 Review and Submit.
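The same stack can also be created from the command line instead of the console. A sketch using the AWS CLI, assuming credentials are configured for us-east-1, the template file is in the current directory, and your key pair is named us-east-1-key:

```shell
# Create the stack from the template (key pair name is an example; use your own)
aws cloudformation create-stack \
  --stack-name textgen-webui \
  --region us-east-1 \
  --template-body file://TextGen-WebUI.yaml \
  --parameters ParameterKey=KeyName,ParameterValue=us-east-1-key

# Wait for creation to finish, then read the outputs (public DNS and IP)
aws cloudformation wait stack-create-complete \
  --stack-name textgen-webui --region us-east-1
aws cloudformation describe-stacks \
  --stack-name textgen-webui --region us-east-1 \
  --query "Stacks[0].Outputs"
```

The final `describe-stacks` call returns the PublicDNS and PublicIP values defined in the template's Outputs section, which you will need for the SSH steps below.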
Step 2 — Install TextGen Web UI
SSH to the Amazon EC2 instance and start JupyterLab.

```shell
# Start JupyterLab
cd /home/ubuntu
jupyter lab --notebook-dir=/home/ubuntu
```
Set up an SSH tunnel using local port forwarding to JupyterLab.

```shell
# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 8888:localhost:8888 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com
```
Open JupyterLab in your local browser
To access JupyterLab, open http://127.0.0.1:8888/lab or http://localhost:8888/lab (with the token) in your local browser after setting up the SSH tunnel.
TextGen Web UI Installation
Install TextGen Web UI from JupyterLab terminal.
```shell
# Install Conda
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh -b -u -p miniconda
source /home/ubuntu/miniconda/bin/activate

# Create a new conda environment
conda create -n textgen python=3.10.9 -y
conda activate textgen
pip install --quiet ipykernel
python -m ipykernel install --user --name TextGen --display-name TextGen

# Install PyTorch
pip3 install torch torchvision torchaudio

# Install the web UI
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
```
Step 3 — Start TextGen Web UI
Start TextGen Web UI
```shell
# Start TextGen Web UI
source /home/ubuntu/miniconda/bin/activate
conda activate textgen
cd text-generation-webui
python server.py
```

```
(textgen) ubuntu@ip-172-31-68-63:~/text-generation-webui$ python server.py
2023-08-21 04:53:17 INFO:Loading the extension "gallery"...
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
```
Set up an SSH tunnel using local port forwarding to the TextGen Web UI.

```shell
# Syntax: ssh -L <LOCAL_PORT>:<REMOTE_HOST>:<REMOTE_PORT> <GATEWAY>
ssh -i "us-east-1-key.pem" -N -L 7860:localhost:7860 ubuntu@ec2-###-##-##-###.compute-1.amazonaws.com
```
Open TextGen Web UI in your local browser
To access the TextGen Web UI, open http://127.0.0.1:7860/ in your local browser after setting up the SSH tunnel.
Download Open Source LLM Model
Model > Download custom model or LoRA: enter TheBloke/Llama-2-7b-Chat-GPTQ and click Download.
Model > click the refresh button and select the downloaded model.
Model > Model loader: select ExLlama, then click Load.
Prompt template: Llama-2-Chat

```
[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

{prompt} [/INST]
```
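When calling the model outside the UI, the same template can be applied programmatically. A minimal sketch (the function name is ours, not part of text-generation-webui):

```python
def format_llama2_chat(user_prompt: str, system_prompt: str) -> str:
    """Wrap a user prompt in the Llama-2-Chat template shown above."""
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_prompt} [/INST]"

# Build the prompt used in the example below
prompt = format_llama2_chat(
    "ELI5 Generative AI with examples.",
    "You are a helpful, respectful and honest assistant.",
)
print(prompt)
```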
- Parameters > max_new_tokens: 2048
- Prompt: ELI5 Generative AI with examples.
Example output of “ELI5 Generative AI with examples.” prompt.
Of course! I’d be happy to help explain generative AI in a safe and respectful manner. 😊
Generative AI refers to a type of artificial intelligence that can create new and original content, such as images, videos, music, text, or even entire conversations. This is different from other types of AI that focus on recognizing patterns or making predictions based on existing data.
Here are some examples of how generative AI works:
1. Image generation: Imagine you have a picture of a cat. Generative AI can take this image and create new, unique pictures of cats that look like they were drawn by a talented artist. These new images could be used in various applications, such as creating realistic-looking avatars for video games or generating artwork for websites.
2. Text generation: Generative AI can write text that sounds like it was written by a human. This could be useful for tasks like writing articles, summarizing long documents, or even composing entire stories. For example, if you gave me a prompt about a fictional world, I could use generative AI to generate an entire story set in that world, complete with characters, plot twists, and descriptions of the setting. 📚
3. Music generation: Imagine you want to create a new song but don’t have any musical skills. Generative AI can help! By analyzing existing songs and understanding the structure of music, generative AI can create entirely new songs that sound like they were composed by a human musician. This could be used for various applications, such as creating background music for videos or generating soundtracks for video games. 🎵
4. Voice generation: Generative AI can also generate realistic-sounding voices, which could be useful in various applications such as voice assistants, language learning tools, or even film and television productions. Imagine being able to create a virtual assistant with a unique personality and tone — it’s possible with generative AI! 💬
These are just a few examples of what generative AI can do. The possibilities are endless, and I hope this explanation has helped you understand the concept better! If you have any questions or need further clarification, feel free to ask. 😊
Model source: https://huggingface.co/TheBloke/Llama-2-7b-Chat-GPTQ
TextGen API
GitHub: oobabooga/text-generation-webui, a Gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
Once text-generation-webui is installed and confirmed working via the web interface, please enable the api option either through the web model configuration tab, or by adding the run-time arg --api to your start command.
Source: https://python.langchain.com/docs/integrations/llms/textgen
Start TextGen Web UI API
```shell
# Start TextGen Web UI API
source /home/ubuntu/miniconda/bin/activate
conda activate textgen
cd text-generation-webui
python server.py --listen --api
```

```
(textgen) ubuntu@ip-172-31-68-63:~/text-generation-webui$ python server.py --listen --api
Starting streaming server at ws://0.0.0.0:5005/api/v1/stream
2023-08-21 05:28:25 INFO:Loading the extension "gallery"...
Starting API at http://0.0.0.0:5000/api
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
```
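With --api enabled, the blocking endpoint on port 5000 accepts JSON POST requests. A client sketch for this version of the API, using only the standard library (the helper names are ours; adjust the host to your SSH tunnel, and tunnel port 5000 the same way as 7860 above):

```python
import json
import urllib.request

def build_generate_request(prompt: str, max_new_tokens: int = 200) -> dict:
    """Build a request body for TextGen's /api/v1/generate endpoint."""
    return {"prompt": prompt, "max_new_tokens": max_new_tokens}

def generate(prompt: str, host: str = "http://localhost:5000") -> str:
    """POST the prompt to the blocking API and return the generated text."""
    body = json.dumps(build_generate_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/v1/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Response shape assumed: {"results": [{"text": "..."}]}
        return json.loads(resp.read())["results"][0]["text"]
```

Calling `generate("ELI5 Generative AI")` requires the server (or an SSH tunnel to port 5000) to be running.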
OpenAI Extension
An OpenedAI API (OpenAI-like): this extension creates an API that works much like OpenAI's (i.e. api.openai.com).
```shell
# Install the requirements
pip install flask_cloudflared==0.0.12 sentence-transformers tiktoken

# Enable the openai extension
python server.py --extensions openai
```

Sample .env file:

```
OPENAI_API_KEY=sk-111111111111111111111111111111111111111111111111
OPENAI_API_BASE=http://127.0.0.1:5001/v1
```
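Because the extension exposes an OpenAI-style endpoint (port 5001 by default), any OpenAI-compatible client can point at it. A standard-library sketch that builds a chat completions request from the sample .env values above (the helper name and the "local" model placeholder are ours):

```python
import json
import os
import urllib.request

def build_chat_request(base: str, key: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-style chat completions request for the local endpoint."""
    body = json.dumps({"model": "local", "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
    )

# Fall back to the sample .env values when the variables are not set
base = os.environ.get("OPENAI_API_BASE", "http://127.0.0.1:5001/v1")
key = os.environ.get("OPENAI_API_KEY", "sk-111111111111111111111111111111111111111111111111")
req = build_chat_request(base, key, [{"role": "user", "content": "Hello"}])
```

Sending the request with `urllib.request.urlopen(req)` requires the server to be running with the openai extension enabled.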
Models
- TheBloke/vicuna-13b-v1.3.0-GPTQ
- TheBloke/stable-vicuna-13B-GPTQ
- TheBloke/airoboros-13B-gpt4-1.3-GPTQ
Useful Links:
- https://github.com/oobabooga/text-generation-webui
- https://python.langchain.com/docs/integrations/llms/textgen
- https://github.com/flurb18/AgentOoba
- Extensions
- An OpenedAI API Extension (OpenAI-like): creates an API that works much like OpenAI's (i.e. api.openai.com).