How to Run Stable Diffusion on EC2

Use Meadowrun to run the latest text-to-image model on AWS

Meadowrun
6 min read · Sep 2, 2022

Stable Diffusion is a new, open text-to-image model from Stability AI that has blown people away.

The publicly available official tool gives you 200 free image generations, then charges about 1¢ per image after that. But because the model is open, you can download the code and the weights and run your own copy. The r/StableDiffusion subreddit has a good guide to doing this, and the options boil down to either Google Colab, which requires a Colab Pro subscription ($9.99/month) to reliably get a powerful enough GPU, or running locally on your own machine, which requires a GPU with at least 10 GB of VRAM.

We’ll present a different option here, which is to use Meadowrun to rent a GPU machine from AWS EC2 for just a few minutes at a time. Meadowrun is an open source library that makes it easy to run your Python code in the cloud. It will take care of launching an EC2 instance, getting our code and libraries onto it, and turning it off when we’re done.

Images generated by Stable Diffusion from “a digital illustration of a steampunk computer floating among clouds, detailed”

AWS and Meadowrun Prerequisites

To get started, you’ll need an AWS account, a local Python environment with Meadowrun installed, and Meadowrun’s resources set up in your AWS account. Here’s an example using pip on Linux, assuming your AWS account is already configured:

$ python3 -m venv meadowrun-venv
$ source meadowrun-venv/bin/activate
$ pip install meadowrun
$ meadowrun-manage-ec2 install --allow-authorize-ip

There’s a detailed guide in Meadowrun’s documentation.

If you’ve never used GPU instances in AWS before, you’ll probably need to increase your quotas. AWS accounts have quotas in each region that limit how many CPUs of a particular instance type you can run at once. There are four relevant quotas for GPU instances: spot and on-demand quotas for the G and VT instance families, and spot and on-demand quotas for the P instance family.

These are all set to 0 for a new AWS account, so if you try to run the code below, you’ll get a message from Meadowrun like this:

Unable to launch new g4dn.xlarge spot instances due to the L-3819A6DF quota which is set to 0. This means you cannot have more than 0 CPUs across all of your spot instances from the g, vt instance families. This quota is currently met. Run `aws service-quotas request-service-quota-increase --service-code ec2 --quota-code L-3819A6DF --desired-value X` to set the quota to X, where X is larger than the current quota. (Note that terminated instances sometimes count against this limit: https://stackoverflow.com/a/54538652/908704 Also, quota increases are not granted immediately.)

We recommend running the command in that message, or requesting an increase for the relevant quota in the AWS Service Quotas console (if you use the console, make sure you’re in the same region as your AWS CLI, as given by aws configure get region). AWS appears to have a human in the loop for granting quota increases, and in our experience it can take a day or two for an increase to be granted.

Stable Diffusion Prerequisites

Next, we’ll need to go to the Stable Diffusion page on Hugging Face, accept the terms, and download the checkpoint file containing the model weights to our local machine.

Then, we’ll create an S3 bucket and upload this file to our new bucket so that our EC2 instance can access this file. From the directory where the checkpoint file was downloaded, we’ll run:

aws s3 mb s3://meadowrun-sd
aws s3 cp sd-v1-4.ckpt s3://meadowrun-sd

Remember that S3 bucket names are globally unique, so you’ll need to use a unique bucket name that’s different from what we’re using here (meadowrun-sd).
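One simple way to get a name that’s likely unique (a hypothetical helper, not part of the article’s setup) is to append a short random suffix:

```python
import uuid

# Append a short random suffix to make the bucket name likely unique.
# (Your AWS account ID is another common choice of suffix.)
bucket_name = f"meadowrun-sd-{uuid.uuid4().hex[:8]}"

# S3 bucket names must be 3-63 characters long
assert 3 <= len(bucket_name) <= 63
print(bucket_name)
```

Whatever name you pick, use it consistently in the commands that follow.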

Finally, we’ll need to grant access to this bucket for the Meadowrun-launched EC2 instances:

meadowrun-manage-ec2 grant-permission-to-s3-bucket meadowrun-sd

Running Stable Diffusion

Now we’re ready to run Stable Diffusion!

Gentle reminder: if you’re following along, replace the S3 bucket name meadowrun-sd in the snippet with the name you chose earlier.
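The snippet discussed below can be sketched roughly like this. Treat it as an illustration rather than the exact original: the fork URL is a placeholder, and the Meadowrun parameter names (logical_cpu, gpu_memory, flags, and so on) are approximations of Meadowrun’s API at the time, so check the current documentation before running it.

```python
# Illustrative reconstruction of the job submission; the fork URL and
# some Meadowrun parameter names are assumptions, not the exact original.
CACHE = "/var/meadowrun/machine_cache"
BUCKET = "meadowrun-sd"  # replace with your own bucket name
PROMPT = "a digital illustration of a steampunk computer floating among clouds, detailed"

# Three commands chained with &&: fetch the cached weights, generate
# images, then upload the results back to S3.
STEPS = " && ".join([
    # download the checkpoint only if it isn't already in the cache folder
    f"aws s3 sync s3://{BUCKET} {CACHE} --exclude '*' --include sd-v1-4.ckpt",
    # generate images from the prompt
    f"python scripts/txt2img.py --prompt '{PROMPT}' "
    f"--ckpt {CACHE}/sd-v1-4.ckpt --outdir outputs",
    # upload the generated images to our bucket
    f"aws s3 cp --recursive outputs s3://{BUCKET}/steampunk_computer",
])
COMMAND = f'bash -c "{STEPS}"'


def generate_images():
    # Requires `pip install meadowrun` and the AWS install step from earlier.
    import asyncio

    import meadowrun

    asyncio.run(
        meadowrun.run_command(
            COMMAND,
            meadowrun.AllocCloudInstance("EC2"),
            meadowrun.Resources(
                logical_cpu=1,
                memory_gb=8,
                max_eviction_rate=80,  # accept spot instances with up to 80% eviction rate
                gpu_memory=10,         # at least 10 GB of GPU memory...
                flags="nvidia",        # ...on an Nvidia GPU (parameter name assumed)
            ),
            meadowrun.Deployment.git_repo(
                "https://github.com/example/stable-diffusion",  # placeholder for the fork described below
                branch="meadowrun-compatibility",
                interpreter=meadowrun.CondaEnvironmentYmlFile("environment.yaml"),
                additional_software="awscli",  # non-conda dependency, installed via apt
                environment_variables={"TRANSFORMERS_CACHE": CACHE},
            ),
        )
    )
```

Calling generate_images() submits the job; Meadowrun then prints the instance details and the script’s output.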

Let’s walk through this snippet. The first parameter to run_command tells Meadowrun what we want to run on the remote machine. In this case we’re using bash to chain three commands together:

  • First, we’ll use aws s3 sync to download the weights from S3. Our command will run in a container, but the /var/meadowrun/machine_cache folder that we download into can be used to cache data for multiple jobs that run on the same instance. aws s3 cp doesn’t have a --no-overwrite option, so we use aws s3 sync to only download the file if we don’t already have it. This isn’t robust to multiple processes running concurrently on the same machine, but in this case we’re only running one command at a time.
  • Second, we’ll run the txt2img.py script which will generate images from the prompt we specify.
  • The last part of our command will then upload the outputs of the txt2img.py script into our same S3 bucket.

The next two parameters tell Meadowrun what kind of instance we need to run our code:

  • AllocCloudInstance("EC2") tells Meadowrun to provision an EC2 instance.
  • Resources tells Meadowrun the requirements for the EC2 instance. In this case we’re requiring at least 1 CPU, 8 GB of main memory, and 10 GB of GPU memory on an Nvidia GPU. We also set max_eviction_rate to 80, which means we’re okay with spot instances that have up to an 80% chance of interruption. The GPU instances we’re using are fairly popular, so if our instance gets interrupted frequently, we might need to switch to an on-demand instance by setting this parameter to 0.

Finally, Deployment.git_repo specifies our python dependencies:

  • The first two parameters tell Meadowrun to get the code from the meadowrun-compatibility branch of this fork of the official repo. We were almost able to use the original repo as-is, but we had to make a small tweak to the environment.yaml file — Meadowrun doesn’t yet support installing the current code as an editable pip package.
  • The third parameter tells Meadowrun to create a conda environment based on the packages specified in the environment.yaml file in the repo.
  • We also need to tell Meadowrun to install awscli, which is a non-conda dependency installed via apt. We’re using the AWS CLI to download and upload files to/from S3.
  • The last parameter sets the TRANSFORMERS_CACHE environment variable. Stable Diffusion uses Hugging Face’s transformers library which downloads model weights. This environment variable points transformers to the /var/meadowrun/machine_cache folder so that we can reuse this cache across runs.
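As a small local illustration of that last point (nothing Meadowrun-specific here), transformers consults the TRANSFORMERS_CACHE environment variable when deciding where to store downloaded weights:

```python
import os

# Point Hugging Face transformers at the machine-wide cache folder so
# that jobs running on the same instance reuse downloaded weights.
os.environ["TRANSFORMERS_CACHE"] = "/var/meadowrun/machine_cache"

print(os.environ["TRANSFORMERS_CACHE"])
```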

To walk through selected parts of the output: first, Meadowrun tells us everything we need to know about the instance it started for this job and how much it will cost us (only 16¢ per hour for the spot instance; if we need the on-demand instance, it will cost 53¢ per hour).

Launched a new instance for the job: ec2-3-15-146-110.us-east-2.compute.amazonaws.com: g4dn.xlarge (4.0 CPU, 16.0 GB, 1.0 GPU), spot ($0.1578/hr, 61.0% eviction rate), will run 1 workers

Next, Meadowrun builds a container based on the contents of the environment.yaml file we specified. This takes a while, but Meadowrun caches the image in ECR for us so this only needs to happen once. Meadowrun also cleans up the image if we don’t use it for a while.

Building python environment in container a07bf5...

After that, we’ll see the output from the txt2img.py script:

Global seed set to 42
Loading model from /var/meadowrun/machine_cache/sd-v1-4.ckpt
...

With the default settings, this script usually takes around 3 minutes and generates 6 images. That works out to about 6 images per 1¢ on a spot instance, or about 2 images per 1¢ on an on-demand instance, although we do have to pay for some overhead for creating the environment.
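That per-image cost is simple arithmetic on the prices Meadowrun reports (the 3-minute run time is approximate, and environment-setup overhead is excluded):

```python
spot_price = 0.1578      # $/hour for the g4dn.xlarge spot instance, from Meadowrun's output
on_demand_price = 0.526  # approximate $/hour on-demand (the 53 cents mentioned earlier)
run_minutes = 3          # one txt2img.py run with the default settings
images_per_run = 6

for label, price in [("spot", spot_price), ("on-demand", on_demand_price)]:
    cents_per_run = price * run_minutes / 60 * 100
    print(f"{label}: {cents_per_run:.2f} cents per run, "
          f"{images_per_run / cents_per_run:.1f} images per cent")
```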

UPDATE (15 Sep 22): Even on a warm machine (i.e. with the container image already loaded), it takes some time for the model to load and start, and this happens on every invocation. It should be possible to run Stable Diffusion as a service via e.g. https://github.com/sd-webui/stable-diffusion-webui so that you pay this startup delay only once. As inspiration, our other article on running DALL-E Mini on EC2 does something similar.

Once the last command completes, our images will be available in our S3 bucket! We can view them using an S3 UI like Cyberduck, or just sync the bucket to our local machine using the command line:

aws s3 sync s3://meadowrun-sd/steampunk_computer steampunk_computer

Meadowrun will automatically turn off the machine if we don’t use it for 5 minutes, but if we know we’re done generating images, we can turn it off manually from the command line:

meadowrun-manage-ec2 clean

Closing remarks

Stable Diffusion is remarkable for how good it is, how open it is, and how cheap and easy it is to use. And Meadowrun makes it even easier!

To stay updated on Meadowrun, star us on Github or follow us on Twitter!
