Training Stable Diffusion in the cloud using RunPod and Kohya SS.

Guillaume Bieler
4 min read · Mar 1, 2024


One of the main challenges when training Stable Diffusion models and making LoRAs is accessing the right hardware. Most of us don’t have the kind of GPU needed for a task as computationally demanding as model training, and even for those who do, setting up the training script can be complicated. A popular way to bypass this problem is to use a cloud provider like RunPod.

Training with Kohya SS on RunPod

The quickest way to start training on RunPod is to use their Kohya SS template, which can be found here.

From the template page, you can follow these steps:

  • Click on “Deploy to GPU Cloud” (top right)
  • Deploy an RTX 4090 with all the default settings
  • Once the pod is deployed, wait for the connect button to turn purple while the server boots up (it can take a few minutes).
  • After the server is done booting up, you can connect to Jupyter Lab with Port 8888.
  • Create a password using this token to log in: Jup1t3R!
  • In Jupyter Lab, we recommend creating a folder called “images” in the “kohya_ss/dataset” folder to save your training data. The final path to your image folder should be “kohya_ss/dataset/images/YourImageFolder”.
  • If you want to train on a custom model, rather than vanilla SDXL, you can save it directly as a safetensors file in the “kohya_ss” folder. (Its final path should be “kohya_ss/YourModelName.safetensors”)
  • We also recommend creating a folder called “outputs” in “kohya_ss” to save your results. (Its final path should be “kohya_ss/outputs”)
  • To start the Kohya interface, navigate to the “kohya_ss” folder, and launch a new terminal.
  • In the terminal type:
 bash gui.sh --share
  • You should be able to access the interface through the public URL printed in the terminal.
  • If you have a custom model saved in the “kohya_ss” folder, you just have to put its name in the pre-trained model input to use it as a base model for training.
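The folder layout recommended in the steps above can also be created from a terminal instead of through the Jupyter Lab file browser. The following is a minimal sketch, assuming the terminal is opened in the folder that contains “kohya_ss”; the function name setup_kohya_dirs and the KOHYA_DIR variable are hypothetical helpers, not part of Kohya SS or the RunPod template.

```shell
# Hypothetical helper: create the dataset and output folders used in this guide.
# $1 is the path to the kohya_ss folder (defaults to "kohya_ss").
setup_kohya_dirs() {
  root="${1:-kohya_ss}"
  # -p creates missing parent folders and does not fail if they already exist
  mkdir -p "$root/dataset/images" "$root/outputs"
}

# Assumption: run from the folder that contains kohya_ss
setup_kohya_dirs "${KOHYA_DIR:-kohya_ss}"
```

Your image folder then goes inside “kohya_ss/dataset/images”, and “kohya_ss/outputs” can be selected as the output folder in the Kohya interface.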

And that is it — you are ready to start training!

If you need more guidance on how to prepare a dataset and what training parameters to use, you should take a look at our comprehensive training guide.

Once the training is complete, your checkpoints will appear in the “kohya_ss/outputs” folder. You can download them from there.
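If the training run saved several checkpoints, bundling them into one archive may make the download easier. This is a sketch only, assuming a terminal opened in the “kohya_ss” folder and that tar is available on the pod; pack_outputs is a hypothetical helper name.

```shell
# Hypothetical helper: pack a folder into a single .tar.gz archive.
# $1 is the folder to pack, $2 is the archive to create.
pack_outputs() {
  dir="${1:-outputs}"
  archive="${2:-outputs.tar.gz}"
  # -C switches to the parent folder so the archive holds "outputs/..." paths
  tar -czf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")"
}

# Only pack if the outputs folder exists in the current directory
if [ -d outputs ]; then
  pack_outputs outputs outputs.tar.gz
  tar -tzf outputs.tar.gz   # list the archived files as a quick check
fi
```

The resulting outputs.tar.gz can then be downloaded from the Jupyter Lab file browser like any other file.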

Don’t forget to stop your pod when you are done.

If you find the idea of using Jupyter Lab and working directly in the terminal scary, another option to train Stable Diffusion is to use our training web app. Our web app gives you access to Kohya directly, without having to worry about setting up servers, working in the terminal, or preparing your folders in Jupyter Lab.

Pro Tips

To make this process faster, you can move files directly from Google Drive or Civit AI to Jupyter Lab using gdown and wget.

To move a dataset from Google Drive, this is what you have to do:

  • zip your dataset (in this example we will call this file YourDataset.zip, but it can have any name)
  • Get the Google Drive sharing link the way you would normally do it (make sure the link is set up so that the recipient can edit). It should look something like this: https://drive.google.com/file/d/5HMXr0KDpKJSDFu272SqavUllEss1tLxf/view?usp=sharing
  • The file ID is the long alphanumeric code in the URL. In this example, it is: 5HMXr0KDpKJSDFu272SqavUllEss1tLxf
  • After navigating to the “kohya_ss/dataset” folder in Jupyter Lab, launch a new terminal window and run the following code (make sure to replace the file ID and the dataset name with your own):
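pip install gdown
# Download the zip by its Google Drive ID
gdown 5HMXr0KDpKJSDFu272SqavUllEss1tLxf
unzip YourDataset.zip
# -p avoids an error if the images folder already exists
mkdir -p images
mv YourDataset images/3_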
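If you would rather not copy the ID out of the sharing link by hand, it can be extracted in the terminal. The following is a minimal sketch, assuming the link has the “/file/d/<ID>/” shape shown in the example above; extract_drive_id is a hypothetical helper name, not part of gdown.

```shell
# Hypothetical helper: print the ID found between "/file/d/" and the next "/".
extract_drive_id() {
  printf '%s\n' "$1" | sed -n 's#.*/file/d/\([^/]*\).*#\1#p'
}

# Example with the sharing link used in this guide
extract_drive_id "https://drive.google.com/file/d/5HMXr0KDpKJSDFu272SqavUllEss1tLxf/view?usp=sharing"
# prints 5HMXr0KDpKJSDFu272SqavUllEss1tLxf
```

The printed value is what you pass to gdown. Links in a different format print nothing, in which case you can still copy the ID manually.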

To move a model to Jupyter Lab from Civit AI, you can follow these steps:

  • Navigate to the model you want to use as a base on Civit AI, right-click on the download button, and select “copy link address”.
  • Launch a new terminal window from the “kohya_ss” folder in Jupyter Lab.
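  • You can then download the file directly using wget by running the following code (replace YourLinkAddress with the link you copied from Civit, and keep the quotes, since the link may contain characters such as & that the shell would otherwise interpret):
wget "YourLinkAddress" -O YourModelName.safetensors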
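A download can occasionally save an error or login page under the model's file name instead of the model itself. As a rough sanity check, a safetensors file starts with an 8-byte header length followed by a JSON header, so its ninth byte should be “{”. The sketch below relies on that fact; looks_like_safetensors is a hypothetical helper name, and passing the check does not guarantee the file is complete or valid.

```shell
# Hypothetical helper: succeed if byte 9 of the file is "{",
# i.e. the first character of the JSON header in a safetensors file.
looks_like_safetensors() {
  [ "$(head -c 9 "$1" | tail -c 1)" = "{" ]
}

# Only run the check if the downloaded file is present
if [ -f YourModelName.safetensors ]; then
  if looks_like_safetensors YourModelName.safetensors; then
    echo "Header looks like safetensors"
  else
    echo "This does not look like a safetensors file; check the download link"
  fi
fi
```

If the check fails, opening the file in a text editor usually shows whether it is an HTML page rather than a model.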

For any other questions about training and deploying AI solutions, get in touch with the team at www.lightsketch.ai.
