Absolute beginner’s guide for Stable Diffusion AI image

Andrew
5 min read · Nov 12, 2022


About this Beginner’s guide

This beginner’s guide is for absolute newcomers who have zero experience with Stable Diffusion or other AI image generators.

I will cover:

  • What Stable Diffusion can do for you.
  • The quickest way to start generating images.
  • Basics of text prompts and parameters.
  • How to get better at engineering prompts.
  • Common ways to fix defects.

What is Stable Diffusion?

Stable Diffusion is an AI model that generates images from text input. Say you want to generate images of a gingerbread house; you can put in the prompt:

gingerbread house, diorama, in focus, white background, toast, crunch cereal

The AI model then generates images that match the prompt:

The advantages of Stable Diffusion are:

  • It is open-source: Many enthusiasts have created free and powerful tools.
  • It is designed for low computational resources: It’s free or cheap to run.

How to start generating images?

Online generator

For absolute beginners, I recommend using a free online generator. Go to one of the sites in the list, put in the example prompt above, and you are in business!

The fastest way to generate high quality images is to reuse existing prompts. Head to the prompt collection section, pick an image you like and steal the prompt!

Alternatively, use image collection sites like Lexica. Pick an image you like and use the prompt.

Treat the prompt as a starting point. Modify to suit your needs.

Advanced GUI

The downside of online generators is that the functionality is pretty limited.

If you outgrow them, you should move on to a more advanced GUI (Graphical User Interface). I use AUTOMATIC1111, which is a powerful and popular choice. See my guide for setting it up on a Google Colab cloud server.

With an advanced GUI, you can explore more advanced techniques such as keyword blending and inpainting, and even fine-tune the model with your own images.

How to build good prompts?

Two pieces of advice: (1) be as detailed and specific as possible, and (2) use powerful keywords.

Be detailed and specific in your prompt

The first rule is to be as specific as possible. Stable Diffusion cannot read your mind. You need to describe your image in as much detail as possible.

For example, if you want to generate a picture of a woman sitting on a chair, instead of using the prompt

A woman sitting on a chair

(Prompt #1)

You should use something like:

a young gorgeous woman with long blonde hair sitting on a 19th-century antique chair, beautiful eyes, symmetric face, dedicated modern clothing, photo

(Prompt #2)

See the drastic difference in image quality between the two prompts:

Example images from simplistic prompt #1 (left) and detailed prompt #2 (right).

So work on your prompts before writing off Stable Diffusion!

Use powerful keywords in your prompt

Some keywords are just more powerful than others. Examples are

  • Celebrity name (e.g. Emma Watson)
  • Artist name (e.g. van Gogh)
  • Genre (e.g. illustration, painting, photograph)

Using them carefully can steer the image in the direction you want.

You can learn more about prompt building and example keywords in the basics of building prompts.

What do the parameters mean and should I change them?

Most online generators allow you to change a limited set of parameters. Below are some important ones:

  • Image size: The size of the output image. The standard size is 512x512 pixels. Changing it to a portrait or landscape size can have a big impact on the image. For example, use a portrait size if you want to generate a full-body image.
  • Sampling steps: Use at least 20 steps. Increase it if the image looks blurry.
  • CFG scale: The typical value is 7. Increase it if you want the image to follow the prompt more closely.
  • Seed value: Set it to -1 for a random image. Fix it to a specific value to reproduce the same image.
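To see why a fixed seed reproduces the same image, note that generation starts from random latent noise, and seeding the random generator makes that starting noise identical on every run. A small sketch with PyTorch alone (no model download; the latent shape shown matches Stable Diffusion v1, which works on a latent 1/8 the image size):

```python
# Why a fixed seed reproduces an image: Stable Diffusion starts from random
# latent noise, and the seed fully determines that noise.
import torch

def initial_latents(seed: int, height: int = 512, width: int = 512) -> torch.Tensor:
    g = torch.Generator().manual_seed(seed)
    # SD v1 latents: 4 channels, spatial size = image size / 8 (an assumption
    # about the v1 architecture, shown here only for illustration).
    return torch.randn(1, 4, height // 8, width // 8, generator=g)

a = initial_latents(42)
b = initial_latents(42)
c = initial_latents(7)
print(torch.equal(a, b))  # True: same seed, same starting noise
print(torch.equal(a, c))  # False: different seed, different image
```

The same principle is why online generators that report the seed let you regenerate an image exactly, as long as every other parameter stays the same.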

See my recommendations for other settings.

How many images should I generate?

You should always generate multiple images when testing a prompt.

I generate 4 images at a time when making big changes to the prompt, so I can search faster. When making small changes, I generate 8 at a time to increase the chance of seeing something usable.

Some prompts only work half of the time or less. So don’t write off a prompt based on one image.
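This batch strategy is just probability at work: if a prompt produces a usable image with probability p, generating n images gives a 1 − (1 − p)^n chance of at least one success. A quick sketch, using the "half of the time" figure above:

```python
# Chance of at least one usable image when a prompt "works" with
# probability p per image and you generate n images in a batch.
def chance_of_success(p: float, n: int) -> float:
    return 1 - (1 - p) ** n

for n in (1, 4, 8):
    print(n, round(chance_of_success(0.5, n), 3))
# 1 0.5
# 4 0.938
# 8 0.996
```

Even a prompt that fails half the time almost always produces something usable in a batch of 8.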

Common ways to fix defects in images

When you see stunning images shared on social media, there's a good chance they have undergone a series of post-processing steps.

Face restoration

Left: Original images. Right: After face restoration.

It’s well known in the AI artist community that Stable Diffusion is not good at generating faces. Very often the faces generated have artifacts.

We often use AI models trained specifically for restoring faces, for example CodeFormer, for which the AUTOMATIC1111 GUI has built-in support. See how to turn it on.

Fixing small artifacts with inpainting

It is not easy to get exactly the image you want in one shot. A better approach is to aim for an image with good composition and then repair the defects.

Below is an example of an image before and after inpainting. Reusing the original prompt for inpainting works about 90% of the time. Make sure to turn on the high-resolution fix.

Left: Original image with defects. Right: The face and arm are fixed by inpainting.

Read more about ways to fix common issues.

That’s it for now! I hope you enjoyed this article.

This article originally appeared on stable-diffusion-art.com.


Andrew

I write about AI and internet business. Check out my new stable diffusion site: https://stable-diffusion-art.com/