Training DreamBooth on custom models

Daria Wind · Published in PHYGITAL · 6 min read · Jun 14, 2023

Artists, marketing specialists and designers have been actively using AI, but they often face one problem: how do you preserve the unique style of a project? How do you achieve consistency in the results? Today we are going to talk about how to create AI models of a specific person, and teach you how to make stunning avatars using DreamBooth.

In this article we will describe our updated pipeline for AI training, give recommendations on using custom checkpoints for training and their settings, and share our pipeline for improving the results.

Introduction

Using DreamBooth you can train the Stable Diffusion neural network to remember a specific person, object or style, and then generate images with them. If you’re interested in any of these particular cases, we highly recommend watching our tutorial.

The training process on custom models is fairly simple and consists of two steps: preparing the dataset (the images the AI will be trained on) and setting the recommended parameters.

First and foremost, for training on people you need at least 5–7 photos of good quality (15–20 is optimal), in which the face is clearly visible and there is only one person in the frame. For more tips on how to prepare the dataset, refer to this page.
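
If you prepare the photos yourself, a small script can handle the cropping. Below is a minimal sketch using Pillow; the folder names are placeholders, and center-cropping assumes the face sits roughly in the middle of each photo:

```python
from pathlib import Path
from PIL import Image

SRC = Path("raw_photos")   # hypothetical folder with your source photos
DST = Path("dataset")      # folder for the prepared training images
SIZE = 512                 # SD 1.5 checkpoints are trained at 512x512

DST.mkdir(exist_ok=True)
for i, path in enumerate(sorted(SRC.glob("*.jpg"))):
    img = Image.open(path).convert("RGB")
    # Center-crop to a square first, so the face is not distorted by resizing
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img.resize((SIZE, SIZE), Image.LANCZOS).save(DST / f"{i:02d}.png")
```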

We have selected these images as an example for training:

Training process

For training the AI we will use Phygital+: the node-based web tool exposes only the necessary settings, which lets both experienced and beginner creators try out training. The default parameters are set for training on people, giving you a quick start into the training process.

Start by adding two nodes, DreamBooth and Import files, then connect them by dragging a link from the elongated orange circle.

Then we need to select the style we will train on. This is a relatively new setting in DreamBooth: you select one model from a dropdown of custom Stable Diffusion checkpoints (each pretrained on a specific style).

In order to choose the style, you need to ask yourself two questions:

  • which style you want to replicate,
  • which scenarios you are going to generate your person in.

If you’re aiming for high-quality photorealism, we recommend models such as Edge of Realism, Realistic Vision, Deliberate, Lyriel or Epic Realism. These models also perform well in other scenarios, such as creating landscapes and abstractions. Their advantage over standard Stable Diffusion is higher-quality photorealism and fewer artifacts.
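
If you want to preview a candidate base model before spending training time, many of these community checkpoints are also published on the Hugging Face Hub and can be test-driven with the diffusers library. A minimal sketch, assuming the Realistic Vision repository id below (check the Hub for the exact name of your chosen checkpoint):

```python
import torch
from diffusers import StableDiffusionPipeline

# Assumption: repository id of the Realistic Vision checkpoint on the Hub;
# any of the models above with a diffusers layout would work the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V2.0", torch_dtype=torch.float16
).to("cuda")

image = pipe("a photo of a woman, ultra detailed").images[0]
image.save("base_model_preview.png")
```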

If you want to generate in the style of a particular artist, movie or TV show, you can try styles like Samdoesarts, Modern Disney, InkPunk or SynthwavePunk. They work best with people; in other scenarios (creating objects, for example) you might face some difficulties.

The list of custom models is regularly updated, and we add new styles every week. Very soon we will also release the Train AI Panel, which will let people add their own models. Currently we have more than 40 models for training and generation.

Depending on which style you have chosen, it’s time to set the right parameters, starting with filling in Class images.

For a photorealistic style, you need to type ‘a photo of a woman, ultra detailed’ or ‘a photo of a man, ultra detailed’. (Pro tip: you can use any prompt that generates good portrait images in SD 1.5, for example ‘woman photo, ultra detailed, 8k photo, HD’.)

If you’re training on a specific artistic style (like Samdoesarts), then after choosing the model you will see a short prompt in Class images containing _____. Replace the underscores with ‘a portrait of a woman’ or ‘a portrait of a man’. It’s important to leave the automatically filled word untouched, otherwise you won’t get a stylized result.

The only thing left is to fill in the Subject field. It should contain two parts: a unique name and a gender. It’s crucial to choose a name that is unknown to Stable Diffusion (pro tip: use your name + surname/nickname). In our case we typed DariaWind, a woman.
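
For readers curious what these two fields do under the hood: they correspond to the class prompt and instance prompt of the standard DreamBooth recipe (as implemented, for example, in the open-source diffusers training script). A conceptual sketch, not Phygital+’s actual internals:

```python
# "Class images" field -> DreamBooth's class prompt: a generic prompt used to
# generate regularization images, so the model keeps its general notion of
# "a woman" while learning the new person (prior preservation).
class_prompt = "a photo of a woman, ultra detailed"

# "Subject" field -> DreamBooth's instance prompt: a rare token plus the class,
# so the new identity binds to a name Stable Diffusion doesn't already know.
instance_prompt = "DariaWind, a woman"
```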

Now everything is ready for training, and you can press the Start button. Within about 30 minutes you will get your trained model and see a banner in the DreamBooth node.

Moving on to the most interesting part — generating images with your trained model.

Image generation

As an example, we trained a neural network based on the Dreamlikeart model, which aims at creating digital artistic images. To start generating with the trained model, add a Stable Diffusion 1.5 node, type your Subject name in the Text prompt and add a keyword (usually needed for models with a specific artistic style; if you’re unsure whether you need a keyword, refer to the styles page). Press the Start button and wait around 30–40 seconds for your result.
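
Outside of Phygital+, generating with a trained DreamBooth model looks roughly like this in diffusers. The local path is a placeholder, and ‘dreamlikeart’ is the trigger keyword documented for the Dreamlike Diffusion base model; verify the keyword for your style on the styles page:

```python
import torch
from diffusers import StableDiffusionPipeline

# Assumption: the trained weights were exported in diffusers format to this path
pipe = StableDiffusionPipeline.from_pretrained(
    "./dreambooth_dariawind", torch_dtype=torch.float16
).to("cuda")

# Subject name from training plus the style's trigger keyword
image = pipe("dreamlikeart, a portrait of DariaWind, a woman").images[0]
image.save("result.png")
```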

If you don’t like the result, try one of the following methods (the sketch after the list shows how they map onto standard generation parameters):

1. Add a negative prompt.

2. Add more keywords to the prompt (you can follow one of our tips and use the tools described in our previous article).

3. Change the seed (each seed gives a unique result).

4. Change the image size (for portraits a vertical resolution like 512x640 or 512x704 is advised).
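
All four of these fixes correspond to standard generation parameters. Continuing the inference sketch above (the values are illustrative):

```python
import torch

generator = torch.Generator("cuda").manual_seed(1234)           # 3: change the seed
image = pipe(
    "dreamlikeart, a portrait of DariaWind, a woman, "
    "intricate details, sharp focus",                           # 2: more keywords
    negative_prompt="blurry, deformed, extra fingers, lowres",  # 1: negative prompt
    width=512, height=704,                                      # 4: vertical portrait size
    generator=generator,
).images[0]
image.save("result_v2.png")
```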

One more tip for improving the results

If you get images in the desired style but the person is not recognizable, you can try improving the results with variations. Upscale the image you like (hover over it and press Upscale in the menu) up to 1024x1024; this will be our Start Image. Copy the Stable Diffusion node, connect the image from Upscale to the Start image input, and press Start.

Now you have a higher-quality image with a more recognizable person.
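
In diffusers terms, this Start Image step is essentially image-to-image generation: the upscaled picture seeds the generation, and a moderate strength keeps its composition while the trained model sharpens the likeness. A sketch under that assumption, reusing the hypothetical paths from above:

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "./dreambooth_dariawind", torch_dtype=torch.float16  # hypothetical path
).to("cuda")

init = Image.open("upscaled_result.png").convert("RGB")  # the 1024x1024 upscale
image = pipe(
    "dreamlikeart, a portrait of DariaWind, a woman",
    image=init,
    strength=0.5,  # lower values stay closer to the start image
).images[0]
image.save("variation.png")
```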

Still getting bad results? Perhaps your chosen style requires a trigger keyword. You can double-check this by finding the style on our AI Library page. We also recommend checking whether your selected training images meet the requirements.

If you still have problems with the result, you can reach out to us through social media (Twitter, Telegram, Discord), and we will be glad to help you with training and share our knowledge! :)
