Diffusion mini-summaries #1 — DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Gowthami Somepalli
ML Summaries
3 min read · Jan 10, 2023

DreamBooth: Assign a rare sequence of tokens as the subject’s identifier and fine-tune the diffusion model on a small set of images of that subject.
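A minimal sketch of the prompt side of this idea: the snippet below builds one prompt that contains the identifier and one that contains only the class noun. The function name `build_prompts`, the exact template, and the placeholder identifier `sks` are illustrative assumptions, not names taken from the paper.

```python
def build_prompts(identifier: str, class_noun: str) -> tuple[str, str]:
    """Return (instance_prompt, class_prompt) for one subject.

    The identifier is a rare token sequence; the class noun is the
    coarse category of the subject (for example "dog").
    """
    instance_prompt = f"a {identifier} {class_noun}"  # prompt with the identifier
    class_prompt = f"a {class_noun}"                  # same prompt without it
    return instance_prompt, class_prompt


# "sks" is only a placeholder identifier chosen for this example.
instance_prompt, class_prompt = build_prompts("sks", "dog")
print(instance_prompt)
print(class_prompt)
```

The subject images would be paired with the first prompt during fine-tuning, while the second prompt describes the generic class.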

Paper: https://arxiv.org/abs/2208.12242

The authors use the Imagen model in this paper. Imagen encodes the text prompt with a T5-XXL language model, generates a small 64x64 image first, and then uses super-resolution (SR) models to upscale it to 1024x1024.

The authors observed that fine-tuning all the modules (including the super-resolution module) gives the best performance.

To prevent the model from overfitting to the small input dataset, the authors use a prior-preservation loss, which is like “distillation from the original pretrained model”: the model is also supervised on samples that the pretrained model generated before fine-tuning. In the loss, “c” is the prompt with the subject’s identifier and c_pr is the prompt without it.
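For reference, here is a reconstruction of the loss in the paper's notation. The symbol names follow the paper as closely as possible, so check the PDF for the exact form. Here \hat{x}_\theta is the denoising model, x is a subject image, x_pr is an image generated by the pretrained model from the prompt c_pr, \epsilon and \epsilon' are Gaussian noise, \alpha_t, \sigma_t and w_t come from the noise schedule, and \lambda weights the prior-preservation term.

```latex
\mathbb{E}_{x,\, c,\, \epsilon,\, \epsilon',\, t}
\Big[
  w_t \,\big\lVert \hat{x}_\theta(\alpha_t x + \sigma_t \epsilon,\; c) - x \big\rVert_2^2
  \;+\;
  \lambda \, w_{t'} \,\big\lVert \hat{x}_\theta(\alpha_{t'} x_{pr} + \sigma_{t'} \epsilon',\; c_{pr}) - x_{pr} \big\rVert_2^2
\Big]
```

The first term is the usual reconstruction loss on the subject images. The second term asks the fine-tuned model to keep reproducing what the pretrained model generated for the plain class prompt.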

So the overall fine-tuning pipeline is as follows. In the pipeline figure, the first yellow module corresponds to the first loss term (reconstruction of the subject images) and the second yellow module corresponds to the second loss term (prior preservation) of the objective described above. Once the text-to-image part is fine-tuned on the input set of images, the authors then fine-tune the SR module.
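The way the two loss terms combine can be sketched in a few lines of plain Python. This is a toy illustration under simplifying assumptions, not the authors' implementation: it uses a mean squared error on flat lists of floats, leaves out the timestep-dependent weights, and the names `mse`, `dreambooth_loss`, and `prior_weight` are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(subject_pred, subject_target,
                    prior_pred, prior_target, prior_weight=1.0):
    """Subject reconstruction term plus a weighted prior-preservation term."""
    # First term: predictions for the subject images, prompt with the identifier.
    subject_term = mse(subject_pred, subject_target)
    # Second term: predictions for images the pretrained model generated
    # from the prompt without the identifier.
    prior_term = mse(prior_pred, prior_target)
    return subject_term + prior_weight * prior_term


# Toy numbers standing in for flattened model outputs and targets.
loss = dreambooth_loss(
    subject_pred=[0.5, 1.0], subject_target=[0.0, 1.0],
    prior_pred=[2.0, 2.0], prior_target=[1.0, 2.0],
    prior_weight=0.5,
)
print(loss)
```

Setting `prior_weight` to 0 reduces this to plain fine-tuning on the subject images, which is the setting where overfitting becomes a concern.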

The fine-tuned model is quite versatile: it can change the background or context around the subject, modify the subject itself, change the style, and so on.

The prior-preservation loss is important. Without it, the model outputs only the “subject” whenever it is prompted with the corresponding class noun.

Failure cases: the method might not work on rare categories, the model might overfit and reproduce the training images in some instances, and the subject’s appearance might change depending on the text prompt.

Some useful resources:
Fine-tuning your own DreamBooth model using Hugging Face: https://huggingface.co/docs/diffusers/training/dreambooth
Unofficial implementation for SD: https://github.com/XavierXiao/Dreambooth-Stable-Diffusion

Ph.D. student in Computer Science at University of Maryland. Find me at: https://somepago.github.io/