[ML Story] DreamBoothing Your Way into Greatness

Published in Google Developer Experts · 4 min read · Mar 22, 2023

*Image from* *https://unsplash.com/ko/%EC%82%AC%EC%A7%84/Zwvxj3ytTHc*

With the advent of text-to-image generation models (like DALL-E 2, Stable Diffusion, Imagen, and Parti), the definition of “what’s possible” has shifted considerably. Although these models offer impressive capabilities across different facets of creativity, they often lack subject-specific personalization.

Consider the following image as an example:

With existing systems, it’s challenging to generate the same subject in different contexts while maintaining fidelity and fine-grained details. Here is an example (but for a different subject):

Image from https://dreambooth.github.io/

Even with expensive iterations of fine-tuning, these models fail to produce high-quality generations in targeted and personalized contexts.

Enter DreamBooth

Thankfully, there’s an (inexpensive) way to solve this problem — DreamBooth! DreamBooth was proposed by Ruiz et al. in DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation (CVPR 2023).

DreamBooth introduces a way to steer the generation of these models toward highly specific contexts that closely align with the given subject. Here is one such example:

To learn more about DreamBooth, check out the original website. I also encourage you to explore the different use cases made possible by this powerful technique.
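For a rough feel of what DreamBooth optimizes, here is a minimal sketch of the objective as I understand it from the paper: a denoising loss on the few subject images, plus a weighted class-prior preservation term computed on images that the original model generates for the generic class. The function names, the plain-list inputs, and the weight value are assumptions made for illustration; this is not code from any DreamBooth implementation.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally sized vectors."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(
    instance_pred: Sequence[float],
    instance_target: Sequence[float],
    prior_pred: Sequence[float],
    prior_target: Sequence[float],
    prior_weight: float = 1.0,  # illustrative default, an assumption
) -> float:
    """Subject (instance) loss plus weighted class-prior preservation loss."""
    instance_term = mse(instance_pred, instance_target)
    prior_term = mse(prior_pred, prior_target)
    return instance_term + prior_weight * prior_term


# Toy numbers standing in for flattened model outputs and their targets.
loss = dreambooth_loss(
    instance_pred=[0.5, 1.0],
    instance_target=[0.0, 1.0],
    prior_pred=[1.0, 1.0],
    prior_target=[1.0, 0.0],
    prior_weight=0.5,
)
print(loss)
```

The second term is what keeps the model from forgetting what a generic member of the class looks like while it learns the specific subject.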

Open-sourced DreamBooth

DreamBooth doesn’t have an official public implementation. However, given its effectiveness, the community soon started applying the DreamBooth training technique to Stable Diffusion, the large open-source text-to-image diffusion model.

Amongst the open-source implementations of DreamBooth, the one provided by the Hugging Face team is quite popular:

diffusers/examples/dreambooth at main · huggingface/diffusers (github.com)

It supports easy customization and various optimization techniques such as LoRA, CPU offloading, gradient checkpointing, and more. Soon after Hugging Face released the training script, the community began using it in creative ways. Consider the following personalization of Mr. Potato Head:

Prompt used to generate these images: *“A photo of sks mr potato head in a river”*
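The prompt above follows the pattern DreamBooth relies on: a rare identifier token (here `sks`) placed before the subject's class noun, followed by the new context. Below is a minimal sketch of composing such prompts. The helper name `build_prompt` and its parameters are hypothetical and not part of diffusers or KerasCV, and the fine-tuning prompt shown is an assumed form rather than the exact one used for these images.

```python
def build_prompt(identifier: str, class_noun: str, context: str = "") -> str:
    """Compose 'a photo of <identifier> <class noun> [context]'."""
    parts = ["a photo of", identifier.strip(), class_noun.strip()]
    if context.strip():
        parts.append(context.strip())
    return " ".join(parts)


# Assumed form of the prompt used during fine-tuning: identifier + class noun.
instance_prompt = build_prompt("sks", "mr potato head")

# Prompt used at generation time: same subject, new context.
generation_prompt = build_prompt("sks", "mr potato head", "in a river")

print(instance_prompt)
print(generation_prompt)
```

Keeping the identifier and class noun fixed while varying only the context is what lets the fine-tuned model place the same subject in new scenes.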

These community results inspired me to think about implementing a similar script in TensorFlow using KerasCV’s implementation of Stable Diffusion:

Keras documentation: High-performance image generation using Stable Diffusion in KerasCV (keras.io)

Implementation in TensorFlow

Besides being one of the few readable open-source implementations of DreamBooth, Hugging Face’s script also makes excellent educational material.

Chansung Park and I collaborated on this project. We tried to follow the same design principles while implementing it in TensorFlow. After we felt confident about it, we open-sourced it:

GitHub - sayakpaul/dreambooth-keras: Implementation of DreamBooth in KerasCV and TensorFlow. (github.com)

Soon after the implementation was open-sourced, we also decided to blog about our learning journey and published it:

Keras documentation: DreamBooth (keras.io)

Since diffusers provides tools to further optimize different aspects of the generation process, we also built a tool that converts KerasCV Stable Diffusion checkpoints into a format compatible with the StableDiffusionPipeline provided by diffusers. Learn more about it here:

Using KerasCV Stable Diffusion Checkpoints in Diffusers (huggingface.co)

We ran many experiments and included all their details in our GitHub repository. We encourage you to check those out.

Keras DreamBooth Hackathon

My teammates at Hugging Face were happy to see this implementation, and we decided to collaborate with Google to host a community sprint dedicated to it:

keras-dreambooth (Keras Dreambooth Event)