Stable Diffusion Beats DALL-E by a Large Margin

Surya Chandana
3 min read · Sep 23, 2022


Stable Diffusion: an open-source alternative to OpenAI's DALL-E

Introduction

Diffusion models (DMs) built from denoising autoencoders achieve state-of-the-art synthesis results on image data, and they can be applied directly to image-modification tasks such as inpainting without retraining. However, because these models typically operate directly in pixel space, training a powerful DM often consumes hundreds of GPU-days, and inference is expensive due to its sequential evaluations. To enable DM training on limited computational resources while retaining quality and flexibility, Stable Diffusion applies the diffusion process in the latent space of a powerful pretrained autoencoder.
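
As a rough illustration of the latent-space idea, here is a minimal sketch assuming a hypothetical pretrained autoencoder `vae` with `encode()`/`decode()` methods (the method names and tensor shapes are assumptions, not a real library API). The point is that the expensive diffusion steps operate on a small latent tensor rather than full-resolution pixels.

```python
import torch

def pixel_to_latent(vae, image: torch.Tensor) -> torch.Tensor:
    """Compress an image (e.g. 3x512x512) into a small latent (e.g. 4x64x64)."""
    with torch.no_grad():
        return vae.encode(image)   # hypothetical encode() API

def latent_to_pixel(vae, latent: torch.Tensor) -> torch.Tensor:
    """Decode a (denoised) latent back into a full-resolution image."""
    with torch.no_grad():
        return vae.decode(latent)  # hypothetical decode() API

# All diffusion steps run on the latent, not the image, which is what
# makes training and inference tractable on consumer GPUs.
```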


Image generated with DreamStudio

What is Stable Diffusion?

Stable Diffusion is a text-to-image AI model that empowers billions of people to create stunning art within seconds. It is a breakthrough in speed and quality, and it can run on consumer GPUs. You can see some of the amazing output this model has created, without pre- or post-processing, on this page.

Diffusion models consist of two steps:

Forward Diffusion: maps data to noise by gradually perturbing the input. It is a simple stochastic process that starts from a data sample and iteratively generates noisier samples using a Gaussian kernel. Forward diffusion is used only during training; it plays no role at inference time (a concrete sketch follows this list).

Parametrized Reverse: undoes the forward diffusion. It is a learned process that performs iterative denoising, converting random noise into realistic data, and it is the step actually used for data synthesis.
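
To make the forward step concrete, here is a minimal sketch in the DDPM formulation; the step count, noise schedule, and tensor shapes are illustrative assumptions, not Stable Diffusion's exact settings.

```python
import torch

# Forward diffusion: q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I),
# so a sample at noise level t can be drawn in closed form from clean data.

T = 1000                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)      # linear noise schedule (DDPM default)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)  # cumulative product: abar_t

def forward_diffusion(x0: torch.Tensor, t: int) -> torch.Tensor:
    """Perturb clean data x0 into a noisier sample x_t with a Gaussian kernel."""
    eps = torch.randn_like(x0)             # standard Gaussian noise
    abar_t = alpha_bars[t]
    return abar_t.sqrt() * x0 + (1.0 - abar_t).sqrt() * eps

# Example: noise a dummy "image" to step t=500.
x0 = torch.randn(1, 3, 64, 64)             # placeholder data sample
x_t = forward_diffusion(x0, t=500)
```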

The model itself builds upon the widely used latent diffusion work of the teams at CompVis and Runway, combined with insights from the conditional diffusion models of Katherine Crowson (Stability AI's lead generative AI developer), DALL-E 2 by OpenAI, Imagen by Google Brain, and many others.

DreamStudio is a new suite of generative media tools that makes it easy for everyone to unleash their creativity, turning natural-language prompts and intuitive input controls into finished artwork.

You can test Stable Diffusion on Hugging Face for free. It is also an open-source project, so you can run it yourself by downloading the code from GitHub.
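
For example, running it locally might look like the following minimal sketch, which uses Hugging Face's diffusers library; the checkpoint ID and the availability of a CUDA GPU are assumptions.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the pretrained Stable Diffusion weights from the Hugging Face Hub.
# "CompVis/stable-diffusion-v1-4" is the checkpoint ID assumed here.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,   # half precision to fit on consumer GPUs
)
pipe = pipe.to("cuda")           # assumes a CUDA-capable GPU is available

# Generate an image from a text prompt and save it.
prompt = "a photograph of an astronaut riding a horse"
image = pipe(prompt).images[0]
image.save("astronaut.png")
```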

Stable Diffusion vs. DALL-E

A key difference between Stable Diffusion and other text-to-image systems such as Google's Imagen is that its generation is reproducible: given the same prompt and the same random seed, the model produces the same image every time (as the sketch below shows). This makes it easier to create stunning art, because you can iterate on a prompt without losing a result you like, rather than spending hours tweaking images before posting them online or printing them on T-shirts or anything else you want to display your work on.
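
For example, fixing the random seed makes generation reproducible. A sketch reusing the `pipe` object from the snippet above (the prompt and seed are arbitrary):

```python
import torch

# Passing the same seeded generator with the same prompt should produce
# the same image on repeated runs.
generator = torch.Generator(device="cuda").manual_seed(42)
image_a = pipe("a castle in the clouds", generator=generator).images[0]

generator = torch.Generator(device="cuda").manual_seed(42)  # reset the seed
image_b = pipe("a castle in the clouds", generator=generator).images[0]
# image_a and image_b match, so results can be shared and regenerated.
```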

Unlike the learned components of DALL-E and other methods, the forward diffusion step itself requires no gradient descent or backpropagation: it is a fixed stochastic process with no learnable parameters. Training applies only to the reverse, denoising network.

This setup has a practical advantage over adversarial training: there is no unstable min-max game to balance. The denoising network's parameters are found by maximizing the log-likelihood of the data (more precisely, a variational lower bound on it), which in practice reduces to predicting the noise that was added during forward diffusion.
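
Concretely, in the standard DDPM objective the log-likelihood bound simplifies to a plain mean-squared error on the predicted noise. A sketch, assuming a hypothetical noise-prediction network `model(x_t, t)` and the `alpha_bars` schedule from the earlier snippet:

```python
import torch
import torch.nn.functional as F

def diffusion_training_loss(model, x0: torch.Tensor, alpha_bars: torch.Tensor):
    """Simplified DDPM loss: predict the noise added in the forward pass."""
    batch = x0.shape[0]
    t = torch.randint(0, len(alpha_bars), (batch,))     # random timestep per sample
    eps = torch.randn_like(x0)                          # the noise to be predicted
    abar = alpha_bars[t].view(batch, 1, 1, 1)
    x_t = abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps  # forward diffusion sample
    eps_pred = model(x_t, t)                            # network guesses the noise
    return F.mse_loss(eps_pred, eps)                    # maximizing likelihood ~ minimizing MSE
```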

Conclusion

I believe the Stable Diffusion model is better than OpenAI's DALL-E in some ways, but both are good models in their own right.



Surya Chandana

Machine learning engineer, living and learning in artificial intelligence. You can find me on LinkedIn: https://www.linkedin.com/in/chandana-surya-332872193