VQ-Diffusion: Microsoft’s New Text-To-Image AI Tool

Jim Clyde Monge
Published in CodeX · 3 min read · Aug 18, 2022

Microsoft’s VQ-Diffusion text-to-image AI model
Image created by Jim Clyde Monge

We’re on the cusp of a new era of artistic expression, one in which AI tools let us create realistic and artistically impressive images with unprecedented ease.

With these AI tools at our disposal, the sky’s the limit when it comes to creating stunning visual masterpieces.

Many tech giants have recently announced their own text-to-image tools.

Now Microsoft is joining the party with its own text-to-image AI tool: VQ-Diffusion.

What Is VQ-Diffusion?

From the project’s official GitHub page:

Vector Quantized Diffusion Model is based on a VQ-VAE whose latent space is modeled by a conditional variant of the recently developed Denoising Diffusion Probabilistic Model (DDPM). It produces significantly better text-to-image generation results when compared with Autoregressive models with similar numbers of parameters. Compared with previous GAN-based methods, VQ-Diffusion can handle more complex scenes and improve the synthesized image quality by a large margin.
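To make that description a little more concrete, here is a toy sketch of the two ideas it combines: turning continuous vectors into discrete codebook indices (the "vector quantized" part) and gradually corrupting those indices with noise (the "diffusion" part). This is not Microsoft's code. The codebook, the function names `quantize` and `corrupt`, the probabilities, and the specific keep/mask/resample noise rule are all assumptions chosen for illustration, and the learned denoising network that would reverse the corruption is left out entirely.

```python
import random


def quantize(vector, codebook):
    """Return the index of the codebook entry nearest to `vector`
    (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))


def corrupt(tokens, num_codes, keep_prob, mask_prob, rng):
    """One toy forward-noising step on discrete tokens (assumed rule).

    Each token is kept with probability `keep_prob`, replaced by a special
    mask id (equal to `num_codes`) with probability `mask_prob`, and
    otherwise resampled uniformly from the codebook indices.
    Tokens that are already masked stay masked.
    """
    mask_id = num_codes
    out = []
    for tok in tokens:
        if tok == mask_id:
            out.append(mask_id)
            continue
        u = rng.random()
        if u < keep_prob:
            out.append(tok)
        elif u < keep_prob + mask_prob:
            out.append(mask_id)
        else:
            out.append(rng.randrange(num_codes))
    return out


# Hypothetical 2-D codebook with four entries, and three "encoder outputs".
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
latents = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]

# Each latent vector becomes the index of its nearest codebook entry.
tokens = [quantize(v, codebook) for v in latents]

# Apply one noising step with a fixed seed so the run is repeatable.
rng = random.Random(0)
noisy = corrupt(tokens, num_codes=len(codebook),
                keep_prob=0.7, mask_prob=0.2, rng=rng)
print(tokens, noisy)
```

In a real system of this kind, a neural network would be trained to undo many such noising steps, conditioned on the text prompt, and a decoder would turn the recovered indices back into pixels.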

Discrete Diffusion Framework
