What is DALLE 2? What to Know Before Trying the Groundbreaking AI

How DALL·E 2 transforms text into hyper-accurate images, and how it became an instant phenomenon

Jeremy DiBattista
Geek Culture


Image by Author — Darth Vader in the style of Andy Warhol — Created with DALLE 2

Have you ever been at a party and overheard people talking about the latest advances in Machine Learning? Yeah, me neither, until a few weeks ago, when I overheard a large group of people passing around pictures from a new algorithm that could turn any text into a picture, and EVERYBODY wanted to try making their own. The algorithm I am talking about is, of course, DALLE 2.

If you keep up with the ML space, you have no doubt already heard about, and possibly even used, DALLE 2, but algorithms rarely escape the tech field the way DALLE has. This raises the question: what is DALLE 2? What makes it so special? And most importantly, what can it do, and what CAN’T it do?

What is DALLE 2 (The simple answer)?

DALLE 2 is a text-to-image AI system built on CLIP, a model that connects text to images. It is an encoder-decoder model: the text you provide is encoded into a numerical representation the machine can work with, processed by the model, and then passed through a decoder that turns that representation into a visible image. In the image below, this is the bottom pathway.
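To make the encode-then-decode idea concrete, here is a deliberately tiny Python sketch. It is not how DALLE 2 works internally: the "encoder" is just a hash of the prompt and the "decoder" just reshapes numbers into a 4x4 grayscale grid. The names `encode_text`, `decode_image`, `generate`, and `EMBED_DIM` are all made up for illustration. The only thing it shares with the real system is the shape of the pipeline: text goes in, becomes a vector of numbers, and that vector is turned into pixels.

```python
import hashlib
from typing import List

EMBED_DIM = 16  # toy embedding size, picked arbitrarily for this illustration


def encode_text(prompt: str) -> List[float]:
    """Toy 'encoder': map a prompt to a fixed-length vector of floats in [0, 1].

    A real text encoder is a trained neural network; a hash is used here only
    so that the same prompt always produces the same vector.
    """
    digest = hashlib.sha256(prompt.lower().encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:EMBED_DIM]]


def decode_image(embedding: List[float], size: int = 4) -> List[List[int]]:
    """Toy 'decoder': turn the vector into a size x size grid of 0-255 pixels."""
    pixels = [round(value * 255) for value in embedding]
    return [pixels[row * size:(row + 1) * size] for row in range(size)]


def generate(prompt: str) -> List[List[int]]:
    """Full toy pipeline: text -> vector -> image grid."""
    return decode_image(encode_text(prompt))


image = generate("Darth Vader in the style of Andy Warhol")
for row in image:
    print(row)
```

The resulting grid is meaningless noise, of course. What a trained model adds is an encoder and decoder whose numbers actually capture what the prompt means, which is where all of the hard work (and the magic) lives.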




I am a Machine Learning Engineer at Spiny.ai. I spend my free time exploring problems in data science, ML, and Python!