OpenAI’s ‘DALL-E’ Generates Images From Text Descriptions

ExtremeTech
Jan 6 · 3 min read

by Ryan Whitwam

Artificial intelligence has gotten very good at some things; it’s even approaching human capability when it comes to recognizing objects and generating text. What about art? OpenAI has devised a new neural network called DALL-E (a blend of the artist Salvador Dalí and the beloved Pixar robot WALL-E). All you need to do is give DALL-E a text description, and it can draw an image for you. Sometimes the renderings are little better than fingerpainting, but other times they’re startlingly accurate portrayals.

OpenAI has made news lately for its GPT neural networks, which are sometimes referred to as “fake news generators” because of how convincingly they can fabricate text to support the input. GPT-3 showed that large neural networks can complete complex linguistic tasks. The team wanted to see how well such an AI could move between text and images. Like GPT-3, DALL-E supports “zero-shot reasoning,” allowing it to generate an answer from a description and cue without any additional training. It is also a transformer language model like GPT-3, but it can accept both text and images as input. DALL-E doesn’t need precise values and instructions like a 3D rendering engine; its past training allows it to fill in the blanks and add details that aren’t stated in the request.

Case in point: ask for baby penguins wearing Christmas sweaters and playing the guitar. You don’t need to say the penguin has a Santa hat; DALL-E comes up with that detail on its own in several renderings.

DALL-E also has a better understanding of objects in context compared with other AI artists. For example, you can ask DALL-E for a picture of a phone or vacuum cleaner from a specified period of time, and it understands how those objects have changed. Well, at least generally. Some of the images will have buttons in the wrong place or a bizarre shape. But these are all rendered from scratch by the AI.

That whimsical streak helps DALL-E combine multiple concepts in fascinating ways. When asked to merge a snail and a harp, it comes up with some clever variations on the theme. With more straightforward instructions such as “draw an emoji of a lovestruck avocado,” you get some artful and rather adorable options that Unicode should look at adding to the official emoji list.

The team also showed that DALL-E can combine text instructions and a visual prompt. You can feed it an image and ask for a modification of that same image. For instance, you could show DALL-E a cat and ask for a sketch of the cat. You can also have DALL-E add sunglasses to the cat or make it a different color.

OpenAI has a page where you can play around with some of the more interesting input values. The model is still fairly limited, but this is just the start. OpenAI plans to study how DALL-E could impact the economy (add illustrators to the list of jobs threatened by AI) and the possibility for bias in the outputs.

Originally published at https://www.extremetech.com on January 6, 2021.
