DALL·E mini: The AI image generator taking over the Internet

Doctor Yak

Published in

The Yak

4 min read · Jun 18, 2022

A playground for the surreal, built from artificial intelligence

“He nonchalantly laughed at CEO of Maryland cookies, and no one could doubt that he was indeed the culprit of the great cookie decimation.” Credit: Own creation using DALL·E mini

Over the last few weeks, the free website and tool DALL·E mini has taken over the internet. Anyone can go on the site, type in whatever prompt their imagination desires and hit the “Run” button, with results ranging from the hilarious to the unearthly. From a Demogorgon playing basketball to a travel poster of Asgard, the range of AI-generated images is gargantuan.

A humongous potential for entertainment: Godzilla in Victorian England. Credit: DALL·E mini

The tool generates nine images for each prompt, with varying levels of accuracy. They are rarely as photorealistic as the output of the eerie This Person Does Not Exist fake person generator. Nevertheless, the meme-worthy generations are rarely uninteresting.

Mark Zuckerberg as a ventriloquist’s dummy. Credit: DALL·E mini

The project is the brainchild of a machine learning engineer called Boris Dayma, hailing from Houston.

“Being able to create an image that looks like what you wanted, on the technical level, to me, it was very interesting. I want to be able to try it out myself, and I want to be able to let other people use it.” Boris Dayma, creator of DALL·E mini

A travel poster for Narnia. Credit: DALL·E mini

Dayma was inspired by the original DALL·E, an OpenAI program which converts text to images, and wanted to create a version which is open to the public. Alongside a team of contributors, Dayma sought to develop a community-driven project.

The tool works on the same pattern-recognition principle as other AI systems: it processes images and their captions, detects associations between them and gradually improves with training. The program first needs to interpret the text prompt (an encoding process) and then generate images which accurately represent that interpretation (a decoding process).

The model is based on BART, a sequence-to-sequence architecture. The text prompt is tokenized and passed through the BART encoder, and the BART decoder then predicts a sequence of image tokens. A VQGAN model decodes these tokens, reconstructing the pixels of the final image.
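To make the encode-then-decode idea a little more concrete, here is a toy sketch in Python. It is not DALL·E mini's code and contains no neural networks: every function name, the tiny 4×4 grid and the 16-entry codebook are assumptions made up for illustration. A hash-seeded random generator stands in for the trained text model, and a lookup table of colours stands in for the VQGAN decoder. It only shows the shape of the pipeline: prompt, then text tokens, then a grid of image tokens, then pixels.

```python
import hashlib
import random

CODEBOOK_SIZE = 16  # toy value (assumption); real codebooks are far larger
GRID = 4            # toy 4x4 grid of image tokens (assumption)


def tokenize(prompt):
    """Split a prompt into lowercase words (stand-in for a real tokenizer)."""
    return prompt.lower().split()


def text_to_image_tokens(tokens):
    """Stand-in for the trained text model.

    Deterministically maps the text tokens to a GRID x GRID grid of
    codebook indices. A real model learns this mapping from captioned
    images; here it is just a pseudo-random function of the prompt.
    """
    digest = hashlib.sha256(" ".join(tokens).encode("utf-8")).hexdigest()
    rng = random.Random(int(digest, 16))
    return [[rng.randrange(CODEBOOK_SIZE) for _ in range(GRID)]
            for _ in range(GRID)]


def build_codebook():
    """Stand-in for a learned codebook: one RGB colour per index."""
    rng = random.Random(0)
    return [(rng.randrange(256), rng.randrange(256), rng.randrange(256))
            for _ in range(CODEBOOK_SIZE)]


def decode_to_pixels(image_tokens, codebook):
    """Stand-in for the image decoder: look up a colour for each token."""
    return [[codebook[code] for code in row] for row in image_tokens]


def generate(prompt):
    """Run the whole toy pipeline: prompt -> tokens -> image tokens -> pixels."""
    tokens = tokenize(prompt)
    image_tokens = text_to_image_tokens(tokens)
    return decode_to_pixels(image_tokens, build_codebook())


if __name__ == "__main__":
    for row in generate("Godzilla in Victorian England"):
        print(row)
```

The same prompt always yields the same grid of colours here, and a different prompt yields a different one. In the real system the intermediate image tokens are predicted by a trained network and each token decodes to a patch of a much larger image, which is where the training described in this article comes in.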

The key element, as with machine learning in so many other areas, centres on the honing of this process with training… Fascinatingly, the website has been available for a year, but only over the last fortnight has its popularity skyrocketed, with many users struggling to generate images because the site is overloaded with traffic. The increase in users is no doubt linked to the improvement of the program as the model is trained to produce better images.

“Over time, it becomes better and better”

Capybara in a maid costume. Credit: DALL·E mini

The platform is not without its limitations, and this is made very clear by the creator on the DALL·E mini website:

Bias and Limitations, from the DALL·E mini website

A major concern with AI technology, whatever its use, is that it can propagate biases and promote disinformation. Dayma hopes that people become aware of this as they get more practised with artificial intelligence:

“People can learn about the limitations of the model, the biases… At least people can learn that that type of thing is coming, and now you need to be aware with the content that you see online. I hope it helps people develop their critical thinking.”

Surreal food for the imagination. Credit: DALL·E mini

Another question arises: if artificial intelligence is used to create images, can the final product really be thought of as art? The Alan Turing Institute seeks to explore the interface between AI, data science and the arts. Can an algorithm ever appreciate the social context of art, truly telling stories or making sense of the world…?