Are they really any good? Which one is best? Let’s compare Dall-E 2, Stable Diffusion, and MidJourney
“a fat dog sitting on a baby who is smoking a cigar, in the style of Annie Leibowitz” — Dall-E 2
Warning: This post will get very weird, very fast. If you work for really serious and boring people, then this post might be NSFW.
*** Read to the end and take the poll you find there!
Introduction
A.I., deep neural nets, and related fields are influencing humanity and society as we speak, right now, in real time. The second- and third-order effects of all of this stuff are going to be fascinating. If you had told a knowledgeable computer scientist ten years ago that an AI would be able to teach itself Go and beat a world champion, they might not have believed you. There are crazy things happening right now and it is tremendously exciting to watch.
In the Mobile Mapping field, AI is accelerating the general trend of automation, and of doing repetitive tasks which humans are pretty good at, but which machines can do faster and in a more scalable manner. We can now anonymize and label images better than ever.
In the autonomous driving field, the exact role and type of AI that will help to solve it is still very much up for debate. With billions of dollars flowing into this field from many fierce competitors, it’s something that we are going to see solved sooner or later. In two years or in twenty years? That’s the question.
Moore’s Law might be tapering off, but that doesn’t mean the acceleration of technology is stopping. The idea that you can write a sentence and get a beautiful picture is another one of those things which seemed impossible… just last year!
Unless you’ve been living under a rock, like this guy,
“a comic book drawing of a person living under a rock” — Dall-E 2
then you have seen the profusion of AI-generated imagery pouring out of the internet. If you have been living under a rock, maybe watch John Oliver’s video about it, which is really funny.
What is an AI image generator? Where is it coming from?
Here, dear Reader, we will elaborate on this topic, and show you a few samples.
It has been possible for a few years now for a neural network to generate an image based on a text prompt. But it was only this year, and really only a few months ago, that these neural networks became freakishly good, at least in some cherry-picked examples, at making interesting, cool, and even beautiful images.
There are, as of this writing, three tools that are publicly available in some form and capable of making some awesome illustrations and photos: Dall-E 2, Stable Diffusion, and MidJourney.
The oversimplified layman’s explanation of how they work? Engineers feed hundreds of millions of annotated images into a neural network, in order to “teach” this neural network how to generate a unique picture, given a unique text prompt. If you want to learn more about how it really works, please look somewhere else. This article is not about that. We are here to compare them. Call it an AI image generator death match, if you will.
I’m going to feed the same sentence into each of these three image generators, and we will see what comes out. In the end, we can discuss which one is the best.
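For the programmers in the audience, the procedure can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `run_comparison` and `make_stub` are hypothetical names, and the stub generators just return a text label. They are stand-ins, not the real API of Dall-E 2, Stable Diffusion, or MidJourney.

```python
from typing import Callable, Dict, List

# Assumption: a "generator" is any function that takes a prompt and returns
# something identifying the resulting image (here, just a text label).
Generator = Callable[[str], str]


def run_comparison(
    prompts: List[str], generators: Dict[str, Generator]
) -> Dict[str, Dict[str, str]]:
    """Feed every prompt to every generator and collect the results."""
    results: Dict[str, Dict[str, str]] = {}
    for prompt in prompts:
        results[prompt] = {}
        for name, generate in generators.items():
            try:
                results[prompt][name] = generate(prompt)
            except Exception as exc:
                # A generator may refuse a prompt; record that and move on.
                results[prompt][name] = f"refused: {exc}"
    return results


def make_stub(name: str) -> Generator:
    """Hypothetical stand-in for a real image generator."""
    def generate(prompt: str) -> str:
        return f"{name}: <image for '{prompt}'>"
    return generate


generators = {
    name: make_stub(name)
    for name in ("Stable Diffusion", "MidJourney", "Dall-E 2")
}
prompts = [
    "Photo of an apple on a white tablecloth",
    "Photo of a shark eating a cheeseburger",
]
results = run_comparison(prompts, generators)

for prompt, outputs in results.items():
    print(prompt)
    for name, image in outputs.items():
        print("  ", image)
```

The point of the structure is fairness: every generator sees exactly the same prompt text, and a refusal is recorded instead of stopping the whole run. To try it for real, you would swap each stub for a function that calls the actual service.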
A few caveats: both Dall-E 2 and MidJourney suffer from “American Protestant Censorship Syndrome”, where they are not allowed to do anything naughty. Not even a giant hairy eyeball crushing a city:
No nudity, no celebrities, no cannibalism, you get the picture.
Stable Diffusion, on the other hand, has no issue with that, so it is easy to create, for example, a Charles Manson / Donald Trump hybrid:
I mean honestly, one could amuse oneself all day generating AI images of Trump, right? Anyway, we digress.
OK, let’s get started
So, we’re going to feed the same sentence into each of the three AI image generators. You can decide what you like. At the end, please vote, and we’ll share the results later!
To keep things simple and fair, we will avoid any prompts that are banned by the previously mentioned American Puritanical AI bots (Dall-E 2, MidJourney).
We’ll start with easy, unimaginative stuff, and then we’ll get progressively more… weird.
Photo of an apple on a white tablecloth
Stable Diffusion
MidJourney
Dall-E 2
Photo of a person wearing a tuxedo, kicking a snowman
Stable Diffusion
MidJourney
Dall-E 2
Two beautiful policemen, male and female, standing in front of the Eiffel tower, arresting a capybara, in the style of Marc Chagall
Stable Diffusion
MidJourney
Dall-E 2
Painting of a sausage passionately arguing with a fish
Stable Diffusion
MidJourney
Dall-E 2
A reptilian president of the united states, laying on the beach. Hyper realistic photo
Stable Diffusion
MidJourney
Dall-E 2
An AI image generator being controlled by some weirdo, pencil sketch
Stable Diffusion
MidJourney
Dall-E 2
Photo of a flaming dog excreting pencils onto the Las Vegas skyline at sunset
Stable Diffusion
MidJourney
Dall-E 2
A fat baby smoking a cigar, in the style of Richard Avedon
Stable Diffusion
MidJourney
Dall-E 2
Photo of John Rambo eating a banana
Stable Diffusion
MidJourney
Dall-E 2
Photo of a dog-headed human taking a bath with a giant roach in a cowboy saloon
Stable Diffusion
MidJourney
Dall-E 2
D deity praising its subjects and ensuring global happiness
Stable Diffusion
MidJourney
Dall-E 2
Ultra realistic photo of a 250 year-old woman dancing with an 11 year-old boy
Stable Diffusion
MidJourney
Dall-E 2
Photo of a man with three eyes
Stable Diffusion
MidJourney
Dall-E 2
Dancing celery on top of a taco truck
Stable Diffusion
MidJourney
Dall-E 2
Two bananas in a violent confrontation
Stable Diffusion
MidJourney
Dall-E 2
Photo of a shark eating a cheeseburger
Stable Diffusion
MidJourney
Dall-E 2
Now, it’s time to vote.
Be sure to vote for the one that actually follows the instructions and also makes the best-looking images: