Prompt Design for DALL·E: Photorealism — Emulating Reality
Ongoing list with modifiers and tips for image generation
First of all, yes, with DALL·E you can achieve stunning photorealistic images. But the question is: what do you mean by the term “photorealism”?
Our perception is oversaturated by the media. We might expect “the same quality as in real life”. But that's, between you and me, a big lie. A good photograph doesn’t transfer reality from “real life” to photo paper or a digital file. Instead, it’s a staged reality: a specific angle, lighting, lens, etc.
In short, what you see is not “reality”; it is a photograph's interpretation of it.
With DALL·E we get an Artificial Interpretation of our world. To keep things simple here, let’s split “photorealism” into two approaches:
- Emulating Reality: making an image as convincing as possible (aligning with viewers' expectations and experiences)
- Emulating Medium: a meta-approach that simulates different photo techniques, cameras, and styles.
A realistic Lomography does not look photorealistic, but it should convince us of its “realism”. And DALL·E can do it.
What’s in a Prompt?
If you enter a Content Prompt without any modifiers, and the content is relatively objective or figurative, you will already get photorealistic images.
For example, entering “An Apple” will get a series of photorealistic apple images. No more and no less.
However, if you add the modifier “by Magritte”, this addition will dramatically change the entire character of the prompt:
Things become complicated if you try to create paradoxical images that most likely weren’t in the DALL·E training dataset, like
A cat driving a bicycle.
Here you see how DALL·E tries to reproduce your prompt but fails. You can help the AI by adding an artist modifier:
A cat driving a bicycle, an illustration by Michael Sowa.
Anthropomorphic animals are typical of book illustrations, so such a task is easy for DALL·E with the appropriate modifier.
Sure, everything is possible: with the right prompt, you can create a photograph of a cat driving a bicycle, for example, by adding the corrective modifier “but as photography”.
A cat driving a bicycle, an illustration by Michael Sowa, but as photography.
Now we have almost, if not wholly, achieved a photorealistic version of our vision:
- We created the content (cat on a bicycle)
- We let DALL·E fantasize about an unreal, absurd situation via the “illustration” trick
- We brought this weird vision back into the “photographic” realm with the final modifier.
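The three steps above amount to simple string composition. Here is a minimal sketch: the helper name `build_prompt` is hypothetical and not part of any DALL·E tooling. It only assembles the text you would type into the prompt field.

```python
def build_prompt(content: str, *modifiers: str) -> str:
    """Join a content prompt with optional modifiers, comma-separated.

    Hypothetical helper for illustration; it only builds the prompt text.
    """
    parts = [content.strip().rstrip(".")]
    # Skip empty modifiers so we never emit a dangling comma.
    parts += [m.strip() for m in modifiers if m.strip()]
    return ", ".join(parts) + "."


content = "A cat driving a bicycle"

# Step 1: content only.
step_1 = build_prompt(content)
# Step 2: the "illustration" trick via an artist modifier.
step_2 = build_prompt(content, "an illustration by Michael Sowa")
# Step 3: the corrective modifier pulls it back toward photography.
step_3 = build_prompt(content, "an illustration by Michael Sowa", "but as photography")

for prompt in (step_1, step_2, step_3):
    print(prompt)
```

The order of the modifiers mirrors the order of the steps: the corrective modifier comes last because it is meant to override the illustration look, not the content.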
But what about “photorealism”? About Emulation of Reality?
DALL·E users exchange ideas, observations, and experiences in our hidden Discord. One of the interesting discoveries by the DALL·E Discord community was the following:
If you add lens specifications as modifiers, you will get especially “photorealistic” images, typical of photo shoots with these specifications. Either the training dataset for DALL·E was very well labeled, or it even took the metadata in the image files into account.
Here are lens examples (thank you, Sharif).
Sigma 85mm f/1.4: a good portrait lens
Attention: due to TOS, we don’t publish photorealistic human portraits. But we can do it with animals and objects.
A portrait of a dog in a library, Sigma 85mm f/1.4
A bitten-into apple hanging on branch of an apple tree, Sigma 85mm f/1.4
A plastic cup on sidewalk of a big city, Sigma 85mm f/1.4
This is what “photorealism” looks like. You can literally see every hair in the dog's fur. And the library background dissolves into gorgeous bokeh.
Sigma 85mm f/8: greater depth of field and a sharper background (less bokeh)
A portrait of a dog in a library, Sigma 85mm f/8
A bitten-into apple hanging on branch of an apple tree, Sigma 85mm f/8
A plastic cup on sidewalk of a big city, Sigma 85mm f/8
Note, by the way, how the background shimmers through the translucent plastic cup.
Sigma 24mm f/8: wider angle, shorter focal length
A portrait of a dog in a library, Sigma 24mm f/8
A bitten-into apple hanging on branch of an apple tree, Sigma 24mm f/8
A plastic cup on sidewalk of a big city, Sigma 24mm f/8
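Comparing lenses works best when only the lens modifier changes and the content stays fixed. The sketch below builds such a grid of prompts from the subjects and lens specifications used so far. It only produces prompt strings, and the variable names are chosen here for illustration.

```python
from itertools import product

# Content prompts, kept identical across all lens variations.
subjects = [
    "A portrait of a dog in a library",
    "A bitten-into apple hanging on branch of an apple tree",
    "A plastic cup on sidewalk of a big city",
]

# Lens modifiers to compare, one group of images per lens.
lenses = ["Sigma 85mm f/1.4", "Sigma 85mm f/8", "Sigma 24mm f/8"]

# Outer loop over lenses, inner loop over subjects: 3 x 3 = 9 prompts.
prompts = [f"{subject}, {lens}" for lens, subject in product(lenses, subjects)]

for prompt in prompts:
    print(prompt)
```

Keeping the subject text byte-for-byte identical makes it easier to attribute differences between the images to the lens modifier alone.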
Sigma 24mm f/8, 1/10 sec shutter: motion blur, slower shutter speed
If you want to capture a subject in motion, this is the right setting.
Running dog in a library, Sigma 24mm f/8, 1/10 sec shutter
A bitten-into apple fluttering in the strong wind on branch of an apple tree, in motion blur, Sigma 24mm f/8, 1/10 sec shutter
A plastic cup is drifted by wind on sidewalk of a big city, Sigma 24mm f/8, 1/10 sec shutter
Note that, interestingly, DALL·E hesitated to blur the apple; we had to explicitly add “in motion blur” to get more movement. Probably there were not too many blurred apple images in the dataset (since we humans had previously sorted them out as “unsuccessful shots”).
Sigma 24mm f/8 1/1000 sec shutter: movement, but a sharp image, thanks to the faster shutter speed.
Running dog in a library, Sigma 24mm f/8 1/1000 sec shutter
A bitten-into apple, captured in the moment of falling down, Sigma 24mm f/8, 1/10 sec shutter
A plastic cup with liquid being captured in the moment of being overturned by wind on sidewalk of a big city, Sigma 24mm f/8 1/1000 sec shutter
Interestingly, in the case of the dog image, we see a kind of disintegration here: the image is sharp but loses its “photorealism”.
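Why 1/10 sec blurs and 1/1000 sec freezes comes down to simple arithmetic: the distance a subject travels while the shutter is open is its speed times the exposure time. Here is a rough sketch, assuming a running dog moves at about 5 m/s (an illustrative figure, not a measured one):

```python
from fractions import Fraction


def travel_during_exposure_cm(speed_m_per_s, exposure_s):
    """Distance in centimetres a subject covers while the shutter is open."""
    return float(speed_m_per_s * exposure_s * 100)


DOG_SPEED_M_PER_S = 5  # assumed speed of a running dog, for illustration only

for shutter in (Fraction(1, 10), Fraction(1, 1000)):
    distance_cm = travel_during_exposure_cm(DOG_SPEED_M_PER_S, shutter)
    print(f"{shutter} sec shutter: subject moves about {distance_cm:g} cm")
```

Under that assumption the dog covers about 50 cm during a 1/10 sec exposure, which smears it across the frame, but only about 0.5 cm at 1/1000 sec, which is why the faster shutter renders it sharp.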
Looking at photo metadata might give you more ideas for achieving the quality you want. For example, using this architectural setting, you can re-create convincing interior photos:
Interior of a bright apartment with bookshelves, paintings and window looking to the megapolis, Nikon D810 | ISO 64 | focal length 20mm (Voigtländer 20mm f3.5) | Aperture f/9 | Exposure Time 1/40 Sec (DRI)
Finding the proper settings.
On popular photo websites like Unsplash or Flickr, you can learn more about settings, since the metadata is often included in the image description. For example, this wonderful photo of Japanese Momiji:
According to Flickr, the following camera and settings were used:
So let’s try to reproduce the motif and settings:
Autumn Momiji, Nikon D810, ƒ/2.5, focal length: 85.0 mm, exposure time: 1/800, ISO: 200
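Turning metadata copied from a photo page into a prompt is mechanical, so it can be sketched as a small helper. The function name `prompt_from_metadata` and the list-of-pairs layout are assumptions made for this example. The values are the ones from the Momiji prompt above.

```python
def prompt_from_metadata(motif: str, fields: list[tuple[str, str]]) -> str:
    """Append camera settings to a motif as comma-separated modifiers.

    Each field is a (label, value) pair; an empty label means the value
    is appended on its own (e.g. the camera model or the aperture).
    """
    parts = [motif]
    for label, value in fields:
        parts.append(f"{label}: {value}" if label else value)
    return ", ".join(parts)


# Settings as read from the photo page, in the order they should appear.
momiji_settings = [
    ("", "Nikon D810"),
    ("", "ƒ/2.5"),
    ("focal length", "85.0 mm"),
    ("exposure time", "1/800"),
    ("ISO", "200"),
]

print(prompt_from_metadata("Autumn Momiji", momiji_settings))
```

A list of pairs is used instead of a dictionary so that unlabeled values such as the camera model can appear more than once and the order stays explicit.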
Or let’s create a photo of dancing people, like in this photo:
Dancing people, in the evening, with flash. (Attention: no photorealistic faces, please, so: “seen from the back”)
Dancing people in the evening, seen from back, sunset, Canon EOS 1000D, ƒ/3.5, Focal length: 18.0 mm, Exposure time: 1/5, ISO 400, Flash on.
If you want to create a night photo of a car with light streaks, you have to work with a long exposure time and the ISO:
A car passes the photographer in the night with lights, seen from outside, 24 mm, f8, 1.6 s, ISO 1000
Telephoto? Of course!
This wonderful moon photo was taken with the following settings:
Let’s try to make it more interesting and add a bird:
Photo of a moon with a bird flying in the foreground, Canon EOS Digital Rebel XTi, 100-300mm Canon f/5.6, Exposure time: 1/160, ISO 400
You can endlessly try out different lenses, apertures, and ISO values. The main thing is your idea and concept of what the image should look like.
Another great trick is using the modifier “studio light”.
Just compare the prompt “An Apple”
and the prompt “An Apple, Studio light”.
Even the most mundane and boring object (sorry, apple) becomes profound and visually striking.
I suppose there were so many studio photographs in the dataset that DALL·E by now knows how to create a perfect image.
We are still at the beginning. As you see, DALL·E can reproduce “photorealistic” images in many varied and interesting ways (in the sense of “emulated reality”).
This article will be updated — and new chapters will also be added (Follow me on Twitter at Merzmensch for updates).
In the next chapter, we will see if DALL·E can simulate different photo technology (spoiler: yes, it can).