A perfect Illustration with DALL·E.
Art of Prompt Design: it’s all about YOU.
Every day I get requests (since not everybody has DALL·E access yet) to create images from specific prompts. One of the most frequently asked questions is this:
Q. Can DALL·E create an illustration for a text if you input that text as the PROMPT?
My answer should be (even if it sounds rude):
A. No, that isn’t how a PROMPT works.
But things are more complicated than they seem.
Sure, DALL·E is Transformer-driven, so the model attends to every single part of the prompt you put in and tries to create a coherent image with its own inner logic. Sometimes you get some unusual pictures.
For example, for the verse from the song Home Sweet Home by Mötley Crüe:
You know that I’ve seen
Too many romantic dreams
Up in lights, falling off the silver screen.
DALL·E created this fine visual experience:
You don’t really recognize the topics from the textual prompt besides “light” or “silver”, but the vibe, the energy, is there.
Sometimes, if you just copy-paste the text, you won’t get what you would like to see.
In the case of Tragic Kingdom by No Doubt:
They pay homage to a king,
Whose dreams are buried in their minds,
His tears are frozen stiff,
Icicles drip from his eyes.
A typical result will contain a series of icicles, historical places, ancient kings, and tourist sights. Visually impressive, but not really what we are trying to achieve.
In such cases, you finally realize it:
It isn’t just about text. It’s about the vision. It’s about the concept, probably shimmering through the meta level.
In such cases it can help to reference an artist. For example, by adding “Painting by Salvador Dalí”, you will get a variety of visions:
Here you see kings, icicles, and eyes, but in such surreal juxtapositions that each of these images could already serve as a perfect illustration.
The same goes for literary works. By copy-pasting the final sentences of Hemingway’s The Old Man and the Sea
“Up the road, in his shack, the old man was sleeping again. He was still sleeping on his face and the boy was sitting by him watching him. The old man was dreaming about the lions”
you will get decent photorealistic images of lions. No old men, no boys here.
Does DALL·E fail here? Nope. We fail. The model’s attention lands on the lions, the most vivid image in the passage. We have to provide DALL·E with a proper PROMPT.
So, by applying different artists and by changing (simplifying) the prompts, we already get an emotionally loaded series of artworks, in our case in the styles of Dalí, J.C. Leyendecker, Magritte, and Botticelli:
For these images, I had to adjust the PROMPTs to get it all together.
If we now try different prompts, we get closer and closer to our goal:
The old man is sleeping, he has dream about lions — and a boy sits near the old man. An illustration by Shaun Tan.
Here the boy is still missing, and sometimes it’s the lion who is sleeping. (I chose Shaun Tan because he is a great illustrator.)
Let’s try it with Norman Rockwell and with another change of the prompt:
Now we see more coherence, but sometimes it’s another composition: a boy is sleeping in bed, and a lion is sitting near him.
If we try Norman Rockwell and “oil painting”, we get even better results:
The old man is sleeping in bed and having a dream about lions; and a boy sits near the bed of the old man. An oil painting by Norman Rockwell.
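The prompts tried so far all share one shape: a simplified scene description, followed by a short style sentence naming a medium and an artist. Here is a minimal sketch of that pattern in Python. The helper names build_prompt and prompt_variants are hypothetical and not part of DALL·E; the code only assembles prompt strings, and submitting them to the model is left out.

```python
from typing import List, Optional


def build_prompt(scene: str, artist: Optional[str] = None,
                 medium: Optional[str] = None) -> str:
    """Compose a prompt: a scene sentence plus an optional style sentence.

    Hypothetical helper for illustration; it only formats text.
    """
    # Normalize the scene so it ends with exactly one period.
    scene = scene.strip().rstrip(".") + "."
    if artist is None and medium is None:
        return scene
    # Fall back to a generic medium when only an artist is given.
    medium = medium or "illustration"
    article = "An" if medium[0].lower() in "aeiou" else "A"
    style = f"{article} {medium}"
    if artist:
        style += f" by {artist}"
    return f"{scene} {style}."


def prompt_variants(scene: str, artists: List[str],
                    mediums: List[str]) -> List[str]:
    """Return one prompt per (artist, medium) combination."""
    return [build_prompt(scene, a, m) for a in artists for m in mediums]


# The simplified scene from the Norman Rockwell prompt in this article.
SCENE = ("The old man is sleeping in bed and having a dream about lions; "
         "and a boy sits near the bed of the old man")

if __name__ == "__main__":
    artists = ["Shaun Tan", "Norman Rockwell", "Salvador Dalí"]
    mediums = ["illustration", "oil painting"]
    for prompt in prompt_variants(SCENE, artists, mediums):
        print(prompt)
```

Keeping the scene fixed while swapping the artist or medium makes it easier to see which change moved the image closer to the vision, and which one made the old man disappear.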
By changing the artist to J.C. Leyendecker, we get a variety of emotional imagery:
Or we can try Jacek Yerka:
…and suddenly the Old Man disappears:
And this is still not the end of the journey. After all, the best result would come from dropping style transfer and references to existing artists altogether, and creating the image with DALL·E’s own creative capacity alone (which is endless).
For example, this:
Because now you finally get the whole picture:
A congenial DALL·E illustration cannot be achieved by just reusing the original text as a prompt. You have to become a proactive artist. You have to become creative. You have to get a vision of the final result. And you have to apply your prompt design skills to achieve it.
DALL·E doesn’t mean the end of human creativity. It opens new ways for storytelling. And for creative human-machine collaboration.