Prompt Design for DALL·E 2: Series
When GPT-3 was released, one idea became obvious: Prompt Design is everything. As a language model, GPT-3 needed a specific approach. It was a communication between humans and AI. You needed a text; you had to inspire the AI. Neither force it nor request, but inspire and nudge the Transformer-driven model. You had to be familiar with the generative psychology of the machine if you wanted to get the results you were aiming for.
Another revelation soon became reality: Prompt Designer is a new profession, a new kind of occupation in the age of human-machine collaboration. It is transformative work at the intersection of technology, humanities, and arts.
With DALL·E, we not only understand this relevance. We SEE it. Shortly after OpenAI publicly released CLIP in Winter 2021, many CLIP-based Colab Notebooks appeared (thanks to Advadnoun and other brilliant researchers and artists). These researchers and artists discovered specific modifiers and prompt elements that influenced the image generation.
NOTE: A proper prompt consists of at least two parts:
Content and Modifier.
* Content describes the motifs you want to get from the AI model
* Modifier drives the visual features, character, and "vibe" of the image
For example: "A red apple in a hand, Lomography, black&white"
"A red apple in a hand" is the Content
"Lomography, black&white" is the Modifier
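To make the Content/Modifier split concrete, here is a minimal Python sketch. It only assembles the prompt text and does not call DALL·E or any other model. The helper name `build_prompt` is hypothetical, and joining the parts with commas is an assumption taken from the example above, not an official prompt format.

```python
from typing import Sequence


def build_prompt(content: str, modifiers: Sequence[str]) -> str:
    """Join a Content description and its Modifiers into one prompt string.

    Hypothetical helper: the comma-separated layout simply mirrors the
    example "A red apple in a hand, Lomography, black&white".
    """
    # Strip stray whitespace and skip empty modifiers.
    cleaned = [m.strip() for m in modifiers if m.strip()]
    return ", ".join([content.strip()] + cleaned)


# Content: the motif. Modifiers: the visual features and "vibe".
prompt = build_prompt("A red apple in a hand", ["Lomography", "black&white"])
print(prompt)  # A red apple in a hand, Lomography, black&white
```

Keeping Content and Modifiers separate like this makes it easy to reuse one motif with different modifier sets and compare the results.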
Examples of modifiers:
“trending on Artstation” made an image more “trendy” and “Zeitgeist”-like, and less “amateurish” (when it comes to art, aesthetics is an extensive discussion of its own). It evoked the vibe of the best images trending on ArtStation.
“Unreal Engine”, added to a prompt, made the visual completion more dimensional and crisp.
Adding an artist’s name created a “style transfer”: the generated image tended toward the named artist’s style and essence.
You could also combine the modifiers to get something special. For example, “Cat’s restaurant, in da Vinci style, trending on artstation” created this masterpiece for me in Disco Diffusion:
Or “Photorealistic Museum Hall, Unreal Engine” created a semi-rendered vision:
There are many excellent compilations of such modifiers, such as the one Unlimited Dream Co wrote for the VQGAN+CLIP approach:
Writing good VQGAN+CLIP prompts part one - basic prompts and style modifiers
We have to keep in mind that such modifiers differ from model to model, because each model is trained on its own labeled and prepared dataset. VQGAN, Midjourney, Disco Diffusion, Pytti, Imagen, and others each have their own knowledge.
In my series Prompt Design for DALL·E, I want to focus specifically on this model, which we are exploring in the official DALL·E Discord channel.
This will be a collection of articles with lists and examples of such modifiers, with continuously updated information.
Because we are still just scratching the surface of Artificial Creativity.