Prompting on Midjourney
Insights from a former teacher learning Frontend development
The latest and greatest in the technological landscape is Artificial Intelligence. If you’ve successfully navigated the hesitance towards this emerging technology and embraced its inevitability, chances are you’re already a ChatGPT user and have explored other AI technologies.
In this article, I’ll walk through my approach to writing prompts on Midjourney, based on their documentation and my own trial and error.
What is a prompt?
According to the Midjourney documentation:
“A Prompt is a short text phrase that the Midjourney Bot interprets to produce an image. The Midjourney Bot breaks down the words and phrases in a prompt into smaller pieces, called tokens, that can be compared to its training data and then used to generate an image.”
How to use Midjourney
Once you sign up on the Midjourney website, you are invited to the Midjourney Discord server. Read over the #rules, visit the #welcome channel and read the directions. You should also take a look in the #getting-started channel. Once you’ve accepted the terms of service, you’re ready to create. For your first requests, find a #newbies channel and, in the message bar at the bottom, type the command /imagine followed by your prompt. For example, if I want to generate an image of a certain infamous King of England, I could use the following prompt:
Prompt:
/imagine Henry VIII sitting on his throne photorealistic
Result:
Each prompt will generate 4 images with a default aspect ratio of 1:1.
What is my Process?
Less goes a long way with Midjourney’s prompts. There are even preset parameters that can be used to simplify your prompt. Here is how I set up a text prompt:
- Description of subject (e.g. Cleopatra, Henry VIII, etc.)
- Concise Details (e.g. walking along the Nile, sitting on his throne)
- Type of image desired (e.g. photograph, painting, art nouveau)
- Parameter (e.g. aspect ratio, anime style, version). Take a look at the extensive list of parameters for Midjourney for more.
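The four-part structure above can be sketched as a small helper that assembles the command text. This is only an illustration in Python: build_prompt is a hypothetical name, not part of any Midjourney tooling, and the resulting string still has to be typed or pasted into Discord by hand.

```python
def build_prompt(subject, details="", image_type="", **params):
    """Assemble an /imagine command from subject, details, type and parameters."""
    # Keep only the text parts that were actually provided.
    words = [part.strip() for part in (subject, details, image_type) if part.strip()]
    # Each keyword argument becomes a "--name value" parameter, e.g. ar="7:4" -> "--ar 7:4".
    flags = [f"--{name} {value}" for name, value in params.items()]
    return " ".join(["/imagine", *words, *flags])


print(build_prompt("Henry VIII", "sitting on his throne", "photorealistic"))
print(build_prompt("grey tabby cat", "working from home on a computer",
                   "photorealistic", ar="7:4"))
```

Keeping the parameters at the end mirrors the order of the list: description first, settings last.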
Text Prompts
For the next request, my subject is a grey tabby cat, details are working from home on a computer, and the type is photorealistic. This time I used a parameter for a widescreen aspect ratio of 7:4.
Prompt:
/imagine grey tabby cat working from home on a computer photorealistic --ar 7:4
Result:
When the results return, buttons under the images offer options for what to do next. ‘U’ stands for Upscale and ‘V’ stands for Variation. The numbers refer to the image number, starting with 1 at the top left and reading left to right, row by row. I selected ‘U2’, and shortly afterwards the Midjourney Bot generated a post with only image #2 at a larger scale.
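To keep the numbering straight, the mapping from a button number to a position in the grid can be written out. This sketch assumes the four images are arranged two per row; grid_position is a made-up helper name used only for illustration.

```python
def grid_position(image_number, columns=2):
    """Return (row, column) labels for an image number in the 4-image grid."""
    if not 1 <= image_number <= 4:
        raise ValueError("image_number must be between 1 and 4")
    # Numbering starts at 1 in the top left and runs left to right, row by row.
    row, col = divmod(image_number - 1, columns)
    return ("top", "bottom")[row], ("left", "right")[col]


for number in range(1, 5):
    print(f"U{number} / V{number} ->", grid_position(number))
```

So U2 acts on the top-right image and U3 on the bottom-left one.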
For the next prompt I went further with my generated results. This time the subject was Cleopatra, details were walking along the Nile, the style photorealistic, and finally with an aspect ratio of 7:4.
Prompt:
/imagine Cleopatra walking along the Nile photorealistic --ar 7:4
Result:
I decided to upscale image #1 and image #3 by clicking U1 and U3.
I decided to proceed with image #3 by clicking on Vary (Subtle). This will generate a new set of images with small differences.
Result:
I then decided to upscale image #1.
I took it further by Zooming out on the upscale.
When zooming out, Midjourney Bot fills in the frame around the image that is being zoomed out. I decided to upscale image #2 and image #4.
Then on the upscale of image #2, I clicked on the left arrow and the right arrow buttons under the upscale. This generated a new set of images panned to the left of the image and also generated a new set of images panned to the right.
Result:
I clicked left on the upscale of image #4 to see what could be beyond the border of the original image.
Result:
Picture Prompts
It’s also possible to create an image using other images. I use the same elements as a text prompt but with the URL of an image of choice at the beginning of the prompt.
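As a rough sketch of that ordering, the helper below puts the image URL first and the text after it. The function name image_prompt and the example.com address are placeholders I am assuming for illustration, not anything defined by Midjourney.

```python
def image_prompt(image_url, text, **params):
    """Build an /imagine command with the image URL placed before the text."""
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("expected a web address for the image")
    # Optional parameters such as ar="7:4" go at the very end, as in a text prompt.
    flags = [f"--{name} {value}" for name, value in params.items()]
    return " ".join(["/imagine", image_url, text.strip(), *flags])


# Placeholder URL, used only to show the shape of the command.
print(image_prompt("https://example.com/starry-night.jpg", "the Eiffel tower"))
```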
First I used this image of Vincent Van Gogh’s Starry Night and the text prompt “the Eiffel tower”…
Prompt:
/imagine https://en.wikipedia.org/wiki/The_Starry_Night#/media/File:Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg the Eiffel tower
Result:
I then decided to request strong variations of image #2 and image #3 by clicking on the V2 and V3 buttons below the images on the Discord post.
The strong variation of image #2 resulted in these new images:
The strong variation of image #3 resulted in these new images:
The results are not always perfect. Here, the AI-generated depiction of the Eiffel Tower varies in its surroundings from image to image. For another example of using an image within a prompt, I took this picture of the New York skyline featuring the Statue of Liberty and paired it with the text “Mickey Mouse”.
Result:
From here I decided to upscale image #3 by clicking on the U3 button under the images.
Then I wanted to see what would be generated if I did both a strong variation on the image and a subtle variation. I clicked the buttons and waited.
Observing the images firsthand reveals that while they depict Mickey Mouse statues reminiscent of the Statue of Liberty, they incorporate peculiar elements. Nonetheless, given that the creator was a computer program, the outcomes are quite commendable.
Result:
Errors and Bias
When assessing the generated images, you may notice issues such as breaks in object continuity, absent or surplus limbs, and occasional hand-related anomalies like missing or extra digits. Additionally, the diversity within the creations of the Midjourney Bot might be limited, and there could be a tendency for the AI-generated people to conform to stereotypes, affecting their attire and even the portrayal of their surroundings.
A Picture’s Worth 1,000 Words
The subsequent examples highlight distinctions and resemblances arising from employing an identical prompt with a single keyword alteration. The prompt utilized for these images centers on a scenario where a man and woman stand together. The specifics entail crafting a cover illustration for a romance novel in a fantasy style. In this context, I introduced a parameter for a 2:3 aspect ratio and opted to exclude any artificial titles by incorporating a parameter to exclude text.
Prompt:
/imagine romance novel cover art, man and woman standing together, fantasy --ar 2:3 --no text
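To make the two halves of this prompt easier to see, here is a small sketch that separates the descriptive text from the parameters. It simply splits on the double hyphen, which is my own assumption about how to read the string; split_prompt is a hypothetical helper, not how the Midjourney Bot itself parses input.

```python
def split_prompt(prompt):
    """Split an /imagine command into (text, parameters)."""
    if prompt.startswith("/imagine"):
        prompt = prompt[len("/imagine"):]
    # Everything before the first " --" is the description; the rest are parameters.
    text, *raw_params = prompt.strip().split(" --")
    params = {}
    for raw in raw_params:
        name, _, value = raw.strip().partition(" ")
        params[name] = value
    return text.strip(), params


text, params = split_prompt(
    "/imagine romance novel cover art, man and woman standing together, "
    "fantasy --ar 2:3 --no text"
)
print(text)
print(params)
```

Here the ar entry holds the 2:3 aspect ratio and the no entry holds the thing to exclude, in this case text.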
I also took turns adding in the keyword “ethnic” or “latin”.
- The first set of images (left) was generated from the original prompt “/imagine romance novel cover art, man and woman standing together, fantasy --ar 2:3 --no text”.
- The set in the middle was generated from “/imagine romance novel cover art, ethnic man and woman standing together, fantasy --ar 2:3 --no text”.
- The last set of images (right) was generated with the prompt “/imagine romance novel cover art, latin man and woman standing together, fantasy --ar 2:3 --no text”.
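Since the three prompts above differ by a single keyword, they can be generated from one template so that nothing else changes by accident. This is a minimal sketch; keyword_variants and TEMPLATE are names I made up for illustration.

```python
TEMPLATE = ("/imagine romance novel cover art, {subject} standing together, "
            "{style} --ar 2:3 --no text")


def keyword_variants(keywords, style):
    """Return one prompt per keyword; an empty keyword gives the original prompt."""
    prompts = []
    for keyword in keywords:
        # strip() removes the leading space left behind when the keyword is empty.
        subject = f"{keyword} man and woman".strip()
        prompts.append(TEMPLATE.format(subject=subject, style=style))
    return prompts


for prompt in keyword_variants(["", "ethnic", "latin"], "fantasy"):
    print(prompt)
```

Changing only one word at a time makes it easier to attribute differences in the images to that word.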
Result:
Below are the results of a second attempt with the same prompts as above. To be clear, these images did not result from a variation or from clicking the re-run button. I ran all three versions of the prompts again.
Result:
Using Different Keywords
The prompt used for the second set of images was mostly the same except I traded out the word “fantasy” for the words “photo realistic”.
Prompt:
/imagine romance novel cover art, man and woman standing together, photo realistic --ar 2:3 --no text
I once again took turns adding in the keyword “ethnic” or “latin”.
- The first set of images (left) was generated from the prompt “/imagine romance novel cover art, man and woman standing together, photo realistic --ar 2:3 --no text”.
- The set in the middle was generated from “/imagine romance novel cover art, ethnic man and woman standing together, photo realistic --ar 2:3 --no text”.
- The last set of images (right) was generated with the prompt “/imagine romance novel cover art, latin man and woman standing together, photo realistic --ar 2:3 --no text”.
Result:
I believe the images speak for themselves, underscoring the significance of wording. Given the outcomes, I opted to explore a different keyword modification. I used the original prompt again, once with “fantasy” and once with the words “photo realistic,” this time adding the keyword “diverse” to describe the man and woman.
Prompts:
/imagine romance novel cover art, diverse man and woman standing together, fantasy --ar 2:3 --no text
/imagine romance novel cover art, diverse man and woman standing together, photo realistic --ar 2:3 --no text
Result:
From my perspective, incorporating the term “diverse” yielded notably better outcomes in both categories. What do you think?