The Best Way to Generate Midjourney Prompts in GPT (what it can and can’t do)

Alex Tully
7 min read · Sep 24, 2023

There’s a lot of talk online about using GPT to generate prompts for Midjourney and other AI art generators. But the approach is almost always to use one complex set of instructions for GPT. I used this myself to make some images for work (and for some of the images in my head-to-head of Midjourney vs. Jasper Art). But although my GPT prompt blew out to over a page long, I was still having to heavily rework the output before running it in Midjourney. I figured there had to be a better way.

The solution is to go back to what used to be established wisdom in the AI world, but which has been drowned out since ChatGPT went viral: If you want AI to do something, the best way (if you can do it) is to give it examples of what you want.

To begin with, I wanted to see if I could use this approach to generate prompts for one specific genre. For this I chose landscape photography. I went through some of my Midjourney creations in this genre that appealed to me, copied and pasted the prompts into one big list, then augmented it with some more Midjourney prompts for landscape photos that I saw online and liked. Then I created a GPT prompt around that. After a bit of tweaking, I ended up with the prompt below (places where you need to fill in your own text are <IN ANGLE BRACKETS AND CAPITALISED, LIKE THIS>):

###Begin Example Midjourney Prompts###
<PASTE LIST OF MIDJOURNEY PROMPTS HERE>
###End Example Midjourney Prompts###

The above is a list of example prompts for the Midjourney AI art generator. Generate a long list of more similar prompts to generate AI art relevant to the following scenario:

###Begin Description of Scenario###
<TYPE DESCRIPTION OF DESIRED PHOTOS HERE>
###End Description of Scenario###

Do this by the following procedure:

1) Imagine a scene from the scenario
2) For each major element in the scene, list some emotions that a viewer might feel upon viewing the scene
3) Create a list of emotions that is the intersection of all the lists in #2
4) For each emotion on the list in #3, list some photography or artistic techniques to generate that emotion
5) Choose a coherent selection of techniques from #4
6) Construct a Midjourney prompt from the results of #1 and #5

Don’t limit yourself to the exact vocabulary used in the provided examples, but mimic the style of prompt construction.

If you want to specify that the image should lack <NP>, then the only way you should do this is by appending the prompts with --no <NP>

If you describe humans or things built by humans, then be sure to include relevant information in your prompts that describes the technological level of humans at that time.

If your prompt refers to a real-life location that is dramatically changed in the scenario compared to present-day Earth (e.g. it was a coastline but is now inland, or it was covered by ice sheets which have gone), then be sure to make the prompt reflect this by including keywords, or by appending information after --no

If the prompt includes pairs or triplets of words that are very commonly used together as a single phrase, then link the words with an underscore rather than a space or a hyphen e.g.

saturated colours -> saturated_colours
tree ferns -> tree_ferns

For each prompt, walk through the stages of construction described above, and then explain how it is relevant to the provided scenario, and also what you think would be good about the image resulting from your choice of vocabulary in your prompt.
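
For anyone who would rather fill in the template with a script than by hand, here is a minimal Python sketch. The function name `build_gpt_prompt`, the two example prompts and the shortened instruction text are all hypothetical stand-ins; the only things taken from the template above are the delimiter lines and the task sentence.

```python
def build_gpt_prompt(example_prompts, scenario, instructions):
    """Assemble the GPT prompt: example list, task sentence, scenario, then the rules."""
    # Drop blank entries and stray whitespace so each example sits on its own line.
    examples_block = "\n".join(p.strip() for p in example_prompts if p.strip())
    return (
        "###Begin Example Midjourney Prompts###\n"
        f"{examples_block}\n"
        "###End Example Midjourney Prompts###\n\n"
        "The above is a list of example prompts for the Midjourney AI art generator. "
        "Generate a long list of more similar prompts to generate AI art relevant "
        "to the following scenario:\n\n"
        "###Begin Description of Scenario###\n"
        f"{scenario.strip()}\n"
        "###End Description of Scenario###\n\n"
        f"{instructions.strip()}\n"
    )


# Hypothetical inputs, purely for illustration.
examples = [
    "tropical beach at dawn, long exposure, soft light",
    "mountain lake, wide angle, mist, muted colours",
]
rules = "Do this by the following procedure:\n\n1) Imagine a scene from the scenario"
print(build_gpt_prompt(examples, "Koh Mak", rules))
```

Keeping the examples, scenario and rules as separate arguments means the scenario can be swapped out without touching the rest of the prompt.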

Getting an LLM to reason through several steps is a well-known prompt engineering technique to improve the quality of outputs, called Chain of Thought. It also parallels what computer programmers do when they make programs generate a log of every step that they work through, in order to locate bugs. Since generative AI is prone to hallucinations, I hoped to catch them with a similar technique.

And by the way, the reason for putting underscores between some words in prompts is explained here.
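
The underscore rule can also be applied mechanically after GPT has written a prompt. The following is a rough sketch, not something the workflow above relies on: the helper name `link_phrases` is made up, and the list of phrases to link is an assumption you would have to supply yourself.

```python
import re


def link_phrases(prompt, phrases):
    """Join each listed multi-word phrase with underscores wherever it appears."""
    # Handle longer phrases first so a short phrase cannot split a longer one.
    for phrase in sorted(phrases, key=len, reverse=True):
        words = phrase.split()
        # Match the words separated by spaces or hyphens, on word boundaries.
        pattern = r"\b" + r"[\s-]+".join(re.escape(w) for w in words) + r"\b"
        prompt = re.sub(
            pattern,
            lambda m: re.sub(r"[\s-]+", "_", m.group(0)),  # keeps original casing
            prompt,
            flags=re.IGNORECASE,
        )
    return prompt


print(link_phrases("lush tree ferns, saturated colours, misty",
                   ["saturated colours", "tree ferns"]))
# prints: lush tree_ferns, saturated_colours, misty
```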

For the first scenario, I just typed in “Koh Mak”. Here’s the start of GPT’s output (I used GPT-4):

Screenshot of GPT-4 Output

GPT gave back 2 more prompts, but I only ran the first.

Made in Midjourney

Reminds me of when I was there earlier this year before Thailand’s rainy season kicked in.

What about a fictional landscape? I deleted “Koh Mak” in the prompt and replaced it with “J.G. Ballard’s The Drowned World”.

Screenshot of GPT-4 Output

Again it gave me a few more prompts. Running the first one (screenshotted above) yielded:

Made in Midjourney

No reptiles, but I suppose I could inpaint them in if I want.

How would it do with science-fiction set on other planets? I typed in “Sentenced to Prism”. Here’s the beginning of GPT’s output:

Screenshot of GPT-4 Output

Running the first prompt yielded:

Made in Midjourney

The top right one looks good to me. Some of GPT’s other prompts also looked like they could yield nice pictures. But for the purposes of this post, I just always ran the first prompt GPT suggested.

Time to expand to other genres of photography. I beefed up the list of prompts with prompts for other types of photos. It was getting a bit long and I was worried about GPT’s token limit, so I saved the list of prompts as a PDF and imported it with the Ask Your PDF plug-in.

If you don’t know what this is: It’s a nifty tool that lets GPT access the contents of a PDF file. It works as follows:

  1. You upload a PDF file to their site.
  2. Once you’ve done that, the site gives you an ID number.
  3. You can then paste that ID into a GPT-4 prompt (unfortunately plug-ins don’t work with the free version).
  4. GPT-4 can now look through the PDF file in order to compose its answer.
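
As for the token-limit worry that prompted the switch to a PDF, a rough size check can tell you whether the example list is likely to be a problem. This sketch assumes the common rule of thumb of about four characters per token for English text, which is only an approximation (the real count depends on the tokenizer). The function names and the budget value are hypothetical, so substitute whatever limit applies to the model you are using.

```python
import math


def rough_token_count(text, chars_per_token=4.0):
    """Estimate tokens from character count (rule of thumb, not an exact count)."""
    return math.ceil(len(text) / chars_per_token)


def fits_budget(text, token_budget):
    """True if the rough estimate is within the given budget."""
    return rough_token_count(text) <= token_budget


# Hypothetical example list and budget, purely for illustration.
example_list = "\n".join(["mountain lake, wide angle, mist, muted colours"] * 200)
print(rough_token_count(example_list), fits_budget(example_list, token_budget=3000))
```

If the estimate comes out close to the budget, trimming the example list or moving it into a file is the safer choice, since the scenario, the instructions and GPT's own answer all need room too.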

Anyway, in my own GPT-4 prompt I replaced the list of example prompts with:

<INSERT ID OF UPLOADED PDF FILE HERE>
Access the above through the Ask Your PDF plug-in. It is a list of example prompts for the Midjourney AI art generator.

I changed the scenario to: “Slaaneshi worshippers from Warhammer 40k time travel to present-day Earth and found a cult”. GPT worked in the same way as before:

Screenshot of GPT-4 Output

Running the prompt yielded:

Made in Midjourney

Time for something analogous from the real world. I changed the scenario to “urban decay and the heroin epidemic in Footscray”.

Screenshot of GPT-4 Output

Running the prompt yielded:

Made in Midjourney

I also got good results setting the scenario to “Tsqaltubo” (The Wikipedia article doesn’t do this place justice. The YouTuber Bald and Bankrupt put it on my bucket list with this video, and I ticked it off in 2022):

Screenshot of GPT-4 Output

Running the prompt yielded:

Made in Midjourney

Some of these look better preserved than the real Tsqaltubo, but I’m happy with the results.

My last scenario was “Steve Jobs at Kokedera”:

Screenshot of GPT-4 Output
Made in Midjourney

Some of these would need inpainting to fix glitches in the hands, but the settings looked nice.

Anyway, this marks the limit of what GPT could do. Once I started feeding more complicated scenarios into the prompt, it started returning Midjourney prompts that were sometimes inconsistent with the scenario. So while this GPT technique might be good for whipping things up quickly, it’s not going to produce works of art.

Other Notes:

  • I only explored photography in this. But it should be simple to adapt this to other kinds of art, with just a few easy tweaks.
  • If you aren’t paying for GPT-4, then you can’t import PDF files. But you could probably get good results by pasting a varied mix of example Midjourney prompts directly into the prompt.
  • All the images were generated with my Midjourney defaults, which were a 1:1 aspect ratio, --style raw and --stylize 50. I suggest reading up on these if you don’t know what they are.
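
If those settings are not saved as your defaults, the same values can be appended to each prompt as parameters. Midjourney parameters are written with two plain hyphens (`--ar`, `--style`, `--stylize`, `--no`). The helper below is a minimal sketch: the name `with_parameters` and the sample prompt are made up for illustration.

```python
def with_parameters(prompt, aspect="1:1", style="raw", stylize=50, exclude=None):
    """Append Midjourney parameters to a prompt; defaults mirror the settings above."""
    parts = [
        prompt.strip(),
        f"--ar {aspect}",
        f"--style {style}",
        f"--stylize {stylize}",
    ]
    if exclude:
        # --no takes a comma-separated list of things to keep out of the image.
        parts.append("--no " + ", ".join(exclude))
    return " ".join(parts)


print(with_parameters("misty beach at dawn", exclude=["people", "boats"]))
# prints: misty beach at dawn --ar 1:1 --style raw --stylize 50 --no people, boats
```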

Good luck using it!


Alex Tully

Into Generative AI, but 100% Human-Written Blog (every word)・Bachelor’s in Maths・Master’s in Linguistics (@ANU 🇦🇺 )・Taught myself 🇯🇵 and 🇹🇭・Digital Nomad