The 5-Step Midjourney Method (and the inpainting hack I stumbled on while using it)

Alex Tully
10 min read · Oct 10, 2023


It’s not how most people use Midjourney, but I’ve settled on the following 5-step approach to making AI art:

  1. Generate an “almost blank canvas”. What kind of lighting will there be? Where will it come from? What style is the image? (colour schemes, angle, photo vs. painting vs. illustration, etc.)
  2. Specify the “bare bones” of the image. Where is there sky? Where is there land, or other large features that occupy much of the piece? Maybe you can generate these in tandem with #1; if not, insert them with Vary (Region).
  3. Inpaint in whatever will be on the bare bones of the image. Vegetation? Asphalt? Ice?
  4. Inpaint in the smaller details.
  5. If necessary, inpaint over surfaces that will have reflections or shadows of what you added in during the previous stages.

This does not necessarily correspond to 5 inpaints. Sometimes it might be possible to clear two stages in a single rendering. On the flip side, sometimes you might need multiple inpaints to clear a single stage. The important thing is to follow the above order. Doing so provides the following advantages:

  • It lets you get more elements into your picture, by splitting those elements between multiple prompts. This is advantageous because the longer you make a single Midjourney prompt, the more likely it is to start ignoring some of what you inputted. You can minimize this problem by strategically inserting underscores between words in your prompts (please see here), but that only increases the limit on what you can put into a Midjourney prompt. It does not eliminate it.
  • You’ll gain greater control over the style of image you create (colours, lighting etc.). This is because you can use more keywords to specify these things (and have Midjourney recognise them) since keywords for physical elements of your piece will be clogging up less “bandwidth” in each prompt.
  • It lets you specify exactly where those physical elements are in your piece.
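To make the staging concrete, here’s a rough sketch of how I think of the prompts fitting together, written as a little Python helper. The helper itself is purely illustrative; the flags (--no, --s, --style, --ar) are real Midjourney parameters, and the underscores are the keyword-gluing trick mentioned above.

```python
# Illustrative sketch only: each stage swaps in a new subject while the style
# "backbone" and the flags stay the same, so no single prompt gets overloaded.

STYLE = ("sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II "
         "with high_horizon and desaturated_cold_tones")                 # shared style keywords
FLAGS = "--no snow, ice, icecap, glaciers --s 50 --style raw --ar 16:9"  # shared parameters

def stage_prompt(subject: str) -> str:
    """One stage of the method = a new subject + the same style backbone + the same flags."""
    return f"{subject} {STYLE} {FLAGS}"

# Stages 1-2: the "almost blank canvas" with the bare bones already in it
print(stage_prompt("details of Transantarctic_Mountains"))

# Stage 3: what covers the bare bones, inpainted with Vary (Region)
print(stage_prompt("wild subtropical_cloudforest with tree_ferns and palm_trees "
                   "on Transantarctic_Mountains"))

# Stages 4-5 follow the same pattern: smaller details, then shadows/reflections.
```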

Anyway, I’ll give an example of a piece I created using this technique, and a surprising discovery I made about inpainting in the very last stage.

I wanted to create another post-apocalyptic piece inspired by J.G. Ballard’s The Drowned World. I’ll summarise the book in one sentence in case you haven’t read it: The sun becomes brighter and dramatically changes Earth’s geography.

The book was set a generation after the change, but what about in the far future? Eventually the polar regions would cease to be rocky wastelands and would come to life. What would they look like?

Now Antarctica sports a mountain range comparable to the Andes. Known as the Transantarctic Mountains, they look spectacular poking out of the ice cap, and I envisioned that they’d look pretty good in a warmer climate too.

Screenshot of Google Images search results

I used Build Your Own Earth to model the climate, and it returned a curious prediction: a warmer Antarctica would sit under a permanent temperature inversion. That means the air would get warmer as you go up the mountain slopes, trapping any cloud below the peaks and producing very photogenic patterns when viewed from above.

Screenshot of Google Images search results

Build Your Own Earth also predicted that, all year round, this continent would have an extremely humid climate, and receive rainfall comparable to what tropical areas get in their monsoon seasons. So I wanted subtropical jungle, and also rice terraces on the mountains, since there’d be plenty of rain to satisfy even these thirsty plants.

Lastly, I wanted an underground city dug into the mountains, among the rice terraces. Build Your Own Earth was also predicting lots of directional wind shear; that, plus the humidity, plus the Coriolis force (strongest at the poles), would spawn some pretty epic supercell thunderstorms and tornadoes whenever anything broke the temperature inversion and released all the pent-up energy inside. Hence the appeal of living underground.

Next I needed to think about the styles to generate the image in, and the moods I could evoke. At such a high latitude the sunlight would have to come in at a low angle. But beyond that I had the equivalent of writer’s block. How to resolve it?

Previously I blogged about the GPT prompt I used to generate Midjourney prompts, and I typed in my scenario to get some ideas. For complex scenarios like this one you can’t just copy and paste the results from that GPT prompt and run them in Midjourney. But GPT’s output did suggest one idea that appealed to me (as well as many that didn’t): I could create a wild, rugged mood using cold, desaturated colours, plus a wide-angle lens to capture a lot of detail on the mountains and really bring out the ruggedness.

That was enough to break through the “prompt writer’s block”, and I figured out the rest of the details quickly. I wanted the mountains to be sidelit, the rice terraces to face the sun, the mountains to cast shadows on the fog inversion, a high horizon, a landscape aspect ratio (since the focus was on the landscape), and a vantage point from up high. Here’s what I got:

Title: Cirque Jungle Tunnels (Made in Midjourney) https://www.instagram.com/p/CyNcBqarEzl/

To make my “almost blank canvas”, I prompted: details of Transantarctic_Mountains sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II with high_horizon and desaturated_cold_tones --no snow, ice, icecap, glaciers --s 50 --style raw --ar 16:9 (if you don’t know about underscores in Midjourney, then I highly suggest reading this post)
And I selected the image below:

Made in Midjourney

Midjourney had done an excellent job of stripping off the snow while keeping the mountains nice and sharp. I found that they looked less impressive if I put extra details into the prompt.

I would have inpainted in a cloud inversion next, but fortuitously this picture already had one, as did a few others that I didn’t select.

So now that I had the “bare bones” of my image, it was time to add the vegetation. In Vary (Region) I selected a portion of the mountains, but was careful to leave enough of the edges unselected that the bot would remake the jagged mountains when inpainting (a principle I discovered when making this piece). Anyway I typed: wild subtropical_cloudforest with tree_ferns and palm_trees on Transantarctic_Mountains sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II with high_horizon and desaturated_cold_tones --no snow, ice, icecap, glaciers --s 50 --style raw --ar 16:9
And got:

Made in Midjourney

There were a lot of bare slopes, so I selected those for inpainting with the same prompt, getting:

Made in Midjourney

Some of the fog on the left was a bit too thin for my liking, but that was where the sun was coming in from, and where I planned to put in my rice terraces. To get them I used the trick of generating a separate image to add to the inpainting prompt (first discussed here). The prompt for this was: rice_terraces on Transantarctic_Mountains towering out of cloud_inversion sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II with high_horizon and desaturated_cold_tones --no snow, ice, icecap, glaciers --s 50 --style raw. I kept this one:

Made in Midjourney

Then going back to my main image, I clicked Vary (Region), selected the mountain slopes to the left of the main ridge going up the centre of the page, and prompted: <URL of Image of Rice Terraces>:: rice_terraces on the Transantarctic_Mountains towering out of a cloud_inversion sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II with high_horizon and desaturated_cold_tones --no snow, ice, icecap, glaciers --s 50 --style raw --ar 16:9
I selected this image from the results:

Made in Midjourney
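As an aside, here’s the anatomy of that image-prompted inpaint spelled out, since it can be hard to parse at a glance. This is just a sketch; the placeholder stands for the URL of whichever reference image you generated separately.

```python
# Sketch of how the combined inpainting prompt is assembled: the reference image
# URL goes first, then "::", then the normal text prompt and flags.
reference_image = "<URL of Image of Rice Terraces>"   # placeholder: paste the real image address here

text = ("rice_terraces on the Transantarctic_Mountains towering out of a cloud_inversion "
        "sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II "
        "with high_horizon and desaturated_cold_tones")
flags = "--no snow, ice, icecap, glaciers --s 50 --style raw --ar 16:9"

print(f"{reference_image}:: {text} {flags}")   # paste the result into Vary (Region)
```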

The next stage was the underground city. I wanted it to have the rice terraces both above it and below it, and I knew Midjourney would need guidance with an image prompt, and not just any image prompt, but a specially crafted one. The first stage of generating this was to prompt: cavates in a cliff_face like Uplistsikhe or the Anasazi cliff_dwellings, sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II with high_horizon and desaturated_cold_tones --ar 16:9 --s 50 --style raw --no railings

I kept this image:

Made in Midjourney

In order to make the image integrate smoothly into my main piece, I wanted rice terraces above and below the cliff. I got these by going into Vary (Region), selecting the land above and below the cliff, and prompting: rice_terraces, sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II with high_horizon and desaturated_cold_tones --s 50 --style raw --no railings --ar 16:9

I did two iterations (like what I’d done to inpaint the forest onto the mountains), and ended up with:

Made in Midjourney

Going back to my main image, I noticed that there was a cliff separating two areas of rice terraces.

I selected an area around this for inpainting, and prompted: <URL of Above Image of Cliff City>:: cavates in a cliff_face like Uplistsikhe or the Anasazi cliff_dwellings, sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II with high_horizon and desaturated_cold_tones --s 50 --style raw --no railings --ar 16:9
This yielded:

Made in Midjourney

From here I did a few small inpaints to touch it up, which went well except for one problem: I couldn’t make the underground city nicer. Inpainting always either left it as it was, or erased the city entirely.

Now if you look at all my prompts, you can see that I set the --s parameter to 50. If you haven’t read Midjourney’s explanation of it, then I highly recommend you do so here. I settled on a low value by default because I feel like Midjourney takes too many liberties otherwise.

However, for this piece, I decided to see what would happen if I used a high --s value in inpainting. I had a hunch that, even if I did this, Midjourney would keep the inpainted portion similar to the surrounding image (painted with a low --s parameter).
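To spell out what I was testing, here’s a sketch. In the actual inpaint below I also threw in limestone_columns as a new keyword, but the key change is the --s value (per Midjourney’s docs, --s defaults to 100 and accepts values from 0 to 1000).

```python
# Same selection, same subject and style keywords; the deliberate change is --s.
base = ("cavates in a cliff_face like Uplistsikhe or the Anasazi cliff_dwellings, "
        "sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II "
        "with high_horizon and desaturated_cold_tones --style raw --no railings --ar 16:9")

usual_inpaint = f"{base} --s 50"    # my default: sticks close to the prompt and the surroundings
hack_inpaint  = f"{base} --s 750"   # the experiment: let Midjourney take liberties inside the selection
```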

I selected parts of the underground city entrances in Vary (Region) and prompted: limestone_columns, cavates in a cliff_face like Uplistsikhe or the Anasazi cliff_dwellings, sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II with high_horizon and desaturated_cold_tones --s 750 --style raw --no railings --ar 16:9

Success! The inpainting had made big changes to all 4 images.

Made in Midjourney

With a bit of tidying up I had the below:

Made in Midjourney

It was finally time for stage #5 of the procedure I outlined above. Now that I’d plonked in all the objects that could cast a shadow, I could inpaint in the shadows they would cast onto the fog layer. It wouldn’t have worked to make the shadows first, because then they wouldn’t have matched what was casting them.

I selected an area of the cloud inversion including where I thought shadows would fall, and prompted: Transantarctic_mountains casting shadows on a cloud_inversion of dense_fog, sidelit by low_angle_sunlight, wide_angle aerial_photograph by Sony A7R II with high_horizon and desaturated_cold_tones --s 750 --style raw --no snow, ice, glacier, icecap

I liked this image:

Made in Midjourney

Except for one thing: The shadows created an area of trapped space on the right.

Made in Midjourney

I solved this by clicking Pan (Right) with the same prompt, getting this super wide image with a 32:9 aspect ratio!

Made in Midjourney

(You might worry about losing resolution, but Midjourney keeps the original resolution with the Pan function, unlike with the Zoom functions.)

I then cropped it down to a 16:9 aspect ratio, basically keeping everything on the left, but sliding it over just enough that the trapped space now connected to the rest of the unshadowed fog.
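If you’d rather do that crop programmatically than eyeball it in an image editor, here’s a minimal sketch using Pillow. The filenames and the slide amount are placeholders; the idea is just to keep the full height and slide a 16:9 window in slightly from the left edge.

```python
from PIL import Image  # pip install Pillow

# Crop the 32:9 panned image down to 16:9: keep the full height, start near the
# left edge, and slide the window right just enough to free the trapped fog.
img = Image.open("panned_32x9.png")            # placeholder filename
width, height = img.size

crop_width = height * 16 // 9                  # width of a 16:9 window at full height
slide = 200                                    # pixels to nudge rightwards; tune by eye
left = min(slide, width - crop_width)          # stay inside the image

cropped = img.crop((left, 0, left + crop_width, height))
cropped.save("final_16x9.png")                 # placeholder filename
```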

Title: Cirque Jungle Tunnels (made in Midjourney) https://www.instagram.com/p/CyNcBqarEzl/

And that’s the final product I created with this method. Good luck if you give it a try.
