The Art of AI: Using ChatGPT and Midjourney to Generate Inspiring Visuals

Exploring the beauty of randomness.

5 min read · May 16, 2023

This image was generated by Midjourney, powered by ChatGPT prompt skills.

In this series of articles, I will showcase how ChatGPT and Midjourney can be integrated to automate the creation of stunning art. We’ll start by discussing the capabilities of different image generators. Next, we will explore how to automate the process of art generation, deploy the API to the cloud, and present the art on a Raspberry Pi with an E-ink display. This is the first article in the series.

First things first. Let’s start with the biggest clickbait of 2023, ChatGPT.

ChatGPT

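Using the OpenAI API is fairly straightforward. Here is a Node.js example:

// Uses the v3 "openai" npm package (Configuration / OpenAIApi).
const { Configuration, OpenAIApi } = require("openai");

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY, // read the key from the environment
});
const openai = new OpenAIApi(configuration);

// CommonJS has no top-level await, so wrap the call in an async function.
async function main() {
  const completion = await openai.createChatCompletion({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: "Hello world" }],
  });
  // Print the assistant's reply ({ role, content }).
  console.log(completion.data.choices[0].message);
}

main().catch(console.error);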

There is also an alternative to OpenAI: Google’s Bard. Its API was set to be released in May, but it is still not available as of this writing. I’ve tested both, and they perform similarly. For this particular task, we will use the GPT-3.5 model, which I will refer to as ChatGPT.

There are an infinite number of ways to generate prompts in ChatGPT. To start, I asked ChatGPT to generate a quote from a famous person, provide the name of the person, and give me a description of a painting or situation that fits the quote.
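As a rough sketch of that request, the snippet below builds the chat message and splits the reply into quote, author, and painting description. The instruction wording, the two-line reply format, and the helper names `buildQuoteMessages` and `parseQuoteReply` are assumptions for illustration, not the exact prompt used for this article. ChatGPT will not always follow the requested format, so the parser returns null when the reply does not match.

```javascript
// Hypothetical helper: the instruction text is illustrative wording,
// not the exact prompt used in the article.
function buildQuoteMessages() {
  return [
    {
      role: "user",
      content:
        "Give me a quote from a famous person. " +
        "Reply in exactly two lines:\n" +
        '"<quote>" - <name of the person>\n' +
        "Painting: <description of a painting or situation that fits the quote>",
    },
  ];
}

// Hypothetical parser for the assumed two-line format.
// Accepts straight or curly quotes and a hyphen, en dash, or em dash
// before the name. Returns null if the reply does not match.
function parseQuoteReply(reply) {
  const pattern =
    /^[\u201C"](.+?)[\u201D"]\s*[\u2014\u2013-]\s*(.+?)\s*\n+Painting:\s*([\s\S]+)$/;
  const match = reply.trim().match(pattern);
  if (!match) return null;
  const [, quote, author, painting] = match;
  return { quote, author, painting: painting.trim() };
}
```

The array returned by `buildQuoteMessages()` can be passed as the `messages` field of the `createChatCompletion` call shown earlier, and `parseQuoteReply` can be applied to the `content` of the returned message.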

Midjourney vs Stable Diffusion vs DALL-E

Let’s explore how different image generators respond to the prompts generated by ChatGPT.

“Believe in yourself and all that you are. Know that there is something inside you that is greater than any obstacle.” — Christian D. Larson
Painting: A runner stands at the starting line of a race, surrounded by other athletes. In the background, a looming obstacle course can be seen, complete with hurdles, walls, and other challenges. Despite the daunting task ahead, the runner looks confident and determined, knowing that they have the skills and inner strength to overcome any obstacle. The painting perfectly captures Larson’s message, reminding us that when we believe in ourselves and our abilities, there is nothing that can stand in our way.

Now let’s try a different prompt:

https://media.discordapp.net/attachments/1095281132133679177/1102312836505550908/alekslario_a_girl_standing_in_the_rain_dirty_cloth_and_homeless_37d16c9b-cdc6-4ebc-bd84-b6fecf75272e.png?width=405&height=405 Painting: A runner stands at the starting line of a race, surrounded by other athletes. In the background, a looming obstacle course can be seen, complete with hurdles, walls, and other challenges. Despite the daunting task ahead, the runner looks confident and determined, knowing that they have the skills and inner strength to overcome any obstacle. The painting perfectly captures Larson’s message, reminding us that when we believe in ourselves and our abilities, there is nothing that can stand in our way. Chiaroscuro and van gogh style, high contrast, painting classic, oil , bows, velvet vogue, dark shadows, iridescent, holographic, full body pose, transparent details, colorized, neon pastel color palette, photorealistic, highly detailed, transparent parts, looks like professional hdr photography in fashion magazine, editorial photography, global illumination, moody lighting, post-processing, glowing shadows, chromatic aberration, black and white balance, scattering, beautiful cinematic lighting, lumen reflections, post-production, high fashion, tone mapping, shaders, diffraction grading 3d, octane render, detailed, hyperrealistic, 16K

We concatenated additional text/keywords to influence the style of the image. Notice the URL at the start (it is the header image of this post). Both Stable Diffusion and Midjourney accept images in prompts. DALL-E does not.
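The concatenation itself is simple string joining. Here is a minimal sketch, assuming a hypothetical helper `buildMidjourneyPrompt` and a placeholder image URL. It puts the optional image URL first, then the description from ChatGPT, then the comma-separated style keywords.

```javascript
// Hypothetical helper: joins the optional reference image URL, the
// description, and the style keywords into one prompt string.
function buildMidjourneyPrompt({ imageUrl, description, styleKeywords = [] }) {
  // Drop empty keywords and join the rest with ", ".
  const styleText = styleKeywords
    .map((keyword) => keyword.trim())
    .filter(Boolean)
    .join(", ");
  // The image URL goes first; missing or empty parts are skipped.
  return [imageUrl, description.trim(), styleText].filter(Boolean).join(" ");
}

const prompt = buildMidjourneyPrompt({
  imageUrl: "https://example.com/reference.png", // placeholder URL
  description: "Painting: A runner stands at the starting line of a race.",
  styleKeywords: ["Chiaroscuro and van gogh style", "high contrast", "oil"],
});
console.log(prompt);
```

Keeping the keywords in an array makes it easy to swap style presets without touching the description that ChatGPT generated.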

Further reading: “An advanced guide to writing prompts for Midjourney (text-to-image)”, a detailed ‘cheat sheet’ and some keywords for improving image output by using better prompts (medium.com).

To my eye, Midjourney is clearly better, though I have to say that both Stable Diffusion and DALL-E can produce comparable pictures with the right prompt. Stable Diffusion has many options and adjustments, and its baseline is closer to the prompt. Both Stable Diffusion and DALL-E gave us what we asked for: a runner. DALL-E also added the hurdles that were mentioned in the prompt. Midjourney’s baseline, on the other hand, is more artistic, which is why we will use it for this project.

Midjourney offers a subscription model, but there is a free trial available.

However, there are some downsides to using the free trial. For example, generation may be slower than with a paid subscription. Additionally, free users can only use public channels, where other users can see their prompts and results. These public channels fill up quickly, which can make them difficult to use effectively. Going forward, I will assume you have a subscription, but it’s not that hard to adapt the code to the free trial either.

Lastly, Midjourney does not have an API. It operates solely through Discord, so the only way to automate the process is through scraping.

I will cover how to accomplish this in part 2 of the series: “How to Automate Midjourney Image Generation with Puppeteer” (web scraping Discord).