Asking Computers to Draw Pictures
But is it art? Also a brief explanation of how to use Midjourney
Generative AI has taken the world by storm. The hype cycle is in full effect and while I’m more than a little sceptical of the claims of world-changing long-term benefits or the panic about AI killing all humans, it looks like we’re going to be cresting that wave for a while yet.
Text generation tools like ChatGPT and Bard have (temporarily?) revolutionised how students write essays, how teachers write case studies and how the general population makes reports read ‘more professional’. Indeed, the temptation to run this article through ChatGPT to ‘tidy it up’ is quite strong despite my own default-man confidence in my writing (sincere thanks to the entirely human Jenn for editing this). I have been critical of the lack of second-order thinking in Large Language Models in the past but whatever your view on the creative limits or the ethics of these tools it is hard to deny that they are interesting and fun and truly useful in controlled situations.
For me, AI image generation is the peak of fun and interesting. It might also be the most ethically problematic, with copyright issues and lawsuits popping up all over the place, but as a person with limited artistic ability I also find it the most mind-blowing. I can ask for a painting, a drawing, a photograph, or a 3D render of almost anything my most outside brain can conjure up and a computer will do a remarkable job of interpreting my request and creating something suitable.
Having tried a handful of tools, I think it is clear that Midjourney is the best in class of publicly available AI image generators. Its output consistently passes the ‘oh wow’ test, its hands most often have four fingers and a thumb, and its arms and legs usually only bend at the appropriate joints. I’ve used Midjourney to generate headline pictures for several articles I’ve written, as a way of quickly storyboarding ideas, and for generating images for a lot of jokes. To be honest, it’s mostly jokes. Watching the stream of Midjourney output go by, I’m struck by how serious most requests seem to be and how few of them are clearly for presentation slides. I’ve often wondered how many ChatGPT prompts are people generating content for presentations on AI where the twist is that ChatGPT has generated all the content. I guess Midjourney users are more focused on true art.
This brings me to my last point. I think part of the reason Midjourney’s output is so consistently incredible is that most of us don’t look at things properly. I surprised myself recently by spending considerably longer in an art gallery than I ever have before. I suddenly found myself enjoying just looking at things in a really considered way. I saw things I wouldn’t usually see. For someone with a generally low attention span this was a total revelation. I read all the little plaques and I really enjoyed it. Imagine a gallery where all the plaques just had the image generation prompts on them instead of the back stories of the artists or tales about why they had created this piece.
I once worked with a digital artist who couldn’t understand why Aardman were still making clay models to film Wallace and Gromit. “What a waste of time,” he would mutter under his breath as people enthused about The Wrong Trousers, “they could have just rendered this and it would have taken a tenth of the time.” At its simplest, image generation works by starting with random noise and refining it, step by step, until that noise begins to look like it could be an interpretation of the prompt. As far as I am aware, few human artists work like this, splodging random lines on a page until it begins to look like a vase of sunflowers or a girl with a pearl earring or a fantastical glade with high contrast lighting. Art is the output of an artist. Although, I guess, when your output is built on the lives, experiences, backgrounds and souls of a billion existing artists maybe that in and of itself is art?
How can you use Midjourney?
You can generate images with Midjourney in a limited way for free. There are some things you need to know though.
Access to Midjourney is entirely through Discord. I’ll be honest, when you go through the process of signing up to Midjourney it doesn’t feel like you’re doing something legit. You will need a Discord account, then you use the Midjourney website to get an invite to their server. Once you’re in, it’s absolutely overwhelming. Discord does both text and voice chat, and the Midjourney server has a huge number of channels you can participate in.
When you are ready to generate an image you need to hop into one of the General Image Gen channels. When you join the channel you will see countless requests and images race past. This highlights an extremely important thing to remember: unless you are willing to pay $60 a month all your requests are going to be publicly visible to everyone else on the server.
In order to make your own request you type /imagine and then the thing that you want to see. There are so many resources on how to write “good” prompts on the internet but Midjourney is really good at inferring what you mean so you generally will see great results without adding lots of info about art style, camera type or influencing artists. Because so much is public you can also get great tips by just looking at what prompts other users have used.
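If you do decide to add detail, it can help to keep the pieces of a prompt separate. The following is a minimal Python sketch, not anything Midjourney provides: the function name `build_prompt` and its arguments are hypothetical, and `--ar` is Midjourney's aspect-ratio parameter, which this article does not otherwise cover. It only assembles the text; you still type /imagine in Discord and paste the result into the prompt field.

```python
def build_prompt(subject, style=None, aspect_ratio=None):
    """Assemble prompt text to paste after /imagine in Discord.

    Hypothetical helper for illustration only. It does not talk to
    Midjourney or Discord; it just joins strings.
    """
    # Start with the thing you want to see.
    parts = [subject.strip()]

    # Optional extra description (art style, lighting and so on).
    if style:
        parts.append(style.strip())

    prompt = ", ".join(parts)

    # Assumption: "--ar W:H" is appended at the end to set the aspect ratio.
    if aspect_ratio:
        prompt += f" --ar {aspect_ratio}"

    return prompt


if __name__ == "__main__":
    # Usually the plain subject is enough.
    print(build_prompt("a vase of sunflowers"))
    # With some optional detail added.
    print(build_prompt("a fantastical glade",
                       style="high contrast lighting",
                       aspect_ratio="16:9"))
```

The first call returns the subject unchanged, which matches the advice above: start simple and only add detail if the results are not what you wanted.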
Once you’ve sent your request you wait while the Midjourney Bot generates your images. The bot generates four images at a time. You may well be stuck in a queue that takes quite a long time to process, so you can leave and come back later if you want to. Paying for a subscription gives you faster renders, so if you’re impatient you can throw money at the problem. The bot will tag you when it generates your images so you’ll be able to look back and see what it has dreamt up. You’ll also see four buttons labelled U1 to U4 and four labelled V1 to V4, where the number refers to one of the four images. The U buttons upscale that image, giving you a bigger, higher-resolution version. The V buttons take that image and generate four slightly different versions of it.
And that’s it. Once you’re over the initial hump of getting signed up it’s actually pretty easy to use. Please let us know what you create!
This was originally written for the Waterstons Innovation Substack.