Play “Hovering Art Director” with “Artificial Interns”

Experimenting with AI-powered image tools Lexica and Runway ML’s Custom Generator

Jayne Vidheecharoen
5 min read · Mar 11, 2023
Generated with the prompt “cut paper isometric illustrations of internet artificial intelligence network pastel colors.”

There’s been a lot of chatter lately about AI tools¹, which produce mind-blowing results, plenty of controversy, and some unsettling conversations². After playing around with a few of the AI creation tools out there, a few observations came to mind:

  1. Using “Artificial Intelligence” is more like having your own “Artificial Interns.”
  2. AI tools often generate things that “look good” at first glance but still get important details wrong.
  3. The models still don’t quite understand my style, but maybe I just haven’t learned how to “mentor” them effectively yet.

Artificial Interns

Getting a good output depends on giving the model a well-crafted prompt and example images, similar to how you might need to create a very well-defined creative brief for an intern or junior designer.

And AI is good at the type of work that might have been delegated to an intern in the past: proofreading, summarizing documents, transcribing a recording, removing the background of a video, cleaning data, generating lots of iterations, etc.

But it requires you to be a bit of a hovering art director, and the output still needs a lot of tweaking and guidance before you get what you’re looking for.

Style is what you get wrong.

But sometimes, the model randomly produces happy accidents, or at least some funny ones.

Perhaps, as a tool, AI’s strength is offering an enormous “accident space”: a dimension of plausible, potential variations on an intention. — David OReilly

At a glance, the images I generated below using Lexica are pretty impressive, but when you look at them more closely, there are funky details. For instance, the face looks pretty good, but the model doesn’t quite understand hands and keyboards, or how much physical space things should take up.

Generated with the prompt “anime girl at the computer listening to music cute aesthetic pastel colors.” I’ve noticed many of the images it generates have this glossy and glowy aesthetic.

And since I’m always curious how these tools could be used in urban planning, here are a few examples of what the model thinks a “walkable pedestrian-oriented downtown corridor with bike lanes, wide sidewalks, shade trees, and a variety of storefronts” might look like.

Again, it looks pretty nice at a glance, but upon closer inspection, it clearly doesn’t understand how people ride bikes, how crosswalks work, or the difference between trees and lamp posts.

Generated with the prompt “walkable pedestrian-oriented downtown corridor with bike lanes, wide sidewalks, shade trees, and a variety of storefronts.”

I found this little explanation of what ChatGPT is doing, and how it works, interesting:

if we always pick the highest-ranked word, we’ll typically get a very “flat” essay, that never seems to “show any creativity” (and even sometimes repeats word for word). But if sometimes (at random) we pick lower-ranked words, we get a “more interesting” essay.
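Out of curiosity, here’s a tiny sketch of that idea, usually called temperature sampling, in Python. It’s not ChatGPT’s actual code, and the candidate words and scores are made up, but it shows how raising the temperature lets lower-ranked words sneak in:

```python
import numpy as np

def sample_next_word(words, scores, temperature=1.0):
    """Pick the next word from ranked candidates.

    A temperature near 0 almost always takes the top-ranked word (the
    "flat" essay); higher values let lower-ranked words sneak in (the
    "more interesting" essay).
    """
    logits = np.asarray(scores, dtype=float) / max(temperature, 1e-6)
    probs = np.exp(logits - logits.max())  # softmax, numerically stable
    probs /= probs.sum()
    return np.random.default_rng().choice(words, p=probs)

# Made-up candidate words with model scores (higher = more likely).
words = ["the", "a", "an", "my", "our"]
scores = [4.0, 3.0, 2.0, 1.0, 0.5]

print(sample_next_word(words, scores, temperature=0.01))  # nearly always "the"
print(sample_next_word(words, scores, temperature=1.5))   # sometimes surprising
```

Always picking the highest-ranked word is just the temperature-goes-to-zero limit; the “creativity” is the randomness, tuned.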

Essentially, to make the model’s output more creative, it has to be a little bit less perfect. So the tricky part is crafting the imperfections. It reminded me of this quote from Neil Gaiman:

Style is what you get wrong, that makes what you do sound like you. Style is what you can’t help doing. Style is what you’re left with.

In the style of Jayne’s Comics

Runway ML has a feature where you can train your own custom models for $10, so I used my comics as the training material. I uploaded 50 of my comics and pushed a button to train. After about 30 minutes, it was ready to go. Then, I typed in a prompt with the custom keyword and got the images back:

Sort of worked?
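Runway’s trainer is a hosted black box, but the same “train on your images, then invoke a custom keyword” flow exists in open tooling, such as DreamBooth-style fine-tuning of Stable Diffusion. Here’s a minimal sketch of the inference side using Hugging Face’s diffusers library; it assumes a checkpoint has already been fine-tuned, and the local path and the “jaynecomics” keyword are made up:

```python
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical: a Stable Diffusion checkpoint fine-tuned (e.g. with
# DreamBooth) on ~50 comics and bound to the made-up keyword "jaynecomics".
pipe = StableDiffusionPipeline.from_pretrained(
    "./my-comics-finetune",        # assumed output dir of the fine-tune
    torch_dtype=torch.float16,
).to("cuda")

# Invoke the custom keyword in the prompt, like Runway's custom generator.
image = pipe("a jaynecomics comic about a hovering art director").images[0]
image.save("generated_comic.png")
```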

It’s not coherent, but it almost makes sense. And to be fair, I don’t think this is the use case it’s designed for. It seems to work better for things like photos, objects, portraits, digital painting styles, etc. I think better “mentorship” could make the output more legible.

But in some ways, it’s reassuring that I’m not completely replaceable by a simple image generation model… yet. 😅

As a reminder, you cannot be commodified by machine, because you are a breathing physical being of unimaginable complexity. You exist in a circumstance that has never occurred before, that you can make anything of, and you are changing in every way — and so is the world. And you will die, and so will it.

The marks you make during your life are there to help others along before that happens. The overlooked beauty and unspeakable horrors of life will always need description through art, and this will have to be carried out by individuals, like you. — David OReilly

Notes

  1. I was going to include a list of excellent AI tools for creators, but Descript already put together a great list, so check this out instead: The ultimate list of AI tools for creators.
  2. This NYT transcript of the unhinged Bing chatbot is amazing and a little unsettling: Bing’s A.I. Chat: ‘I Want to Be Alive.’
