How a Psychedelic AI Experiment Became a Video Artwork About Structural Racism
An Interview with the Creators of “Improvised Explosive Denied”
I saw faces growing inside faces, trees becoming people, landscapes melting like ice cream, and bodies turning into fruit. Even if this was just the effect of artificial intelligence and not drugs, it was still a head trip. I was watching a surreal video experiment—the brainchild of data scientist Ryan Cranfill—that used machine learning to produce a series of animated, Impressionist-style paintings that twitched, glitched and morphed to music.
Months before this I had written “Improvised Explosive Denied” (IED), a song inspired by a tangle of emotions spurred by relentless headlines of injustice and a mental health crisis. My world was being challenged and turned upside down, and expressing weariness and confusion through music was my way to get through it. Unexpectedly, this song led to another, then another and another. After six months I had a new record.
With new music comes the task of promotion, but this song was too special to me, and promoting it didn’t feel authentic, so I created videos for other songs instead. Then one day I came across Ryan’s experiments and wondered if he could use his digital psychedelics to animate the news imagery I was consuming daily to create a visual impression of these disorienting times. Ryan was game.
Now I am excited to share a visual complement to the song, nearly a year to the day after I began writing it. Improvised Explosive Denied is a generative painting that depicts Black American struggle through mass media imagery and acknowledges our country’s difficulty — and mine — in engaging with it.
I wish I could say we planned this premiere date, but it’s just fortuitous. It took months of trial and error to find a process that produced visuals Ryan and I were both happy with. Initially I intended to contrast photojournalism with portraits of political oppressors and resisters, but we ran into a problem. The AI wasn’t versed in black-and-white photographs of crowds, nor in people with dark skin, and so it had difficulty identifying my reference images. This was a deal breaker.
Then we found we could partially overcome these limitations by using Photoshop’s recently released Neural Filters to transfer paint styles, such as from Van Gogh’s “Starry Night,” to the source images. Though we still could not use the crowd photos, we could use portraits, transforming them into painting-like images the AI could better interpret. From there Ryan ran optimization code to help the network identify visual elements in our new reference images that had similarities to the AI’s visual training ground — Wikiart.org.
The results were new images created by the neural network attempting to mimic the filtered images. Since the network hadn’t been trained with images of athletes, for example, it could only produce an imperfect “painting” of something that looked Muhammad Ali-like with boxing gloves or Colin Kaepernick-like in a football jersey. I loved how this technical limitation mirrored the human struggle of grappling with new realities, trying to force new information into old assumptions.
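Ryan’s actual optimization code isn’t shown in this piece, so the following is only a toy sketch of the general idea: nudging a latent code by gradient descent until the generator’s output resembles a reference image. A tiny fixed linear map stands in for the real StyleGAN generator, and every name and number here is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a GAN generator: a fixed linear map from a
# 16-dim latent vector to a 64-"pixel" image. A real StyleGAN is a
# deep network, but the projection idea is the same.
W = rng.normal(size=(64, 16))

def generate(z):
    return W @ z

# A "reference image" we want the generator to imitate.
z_true = rng.normal(size=16)
target = generate(z_true)

# Latent-space projection: start from a random latent code and move
# it downhill so the generated image matches the reference (real
# pipelines use perceptual losses and fancier optimizers).
z = rng.normal(size=16)
lr = 0.003
for _ in range(2000):
    residual = generate(z) - target   # pixel-wise error
    grad = 2 * W.T @ residual         # gradient of squared error w.r.t. z
    z -= lr * grad

loss = float(np.mean((generate(z) - target) ** 2))
print(f"final reconstruction error: {loss:.2e}")
```

Because the toy generator can only emit images in its own span, anything the real network was never trained on (crowds, photographs, dark skin tones) lands wherever its learned space comes closest, which is exactly the “Muhammad Ali-like” effect described above.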
Improvised Explosive Denied is a “living painting” that falls into the genre of generative art, in which the artist defines parameters and then lets the computer make the imagery. This computational approach was pioneered by Vera Molnar, among others, in the 1960s. Because the computer makes choices about the order and transitions of the images, it is impossible to predict exactly what it will make. Ryan ran multiple versions of the video with varied sensitivities to the music, resulting in a wide variety of outcomes. I selected clips from among them and edited a final version of the video.
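None of this is Ryan’s pipeline, but the “varied sensitivities to the music” idea can be sketched as a random walk through latent space whose step size is scaled by the song’s loudness envelope. The envelope, dimensions, and function names below are all made up; more sensitive runs travel farther per beat, so the rendered frames morph more aggressively.

```python
import numpy as np

def latent_walk(envelope, sensitivity, dim=8, seed=42):
    """Walk through a hypothetical latent space, scaling each step by
    the music's loudness envelope times a sensitivity knob. Returns
    the sequence of latent points (one per audio frame)."""
    rng = np.random.default_rng(seed)
    z = np.zeros(dim)
    path = [z.copy()]
    for level in envelope:
        direction = rng.normal(size=dim)
        direction /= np.linalg.norm(direction)   # unit step direction
        z = z + sensitivity * level * direction
        path.append(z.copy())
    return np.array(path)

# A fake loudness envelope standing in for analyzed audio.
envelope = np.abs(np.sin(np.linspace(0, 6, 120)))

calm = latent_walk(envelope, sensitivity=0.1)
wild = latent_walk(envelope, sensitivity=1.0)

# Total latent distance covered by each run: the sensitive version
# covers ten times the ground, so its imagery churns much harder.
calm_total = float(np.sum(np.linalg.norm(np.diff(calm, axis=0), axis=1)))
wild_total = float(np.sum(np.linalg.norm(np.diff(wild, axis=0), axis=1)))
print(calm_total, wild_total)
```

Each latent point along the path would then be fed to the generator to render one video frame, which is why no two runs (or sensitivity settings) produce the same footage.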
Choosing the clips was difficult simply because there were so many interesting options. The images evolved and responded to the music as they morphed into one another, making new, unexpected visuals in the process. In this video, though I chose photos of Black leaders, victims of police violence and protestors, you might also catch glimpses of the Queen, Jesus or George Washington — or at least something that looks like them, because they have references in the WikiArt library. The AI bridges between new images it can’t recognize using the only frame of reference available to it. It’s like a time traveler from the past trying to make sense of the present, or maybe closer to home, a society viewing its current reality through old paradigms.
Ryan and I discussed this dynamic in further detail — and the racial biases in it — on a recent video call:
Michael Hendrix: I’ll just start with what I see. What is the technology doing? What I’m seeing in the videos is new imagery that we didn’t choose. The network is throwing things in. For example, I might see a woman in a cape with a bow around her neck, which clearly is not part of our reference imagery, but for whatever reason it’s recognizing some shapes and making that image. Is that because it was taught on a portrait of a woman with a bow?
Ryan Cranfill: Yeah, so, I think you’re hitting on an important point. It’s kind of like that saying “when the only tool you have is a hammer, everything looks like a nail.” These GANs, these algorithms, are really good at creating new versions of things they’ve already seen, but they are not great at generalizing to new types of inputs. So when you see a woman with a cape or a parasol or something like that it’s likely because it has seen a number of examples that look like that. And so it’s creating in that big wide open space of all the paintings that it’s seen. It’s not categorizing them and it’s not reproducing them exactly, but it’s just saying, over here in this space, maybe there are women with capes or something like that. But over here is something more like landscapes or cityscapes.
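(A minimal sketch of the “hammer and nail” behavior Ryan describes, with wholly invented archetypes and feature vectors: a model that can only produce what it has seen renders any unfamiliar input as the nearest thing it already knows.)

```python
import numpy as np

# Toy "training distribution": the only archetypes the model has
# ever seen, each summarized as a point in a 3-dim feature space.
archetypes = {
    "woman with cape": np.array([1.0, 0.0, 0.0]),
    "landscape":       np.array([0.0, 1.0, 0.0]),
    "cityscape":       np.array([0.0, 0.0, 1.0]),
}

def render_as_known(features):
    """Return the trained archetype closest to an unfamiliar input."""
    return min(archetypes, key=lambda k: np.linalg.norm(archetypes[k] - features))

# A photo the network never saw, but whose features sit nearest the
# portrait region of its learned space.
guess = render_as_known(np.array([0.8, 0.3, 0.1]))
print(guess)  # → woman with cape
```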
Ryan: There’s a really interesting discussion to be had about the biases that we encode in these things, because this AI was trained on paintings from WikiArt, and believe it or not, WikiArt vastly overrepresents Western artists! A lot of those subjects tend to be portraits of white dudes, and I think it’s interesting that when we throw portraits of Black people in there it struggles because the software hasn’t had much to draw inspiration from. The biases are encoded by the fact that there’s not a ton of representation of those people in the source material.
Michael: Yeah, in many ways, it becomes a metaphor for us today as we’re struggling with this idea of structural racism. We’re actually seeing it play out in this technology (WikiArt) that began as an empty platform. Someone said “add art to this,” and the people who added art chose from established institutions and canons where there was already bias. The process just keeps creating the conditions to make it more and more confusing when you try to introduce Black figures into the system. You can literally see the struggle happen on the screen. The software doesn’t know what to do.
Ryan: Yeah, yeah. That’s why the imagery is so messed up. It’s why we were having issues making satisfying visuals for so long. These AIs can produce things that are similar to what they have seen before, and it hadn’t seen photographs before. So it couldn’t do a great job of mimicking them, but it had a noticeably easier time producing something that looked like white figures than Black figures. Once you started to shift the colors of skin tones by adding the neural filters, it had a much easier time getting to something recognizable.
Michael: A lot of my choices replaced realistic skin tone. I made them more like Fauvist paintings, making a blue face or red face and it helped the software identify the composition. Now it sees something red that kind of looks like a face and it gives us something that is an impression of a face.
Ryan: I wonder if it’s recognizing the face or if it’s replicating it more like a landscape or a “Starry Night” with more vivid colors rather than as a portrait? If this (gestures) is the portrait zone and this is the landscape zone and we could find where those representations lived, I would think that they were a little closer to the landscape zone.
Michael: That makes sense. I was just thinking about my process of choosing a style transfer to go from a painting to a photo. In Photoshop you can either take the color from the style transfer or you can say “preserve the color” and just bring the brush marks. More often than not, I didn’t preserve color, but in the cases that I did it’s more figurative. The one that blows my mind is the recognizability of the Obama portrait. It comes through very clearly in every version. I think it’s because I preserved the color of his lighter skin. If I had made Obama blue I don’t think it would have come through the same way.
Ryan: I would suspect we would have similar issues if we used some of the models that make deepfake portraits. That’s also StyleGAN, and it’s trained on a data set of high-quality images of people’s faces gathered from Flickr. I think we might have similar issues when we try to find generated portraits of Black folks in that latent space, because the data set it was trained on probably didn’t have very many folks of color in it. Like you said, it’s interesting that this thing is struggling to break out of its own bias zone. It’s a nice metaphor for where we’re at right now.
Michael: Yeah, you can feel it. It’s what we’re collectively experiencing, either trying to be seen, or trying to see. Right now we’re in this transition period where we’re reckoning with that.
There is another thing going on here: the process of me, a white guy, choosing mass media images that portray Black athletes, Black victims of police brutality, and Black community leaders. This is something I’m trying to reckon with too. When we started this project the brief was to show social unrest due to injustice and people who overcame societal barriers. But because StyleGAN struggled with complex imagery it wasn’t a good brief. So we ended up pursuing a different direction with another kind of reckoning: white people can tell ourselves that Black Americans have succeeded because of LeBron James, because of Beyoncé and Kendrick Lamar… even because of Jesse Owens. It’s not a full truth though, because here’s Breonna Taylor and here’s George Floyd and here’s Ahmaud Arbery and here’s EJ Bradford, Jr. This is another layer that I haven’t gotten particularly articulate about yet, but it’s at play here.
Ryan: Building on that, there’s also a parallel to our GAN. It only knows how to make white people because it’s mostly only seen white people. These are the mass media images that spring to mind when we think about the Black experience over the past 50 years because these are cherry-picked stories that we’ve heard. But there are so many millions of stories that we’re just not exposed to in the same way. Like the systemic filter that gets applied to the GAN, there’s also a systemic filter applied to what the majority of society knows and the stories that we have in our heads.
Michael: That’s just so true. We can only see through our own experience, and if we don’t have interactions with people that are different than us our only frame of reference is mass media, which has some truth but it is not a full picture at all. It is so crazy how the GAN model is essentially just mirroring society.
Ryan: Yeah, it’s reflecting back to us what it has seen from us.