This blog post was originally written in July 2019.
I love working with musicians, helping them create a visual counterpart to their music. In the past, I worked a lot with the black metal genre, which uses a very distinctive dark aesthetic. The black & white, grainy, low-quality imagery that Norwegian black metal bands used on their album covers in the 90s matched the specific raw sound of their music and could be achieved by reusing the analogue technique over and over again. Working with digital tools nowadays, trying to achieve this washed-out, low-quality imagery feels like a fraud. I don’t see any point in faking grain and scratches in Photoshop, or in lowering the quality of an image on purpose just to make it look raw.
What I would usually do instead is embrace hand-made footage and drawing and use a lot of time-consuming analogue techniques. However, one must ask: does it make any sense? One of the reasons the black metal aesthetic looks the way it does is the careless approach: just spontaneously doing something raw and nasty, and especially not overthinking it. Having experimented with GANs and talked about the creative potential of AI tools lately, I had to ask: can I generate black metal visuals instead?
I had in mind using footage of nature, something that could resemble Norwegian forests, turning it black and white, and distorting it as much as possible.
I started with a nature scene/landscape video, which I ran through the DeepLab model in Runway ML. This model created a semantic map for each frame of the video, something SPADE could later make sense of and use to generate nature scenery. The DeepLab model was chained directly to the SPADE-Landscapes model, which took the semantic maps from DeepLab and turned them into generated images. In this case these were saved straight to an image directory, but the model could also have been chained directly to the third model I used, Arbitrary-Image-Stylization. The last step was the ESRGAN model, which upscales images by 4x and gave me quite an acceptable resolution.
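For readers who think better in code, the chain of four models can be sketched as plain function composition. This is only an illustration of the order of the steps, not the Runway ML API: the `Frame` class, the `make_stage` helper and the 256x256 frame size are all assumptions made up for this sketch, and each stage is a stand-in that merely records its name (ESRGAN additionally multiplies the frame size by 4). In my actual run the SPADE output was saved to a directory before stylization rather than chained all the way through.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for one video frame; holds only metadata."""
    index: int
    width: int
    height: int
    history: Tuple[str, ...] = ()


Stage = Callable[[Frame], Frame]


def make_stage(name: str, scale: int = 1) -> Stage:
    """Build a placeholder stage that records its name and optionally rescales."""
    def stage(frame: Frame) -> Frame:
        return Frame(
            index=frame.index,
            width=frame.width * scale,
            height=frame.height * scale,
            history=frame.history + (name,),
        )
    return stage


def chain(stages: List[Stage]) -> Stage:
    """Compose stages so that each one receives the previous one's output."""
    def run(frame: Frame) -> Frame:
        for stage in stages:
            frame = stage(frame)
        return frame
    return run


# Placeholders named after the models used in the post (no real inference here).
segment = make_stage("DeepLab")                       # frame -> semantic map
generate = make_stage("SPADE-Landscapes")             # semantic map -> generated image
stylize = make_stage("Arbitrary-Image-Stylization")   # apply the style image
upscale = make_stage("ESRGAN", scale=4)               # 4x upscaling

pipeline = chain([segment, generate, stylize, upscale])


def process_video(frames: Iterable[Frame]) -> Iterator[Frame]:
    """Run every frame of the video through the whole chain."""
    for frame in frames:
        yield pipeline(frame)


# Assumed input: three 256x256 frames (the size is illustrative only).
frames = [Frame(index=i, width=256, height=256) for i in range(3)]
results = list(process_video(frames))
```

Swapping the style image or dropping a step then only means changing the list passed to `chain`, which is roughly how rearranging the models felt in practice.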
So I had the generated nature scenes, but I wanted them to look a bit more blackmetal-ish. I tried several images and settings in the style-transfer phase.
After trying several black & white misty forests and getting results that were not really satisfying (although the style transfer itself was still pretty decent), I wanted to try more extreme inputs. Darkthrone’s famous Transylvanian Hunger album cover already made some nice changes to the nature there:
When I noticed that the style transfer was able to grab the most iconic part of the visual and re-interpret the nature with it, I immediately got curious about what it would do with just a simple, classic black & white black metal logo. Here it goes:
Amazing! Now the nature is all black and white and all shapes are represented with the branch-like structure of black metal logos!
To wrap this up: I think there’s definitely a future for generated visuals in artistic practice. Searching for the right footage to work with used to be the biggest pain: there’s always a problem with image licenses, plus it always took hours of googling, browsing databases, or creating your own material.
There’s often prejudice against generated visuals because it was usually impossible to really direct the final output. Such artwork often had to rely on some level of randomness and a number of “happy little accidents” (as Bob Ross used to say). That might not be the case anymore. During this process, I had quite a clear idea of what I wanted to achieve, and by combining various models, I was able to get to that point quite fast. Yes, it will still take some time for the workflow and the quality of the outputs to meet expectations. But the state of the art as it is now is definitely the tip of a really colossal iceberg, creeping somewhere under the dark waters of a whole fjord of possibilities!