Creativity Is Probably Just a Complex Mix of Generative Art Algorithms
While it remains a theory, Marvin Minsky’s concept of a “Society of Minds” makes demonstrable sense in the realm of artificial creativity. Minsky, the co-founder of MIT’s Artificial Intelligence Lab, proposed that our minds are not a single super intelligence, but rather a collection of smaller intelligences that surface as needed and compete with each other to solve problems.
Making artistic paintings with my robots is a problem just like any other, and Minsky’s approach has proven remarkably effective at solving it. In fact, the more paintings I create with this approach, the more I am beginning to realize that artistic creativity is little more than a complex mix of competing generative art algorithms.
While some AI artists write increasingly sophisticated algorithms, I have found that quantity is better than quality. The painting in the following timelapse was created between June 2nd and 6th, 2018 with more than twenty-six distinct generative algorithms.
On a micro level, it shows 13,396 individual brush strokes, each made by a low level aesthetic decision. But the robot was also making multiple mid level decisions throughout, hundreds of them. Furthermore, it made more than four high level aesthetic decisions that sometimes changed the direction of the artwork entirely.
To be clear and avoid any confusion about the level of autonomy of my robots as they painted this piece, I would like to emphasize that while they can paint completely autonomously and often do, I served as the art director for this painting. While the robot’s algorithms made all of the aesthetic decisions independently, I did curate a small number of those decisions and even wrote some new algorithms for it to use when I felt the robot needed a little artistic help. Furthermore, I also served as an assistant by cleaning spills, mixing paints, and moving the canvas between my various painting robots. With the understanding that this was a collaboration with my robots in the lead, the following is a step-by-step description of what was happening in their “Society of Minds” as we created this piece of art together.
To begin this description, it is important to specify that this piece was a commission for GumGum defined by some general guidelines. The first thing provided was a general color palette which consisted of GumGum’s corporate colors. I was also given five artists as reference and asked if my robots could use them as inspiration for an original 18"x24" canvas in their colors. Below are images of artwork from each artist. Listed in order, they are Lee Krasner, Willem de Kooning, Elaine de Kooning, Georgia O’Keeffe, and Michael West.
I didn’t know where to start with these images so I just started running algorithms my robot already knew how to execute. One of the first was to create a Style Transfer Grid to mix and match the various colors, textures, and styles of the reference paintings. After several hours of deep learning number crunching I was presented with the following grid.
The top left hand corner represents the palette provided by GumGum, and the columns and rows are the five artworks rendered in the style of each other (with echoes of the color palette).
I didn’t know what to do with this yet, but I did notice that some style transfers worked better at matching the color palette than others. It was just my artistic intuition, but I wondered if I could write an algorithm that mimicked my personal taste. So I wrote a quick (experimental) new algorithm that measured how well each source image lined up with the color palette. It began with a K-means Clustering algorithm in which I made nine buckets that would group the pixels in each image by their R, G, B, X, and Y values. I then ran the algorithm on the palette and source images and compared the color values of each bucket. After some tuning, the algorithm found that the two paintings that matched the palette best were the Willem de Kooning and the Georgia O’Keeffe. As I said, I did have to tune the color matching to get the same results as my intuition, but that is why I consider this an experimental algorithm. Only further testing will reveal if it is actually an effective artificially creative algorithm, or if I overfit it to this small example. But at this point I had a new algorithm to add to my arsenal that I will call K-Means Clustering Color Matching. It is nothing fancy, but it did make the programmatic aesthetic decision that the following two source images were the best match to the palette provided by GumGum. This is information that my robots will store for later use.
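As a rough illustration of the idea (not the code that runs on the robots), the matching step could be sketched as follows. The text only says that the buckets of the palette and of each source image were compared, so the scoring rule here (distance from each palette bucket to the nearest image bucket), the position scaling, and every function name are assumptions.

```python
import random

def pixel_features(image):
    """Flatten a grid of (r, g, b) tuples into (r, g, b, x, y) vectors.
    x and y are scaled to 0-255 (an assumption) so position and color
    carry comparable weight during clustering."""
    h, w = len(image), len(image[0])
    feats = []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            feats.append((r, g, b,
                          255 * x / max(w - 1, 1),
                          255 * y / max(h - 1, 1)))
    return feats

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kmeans(points, k=9, iters=10, seed=0):
    """Plain k-means: returns k cluster centers ("buckets")."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # needs at least k pixels
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centers[i]))
            buckets[nearest].append(p)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old center if a bucket is empty
                centers[i] = tuple(sum(c) / len(bucket) for c in zip(*bucket))
    return centers

def palette_mismatch(image, palette_image, k=9):
    """Lower is better. For each palette bucket, find the image bucket
    with the closest RGB value and average the squared distances."""
    img_centers = kmeans(pixel_features(image), k)
    pal_centers = kmeans(pixel_features(palette_image), k)
    total = sum(min(dist2(p[:3], c[:3]) for c in img_centers)
                for p in pal_centers)
    return total / k
```

Ranking the source images by `palette_mismatch` and keeping the two lowest scores would reproduce the kind of decision described above, although the tuning that made it agree with intuition is not captured here.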
Up until now, only style and color have been considered. Content has not even been looked at. There are lots of ways to get original content, but the easiest way is to do what my robots have been doing for years: have a photoshoot.
Dozens of pictures were taken and my robots used the Viola-Jones Facial Detection algorithm to zero in on a face. Once it found a face, it then employed a Portrait Rating algorithm that considered sharpness, contrast, composition, and general aesthetic balance to pick a handful of favorites. Perhaps one of my favorite rating agents was devised by my son, who realized that photos with Symmetrical Eyes were often the most attractive. While it is not the only measure, symmetrical eyes add significantly to the rating, which can be seen in the bottom left of each photo. Another simple algorithm run by the robots is a Portrait Cropping algorithm that attempts to provide an even, balanced composition depending on the position of the eyes and mouth.
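The exact Symmetrical Eyes measure is not spelled out, so the following is only a guess at what such a rating agent might look like. It assumes a face detector has already returned bounding boxes as `(x, y, width, height)` tuples for the face and both eyes. The three penalties and their equal weighting are hypothetical.

```python
def eye_symmetry(face, left_eye, right_eye):
    """Score in [0, 1]; 1.0 means the eyes mirror each other around the
    vertical center line of the face box. All boxes are (x, y, w, h)."""
    fx, fy, fw, fh = face
    center_x = fx + fw / 2

    # Centers of the two eye boxes.
    lx, ly = left_eye[0] + left_eye[2] / 2, left_eye[1] + left_eye[3] / 2
    rx, ry = right_eye[0] + right_eye[2] / 2, right_eye[1] + right_eye[3] / 2

    # Penalty 1: eyes sit at unequal distances from the center line.
    horizontal = abs((center_x - lx) - (rx - center_x)) / fw
    # Penalty 2: eyes sit at different heights (tilted head).
    vertical = abs(ly - ry) / fh
    # Penalty 3: eyes appear at different sizes (turned head).
    size = abs(left_eye[2] - right_eye[2]) / max(left_eye[2], right_eye[2])

    return max(0.0, 1.0 - (horizontal + vertical + size))
```

A score like this would be one term among several in the overall portrait rating, alongside the sharpness, contrast, and composition measures mentioned above.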
This is all fuzzy logic, however, and art begins with sketches. So to decide on what to start painting, my robots randomly selected a highly rated photo and one of the reference paintings. It then combined everything with another style transfer to decide upon its first of dozens of Trace Images.
The trace image is what the robot holds in its memory as what it is trying to paint. If this were a printer, it would be a rasterized image with a set of instructions of how to render the raster with pixels. This is not a printer, however, and this is where my favorite algorithm of all kicks in, Feedback Loops and something I call a Difference Map.
Before painting, the robot begins by taking a photo of the canvas, which of course is a large white area. It then creates a difference map, which is a heatmap calculated by how different the canvas is from the trace image. Reds are areas that need to be darkened, and blues are areas that need to be lightened. The robot then decides on the color to use, and where it can be applied to maximize the reduction of the difference in the heatmap. A comparison of how the robot sees the heatmap can be seen in the image to the left that has completed the first 38 iterations of the feedback loop. In the beginning there were only areas that needed to be darkened (red), but after a couple hundred strokes some areas became too dark (light blue) and now need to be lightened.
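The loop described above can be sketched in a few lines. This is a simplified, hypothetical version that works on small grayscale grids (0 is black, 255 is white) and paints one pixel at a time, whereas the real robot works from photographs and applies whole brush strokes. The function names are made up for the example.

```python
def difference_map(canvas, trace):
    """Signed difference per pixel. Positive values mean the canvas is
    lighter than the trace image and needs to be darkened (the "red"
    areas); negative values need to be lightened (the "blue" areas)."""
    return [[c - t for c, t in zip(crow, trow)]
            for crow, trow in zip(canvas, trace)]

def best_dab(canvas, trace, paints):
    """Try every available paint at every pixel and return
    (error_reduction, x, y, paint) for the most useful single dab."""
    best = None
    for y, (crow, trow) in enumerate(zip(canvas, trace)):
        for x, (c, t) in enumerate(zip(crow, trow)):
            for paint in paints:
                # Error before the dab minus error after the dab.
                gain = abs(c - t) - abs(paint - t)
                if best is None or gain > best[0]:
                    best = (gain, x, y, paint)
    return best

# A blank white canvas, a tiny trace image, and two available paints.
canvas = [[255, 255], [255, 255]]
trace = [[0, 255], [128, 255]]
gain, x, y, paint = best_dab(canvas, trace, paints=[0, 128])
canvas[y][x] = paint  # apply the dab, then photograph and repeat
```

Repeating the last two lines is the feedback loop: each pass recomputes the differences from the current state of the canvas rather than from a plan made in advance.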
As can be seen in this animation, this algorithm is run thousands of times from the start to finish of each painting, slowly optimizing the canvas and bringing it closer and closer to the trace image with each brush stroke. This is close to my artistic process when I decide to paint a portrait. Like my robot, I try to reduce the differences between the painting and the subject I am trying to paint until likeness is achieved. Is this how human artists work, at least partially?
My heatmap color difference reduction algorithm is not the only thing that decides where and how to apply the next stroke. It gets turned on and off throughout the creation of a painting as other algorithms take over and control the brush. Furthermore, it operates in several different modes of varying complexity. Remember that this whole operation is a “Society of Minds,” where a whole bunch of algorithms are fighting for control and taking turns solving aesthetic tasks.
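One simple way to picture algorithms taking turns is a weighted lottery in which each algorithm's chance of taking control depends on how far along the painting is. The sketch below is an assumption about how such turn-taking could be arranged, not a description of the robots' scheduler. The agent names and weight curves are placeholders.

```python
import random

def pick_agent(agents, progress, rng=random):
    """agents maps a name to a function of progress (0.0 at the first
    stroke, 1.0 at the last) that returns a non-negative weight.
    Returns the name of the agent that wins control of the brush."""
    names = list(agents)
    weights = [agents[name](progress) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Hypothetical weight curves for three competing agents.
AGENTS = {
    "difference_map": lambda p: 1.0,                    # always available
    "underpainting": lambda p: max(0.0, 1.0 - 2.0 * p),  # fades out early
    "contour_strokes": lambda p: p,                     # grows over time
}

rng = random.Random(42)
early = [pick_agent(AGENTS, 0.0, rng) for _ in range(5)]
late = [pick_agent(AGENTS, 1.0, rng) for _ in range(5)]
```

With these curves, the underpainting agent can only win early on and the contour agent only later, while the difference map agent competes throughout.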
One of the simpler alternatives that kicks on once in a while is something I call a Paintmap. This is as dumb as an AI algorithm can get, and operates like a printer. It looks at the colors it has to work with and reduces the color palette of the trace image to that subset of colors. It then just starts applying the colors from the palette to those areas. Artists sometimes do a quick underpainting or cartoon to begin a piece of art. This is a quick and effective way to complete an underpainting. In its role in the “Society of Minds,” paintmaps have a higher chance of activating at the very beginning of a painting, and become less active as the image progresses.
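Reducing the trace image to the available paints is essentially nearest-color quantization. A minimal sketch, assuming colors are `(r, g, b)` tuples and using plain squared RGB distance (the robots may well use a different color metric):

```python
def nearest_paint(pixel, paints):
    """Return the available paint closest to this pixel in RGB space."""
    return min(paints,
               key=lambda p: sum((a - b) ** 2 for a, b in zip(pixel, p)))

def paintmap(trace, paints):
    """Replace every pixel of the trace image with its nearest paint."""
    return [[nearest_paint(px, paints) for px in row] for row in trace]

def areas_for(pmap, paint):
    """List the (x, y) positions that should receive a given paint."""
    return [(x, y)
            for y, row in enumerate(pmap)
            for x, px in enumerate(row)
            if px == paint]

paints = [(0, 0, 0), (255, 255, 255), (200, 30, 30)]
trace = [[(10, 10, 10), (250, 240, 245)],
         [(180, 40, 50), (60, 60, 60)]]
pmap = paintmap(trace, paints)
```

Each paint then gets its own list of areas to fill, which is what makes this behave like a printer.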
Up until now I have talked about what the robot intends to paint with its trace and paintmap images, but I haven’t really touched on how it decides to apply the brushstrokes. This one will be harder to explain simply because there is so much variety and the complexity varies from the remarkably simple to confusingly complex.
The simplest algorithms just follow a simple pattern. They look at the paintmap and execute one of a dozen or so preprogrammed paths that I have designed. I have given these patterns simple descriptive names such as Checkers, Rain, Barcodes, and Stripes. All they do is drag the brush in straight lines following the pattern. In Barcodes, for example, the brush is only dragged in completely vertical lines. The paintmap example I showed previously looks like it was using Stripes for at least part of its execution.
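As an example of how simple such a pattern can be, here is a hypothetical version of Barcodes. It assumes the area to fill with one color is given as a grid of true/false values (for instance, derived from the paintmap) and turns every vertical run of true cells into a single straight stroke.

```python
def barcode_strokes(mask):
    """Convert a boolean mask into vertical strokes. Each stroke is
    ((x, y_start), (x, y_end)) and covers one unbroken vertical run."""
    h, w = len(mask), len(mask[0])
    strokes = []
    for x in range(w):
        start = None
        # Go one row past the bottom so a run touching the edge is closed.
        for y in range(h + 1):
            on = y < h and mask[y][x]
            if on and start is None:
                start = y                      # a new run begins
            elif not on and start is not None:
                strokes.append(((x, start), (x, y - 1)))
                start = None                   # the run has ended
    return strokes

mask = [[1, 0],
        [1, 0],
        [0, 1],
        [1, 1]]
strokes = barcode_strokes(mask)
```

Scanning rows instead of columns would give a horizontal-only variant using the same logic.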
In this other painting created ten years ago by my third robot, you can see my favorite pattern that I call Rain. In it lines are only painted either vertically or at 45 degree angles. In addition to these algorithms that I have written, I sometimes tap into other path planning procedures found in open source software packages such as OpenCV and Processing. There are simply so many to choose from.
A painting made entirely of straight lines has the potential to look cool, but it would not be as impressive as one where the strokes automatically followed the contours of the shapes in the painting. Here is where Hough Lines can be useful.
As the robot plans the paths for its strokes across the trace image, one of the things it is constantly doing is searching for and following lines in the image. On the left are some examples of Hough lines (in red) as applied to the trace image at various stages of the painting. These lines are stored in the robot’s memory, and when it decides where to apply color, it searches for the closest line and uses its geometry to decide the direction and length of the stroke. This produces brushstrokes that follow the shape and contour of the image being painted. When using Hough lines, many dynamic settings have to be tuned for each image depending on the image itself. Several sub algorithms are used, such as a Hough Line Tuner that tests different settings and selects the one that produces the most lines. Another algorithm called the Stroke Combiner makes longer, more complex strokes from all the smaller Hough lines.
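The closest-line lookup can be sketched with basic geometry. The example assumes the detected lines are already available as `(x1, y1, x2, y2)` segments, which is the shape of the output of OpenCV's probabilistic Hough transform. The detection itself is left out, and `plan_stroke`, `tune`, and the `detect` callable are hypothetical names.

```python
import math

def point_segment_distance(px, py, seg):
    """Shortest distance from point (px, py) to a line segment."""
    x1, y1, x2, y2 = seg
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(px - x1, py - y1)
    # Project the point onto the segment and clamp to its endpoints.
    t = ((px - x1) * dx + (py - y1) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))

def plan_stroke(px, py, segments):
    """Center a stroke on (px, py) that borrows the direction and
    length of the nearest stored line segment."""
    x1, y1, x2, y2 = min(
        segments, key=lambda s: point_segment_distance(px, py, s))
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        return (px, py), (px, py)
    ux, uy = (x2 - x1) / length, (y2 - y1) / length
    half = length / 2
    return (px - ux * half, py - uy * half), (px + ux * half, py + uy * half)

def tune(detect, settings_list):
    """Mimic the tuner described above: run the detector with each
    candidate setting and keep the one that yields the most lines."""
    return max(settings_list, key=lambda s: len(detect(s)))

segments = [(0, 0, 10, 0), (0, 0, 0, 10)]
stroke = plan_stroke(5, 1, segments)
```

In this toy case the point sits just above the horizontal segment, so the planned stroke is horizontal and ten units long.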
The aforementioned path planning algorithms are completely automated. There are other algorithms, however, that use AI to imitate human strokes. I have a Brushstroke Database with complete records of every stroke for hundreds of previously completed paintings. The robot can reference the strokes in this database and draw upon their geometry and apply them to new paintings.
Furthermore, the robots have an Autocomplete Mode. In autocomplete mode, any number of human artists can use a touchscreen tablet to paint along with the robot. The robot will watch how the humans are painting and then attempt to autocomplete the rest of the painting, imitating the human stroke patterns. In this commission for GumGum, I occasionally took over and painted with a tablet, and the robot’s AI studied then followed my lead. I try to keep my involvement minimal, with less than a hundred strokes throughout any given painting. While these strokes are low in number, they are highly effective strokes that are an important part of the process for multiple reasons. Their most straightforward utility is that they are a way for me to quickly repair errors by the painting robot. The second is that it is a way for me to teach the AI when it just isn’t doing as good a job as I want it to. These strokes provide valuable labeled truth data that can be used later in deep learning training. By Training Deep Learning Neural Networks on labeled human strokes and autonomous strokes, the robots are being taught to paint less like a machine and more like an artist. Every stroke I give it is analyzed and internalized. This is a highly experimental algorithm that I am still working on. But before this can be solved I need to collect the data. The deep learning will not actually work until there is enough data, and collecting the data is the first step.
Up until now many technical algorithms have been described, but few of the artistic ones. Considering that artistry is the most mysterious of artificially creative algorithms, the biggest question everyone should have right now is how this painting went from a photoshoot to the finished piece. That is to say how did this become this?
This is a big leap. And to be perfectly honest, I am even fuzzy on the details because I only barely understand some of the algorithms that achieved the transformation. I wrote them, but with something like deep learning that doesn’t mean you know how they work. At their root, though, are feedback loops. Like a human artist, the robot is making marks, stepping back to look at how those marks helped it get closer to its goal, then making more marks. The key to the variety, however, is that similar to a human artist, the robot’s goals keep on changing.
Earlier I mentioned that it always worked from a trace image and that it was trying to reduce the difference between the canvas and the trace image. Throughout the entire painting process it is doing exactly this. The variety comes from the fact that the trace image it is chasing keeps on changing. Rewatch the twenty second time-lapse from earlier next to a dramatization of the trace images that it is attempting to paint. Notice how the robot is attempting to paint the image even as the image keeps changing into different compositions.
I mention this is a dramatization because the feedback loops go through thousands of intermediary paintmaps, trace maps, and difference maps. There would simply be too much data if I recorded every single one of these images over the course of the first 13,396 strokes. The periodic snapshot taken for the timelapse above is close to a Gig. If the robot recorded every single recalculated paintmap, difference map, and trace image through the course of the 13,396 strokes, each painting would require terabytes of data storage. While I do not have a record of every robotic decision, I do keep the time-lapse and the detailed geometry of each stroke in a database.
To be clear, the following images only represent the major shifts in the robot’s artistic goals. Many more shifts happened on a less dramatic level but they either didn’t surface or fully develop because the robot changed its artistic goals before finishing them. Also note that the stages described below are not as rigid as the animation suggests. The robot’s “Society of Minds” was much more fluid and gradually developed the painting over five days.
I have already shown you its first major decision, which was to apply a style transfer from one of the masterpieces to one of the highest rated portraits from the photoshoot. I always find it interesting to see how neural nets reimagine images on a step by step basis, so here is a brief animation of the process over 2000 iterations. The decision to do this was somewhat random, but only with regards to the content. The robot had two lists of content: one of the highest rated photographs and another of the source images provided by GumGum. It randomly selected one from each list and used them to apply deep learning style transfer and start painting.
While the first animation combined two random images from a curated list, the next stage was more meaningful. This stage attempted to operate like an artist who paints a little then takes a step back to see the progress before painting a little more. In this step another style transfer is being applied; however, the content image it begins from is anything but random. The robot is applying a style transfer to the image it has already completed painting on the canvas. This is a true feedback loop where the robot is examining its progress in the real world and using an image of the canvas as input into its neural networks.
It was at this point that my robot dramatically reverted to one of my first AI algorithms. Years ago I was attempting to make my robot more creative and it occurred to me that one of the simplest ways to achieve this was to give it a sense of Horror Vacui. If there is empty space, fill it with something.
These algorithms gave rise to paintings such as this early piece from my second painting robot. The early rules were simple. Randomly add content to areas where there was no content. For my early paintings the robot did not even consider the context or background. It just stamped an image onto a canvas over and over again until the entire background was filled. Then it started painting. Simple.
For my newer robots, however, I wanted it to have a little more intelligence about how it went about things. I didn’t want it to just be random composition creation. To achieve more meaningful creation of unique compositions I had my robots consider and line up the content that they could understand. In the case of how this painting progressed, I used an algorithm I am calling Neural Portraiture where I have neural networks look for eyes throughout an image. I then use the eyes as anchor points to create a warped mosaic of faces. Here is a diagram showing how the algorithm makes these decisions and lines up the mosaic of images. This image is a dramatization of the process, as the robot’s AI performed this step multiple times, working through many solutions as the content gradually emerged on the canvas.
It is important to note that this step of the process is lightly curated by me. While the robot independently came up with and worked through multiple images, I would check on it every once in a while to throw out images that did not work or were too abstract. I am actively developing an Auto-Curation algorithm for this step, but it isn’t that good yet and remains a work in progress. This algorithm did get a little better as I tuned it for this painting, but it isn’t quite autonomous yet. Hopefully I will soon figure out something that works well to reliably decide when a neural portrait is complete.
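How the eyes are used as anchor points is not detailed, so the following is only one plausible reading. Given the two eye positions in a face image and the two eye positions found on the canvas, there is exactly one combination of rotation, uniform scaling, and shifting that maps one pair onto the other. The sketch computes it with complex numbers. The function name is hypothetical and the real warp may be more elaborate.

```python
def eye_alignment(src_left, src_right, dst_left, dst_right):
    """Return a function that maps (x, y) points from the source face
    into the destination image so that both eyes land on their targets.
    Treating points as complex numbers, the map is z -> a * z + b."""
    s1, s2 = complex(*src_left), complex(*src_right)
    d1, d2 = complex(*dst_left), complex(*dst_right)
    a = (d2 - d1) / (s2 - s1)   # rotation and scale combined
    b = d1 - a * s1             # translation

    def transform(point):
        z = a * complex(*point) + b
        return (z.real, z.imag)

    return transform

# Source eyes one unit apart and level; target eyes two units apart
# and stacked vertically, i.e. a 90 degree turn at double size.
place = eye_alignment((0, 0), (1, 0), (10, 10), (10, 12))
midpoint = place((0.5, 0))
```

Applying `place` to every pixel position of a face would stamp it into the mosaic with its eyes lined up on the chosen anchors.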
At this point the painting underwent its final major artistic shift. The most interesting part about this is that this step was NOT curated by me. The transformation in the final stages of this painting, as dramatic as it was, was made completely by my robots and their neural networks. It was done with my own version of style transfer that I call a Style Mash-Up, which is basically a style transfer with one content image and two style images, so don’t be too impressed.
If you look back to the very early stages of this process, you will remember that I ran an algorithm that selected the source images that were most like the color palette that GumGum provided. Remember this de Kooning and O’Keeffe?
Here is what it looks like when my deep learning style mash-up tries to reimagine an image of this portrait by combining the de Kooning and O’Keeffe.
For the last couple days of this painting, my robots attempted to paint fluctuating combinations of these two styles allowing the portrait to slowly emerge.
How did my robots know when they were finished?
This is perhaps one of the most important decisions made by any artist, and most difficult to program because it often comes down to a matter of taste. Problem is my robots do not have taste so they had to rely on math. Their decision to finish was actually really simple and is something I call the I’ve Done My Best Algorithm. Remember that with every stroke an image is taken of the canvas and a heatmap featuring the difference between the canvas and trace image is calculated. Over the course of the painting this heatmap becomes lighter and lighter as it approaches minimal error. At some point the robot can no longer reduce the error, however, no matter how many additional brush strokes it adds. The total error simply stops going down. After the robot has been painting for a set period of time and notices that the painting isn’t becoming more like the trace image, it concludes that it has done its best and stops painting.
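The stopping rule amounts to detecting a plateau in the total error. Here is a minimal sketch, assuming the error is summed over the difference map after each stroke. The minimum stroke count, window size, and tolerance are invented parameters, since the text does not give the values the robots use.

```python
def total_error(diff_map):
    """Sum of absolute differences over the whole difference map."""
    return sum(abs(v) for row in diff_map for v in row)

def done_my_best(error_history, min_strokes=500, window=100, tolerance=0.0):
    """True once the robot has painted long enough and the best error in
    the most recent window is no better than in the window before it."""
    if len(error_history) < max(min_strokes, 2 * window):
        return False
    recent = min(error_history[-window:])
    earlier = min(error_history[-2 * window:-window])
    return earlier - recent <= tolerance

# Error falls steadily for 400 strokes, then stops improving.
history = [1000 - i for i in range(400)] + [600] * 200
finished = done_my_best(history)
```

Comparing the best value in each window, rather than the latest value alone, keeps a single unlucky stroke from ending the painting early.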
Understanding that my robots had done their best, it was then my turn to teach them how to be better. At the completion of their 13,396 strokes, I took over and cleaned up the image by applying the last couple dozen touch ups. This was done by tracing over the final trace image on a tablet. One of my robot arms would then apply these strokes to the canvas.
Some critics are concerned that I step in at various points in the artwork and help the robot out, but I think that such critiques miss the point of using AI to create art. Even students of art school have professors looking over their shoulders and occasionally helping out. That doesn’t invalidate their art. With my curation and interventions, I am teaching the robots to paint more like an artist.
I touched on this briefly earlier, but every brush stroke is recorded and used by the robot to analyze the difference between how it paints autonomously and how I paint. A comparison of these two types of brushstrokes will help my machines paint more like me and less like a robot.
Satisfied with our collaboration, I make the final decision that the artwork is complete. Interestingly, I am learning that this is the only decision that really matters when creating a piece of art.
Pindar Van Arman