Machine Learning & Improvisation
--
The following post is a redux of my part of a recent panel I did with the physicist Stephon Alexander and the sculptor Saint Clair Cemin. Thanks to Stephon and Saint Clair for their brilliant ideas and guidance.
“No leaf ever wholly equals another, and the concept “leaf” is formed through an arbitrary abstraction from these individual differences, through forgetting the distinctions…”
- Friedrich Nietzsche, “On Truth and Lie in an Extra-Moral Sense”
This quote raises the question:
When information is lost, what is gained?
To answer this question, let’s look at how an Artificial Neural Network (a kind of machine learning system) learns a concept such as “leaf.”
A Restricted Boltzmann Machine (RBM) is a kind of Artificial Neural Network, a mathematical model that to some extent imitates behaviors we can observe in biological neurons.
An RBM observes real-world patterns and tries to create a lower-dimensional representation of those patterns.
We can also stack multiple RBMs on top of one another to reduce the dimensionality of the patterns even further. One common architecture for stacked RBMs is called a Deep Belief Network (DBN).
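Here’s a rough sketch of that stacking idea, using scikit-learn’s BernoulliRBM. The layer sizes and the random data standing in for leaf images are just illustrative placeholders, not the actual dataset or architecture used for the animations below:

```python
# Minimal sketch: stacking two RBMs so each layer compresses the previous one.
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.RandomState(0)
# Stand-in for 1,600 flattened, normalized leaf images (100 species x 16 each),
# here assumed to be 64x64 = 4096 pixels per image.
X = rng.rand(1600, 4096)

# First RBM: compress raw pixels down to 256 hidden units.
rbm1 = BernoulliRBM(n_components=256, learning_rate=0.05, n_iter=10, random_state=0)
H1 = rbm1.fit_transform(X)

# Second RBM: compress the first layer's activations down to 64 units.
rbm2 = BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=10, random_state=0)
H2 = rbm2.fit_transform(H1)

print(X.shape, "->", H1.shape, "->", H2.shape)  # (1600, 4096) -> (1600, 256) -> (1600, 64)
```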
In our case, we will try to learn about the concept of a “leaf” by looking at images of leaves. So the patterns here will be patterns within the pixels that constitute images of leaves.
To train (or teach) the RBM about leaves, we show it many images. In the animation below, we will train a DBN on leaf images from 100 species, using 16 images per species.
The goal of the training process is to produce lower-dimensional representations that can then be used to “reconstruct” approximations of the original images.
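One simple way to get such a reconstruction is to push the hidden activations back through the same weights that produced them. scikit-learn’s BernoulliRBM doesn’t ship a decoder, so this sketch does it by hand; it’s an illustration of the idea, not the exact pipeline behind the animations:

```python
import numpy as np
from scipy.special import expit  # logistic sigmoid

def reconstruct(rbm, X):
    """Encode images to hidden activations, then map them back to pixel space.

    The RBM's weights are reused as a decoder, so the result is a smoothed,
    "remembered" approximation of the input rather than an exact copy.
    """
    H = rbm.transform(X)                                          # hidden probabilities
    return expit(H @ rbm.components_ + rbm.intercept_visible_)    # visible probabilities

# X_recon = reconstruct(rbm1, X)   # same shape as X, values in [0, 1]
```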
You can think of this as a kind of compression algorithm, improvised in relation to the neural network’s experience.
The network compresses information by finding component patterns across many example images. It can then use these component patterns as building blocks to describe the whole.
This is an efficient way to store information because it means we don’t need to hold onto every detail of every image. We can use a more general vocabulary derived from all of the images to describe each particular image.
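A back-of-the-envelope comparison makes the point. The image and layer sizes here are the assumed ones from the sketch above, not the real ones:

```python
# Storage comparison: every pixel of every image vs. per-image codes plus a
# shared "vocabulary" of component patterns.
n_images, n_pixels, n_components = 1600, 64 * 64, 256

raw = n_images * n_pixels                                   # 6,553,600 numbers
compressed = n_images * n_components + n_components * n_pixels
#            per-image codes         + shared component patterns = 1,458,176 numbers

print(raw, compressed)
```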
The reconstruction of an image through this process is not exact.
But that’s what’s interesting about it!
Notice how the approximation changes as we go to deeper layers of the neural network:
And notice what happens when we reconstruct partially obfuscated images:
This is somewhat like our minds filling in the missing pieces of a face that has been partially occluded by some other object such as a telephone pole.
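One way to do this fill-in with an RBM is to run a few Gibbs sampling steps while clamping the pixels we actually observed. A rough sketch, assuming the rbm1 model from earlier (scikit-learn’s gibbs method performs a single visible–hidden–visible step):

```python
import numpy as np

def fill_in(rbm, image, known_mask, n_steps=50, seed=0):
    """Fill in occluded pixels by repeated Gibbs steps, clamping known pixels.

    `image` is one flattened image with values in [0, 1]; `known_mask` is a
    boolean array of the same shape, True where the pixel was actually seen.
    """
    rng = np.random.RandomState(seed)
    v = image.copy()
    v[~known_mask] = 0.5                               # start the occluded region at gray
    for _ in range(n_steps):
        sample = rng.rand(*v.shape) < v                # binarize the visible units
        v = rbm.gibbs(sample.reshape(1, -1)).ravel().astype(float)
        v[known_mask] = image[known_mask]              # clamp the pixels we did observe
    return v

# filled = fill_in(rbm1, occluded_image, known_mask)   # hypothetical inputs
```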
Representing experiences through a shared set of component patterns means that we don’t have to treat each experience as entirely separate from, or incomparable to, the others.
It allows us to fill holes by borrowing from other experiences.
It allows us to make substitutions and speculate about combinations we’ve never directly experienced.
It allows us to dream!
In the animation below, another learning algorithm called t-Distributed Stochastic Neighbor Embedding (t-SNE) is learning to represent the similarities and differences between the individual leaves in a two-dimensional map.
By spatializing the relationships between the leaves we’ve experienced in this way, we can speculate on what other leaf shapes might be possible, despite having never directly experienced them. We can imagine the leaf that might exist at any position on this “leaf space” map.
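A minimal sketch of building such a map with scikit-learn’s TSNE, assuming the hidden codes H2 from the stacked RBMs above (any per-leaf feature vectors would do):

```python
from sklearn.manifold import TSNE

# Project each leaf's hidden code down to an (x, y) position on the map.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
coords = tsne.fit_transform(H2)

# coords can now be plotted: nearby points are leaves the network finds similar,
# and the empty regions between them suggest leaf shapes it has never seen.
```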
Returning to our original question:
When information is lost, what is gained?
A loss of information makes the world parsable.
It makes art and communication possible.
It allows us to speculate and synthesize.
In the images below, Google’s image recognition neural network has been forced to speculate on its own speculations.
Like a Xerox of a Xerox, producing fantastical worlds.