#Deepdream is blowing my mind
TL;DR version: It’s not the aesthetics.
Having a little bit of understanding of what a deep artificial neural network is and how it works might help make this article a bit clearer. I try to explain it in non-technical terms here. If you’re not familiar with the territory, feel free to have a quick skim through it before coming back here.
First off I should make it clear that I really have nothing to do with the #deepdream #inceptionism thing that’s been going around lately.
I have tweeted about it, and made a few images and videos, but they’re all just using the fantastic research and code provided by Alexander Mordvintsev, Christopher Olah and Mike Tyka. They have a nice explanation of #deepdream with some stunning images on their blog post.
And it’s blowing my mind!
But I should also point out: it’s not the aesthetic that’s turning me on.
The aesthetic is interesting. It’s trippy, surreal, abstract, psychedelic, painterly, rich in detail. But the novelty is likely to wear off quickly for most, except the especially dedicated. Using new datasets or learning to control the output (so it’s not just puppy-slugs) can undoubtedly give it a boost. And there is potential to do very interesting conceptual work pairing datasets with seed images. But it’s not the aesthetic that excites me.
Instead, like I said recently…
…the poetry behind the scenes is blowing my mind!
And I will try to explain why...
In non-technical terms here’s what’s happening in #deepdream:
- An artificial neural network (the AI’s ‘brain’) has already been trained on over a million images
- We show the trained network a brand new image
- While the network is processing this new image, we take a snapshot from a particular group of neurons inside the network
- We feed that new snapshot image back in, i.e. show it to the network (We can optionally apply small transformations, like zooming in etc.)
(If interested, see here for a more detailed non-technical technical explanation)
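The four steps above can be sketched in code. This is only a toy stand-in, not the actual #deepdream implementation: the real thing uses a deep convolutional network (trained on over a million images) and image-space transforms, whereas here the “network” is a single hypothetical random projection, and the “snapshot” is one small gradient-ascent step on the chosen neurons’ activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: a "trained" network. Real #deepdream uses a deep CNN;
# this fixed random projection is a toy stand-in.
W = rng.standard_normal((32, 64))   # 32 "neurons" looking at a 64-pixel "image"

# Step 2: show the network a brand new image.
image = rng.standard_normal(64)

# Step 3: pick a group of neurons and take a "snapshot": modify the input
# so that it amplifies the activity in that group (a small gradient-ascent
# step on the squared activations of the chosen neurons).
def snapshot(x, group, lr=0.1):
    a = W @ x                             # the network processes the image
    grad = W[group].T @ a[group]          # gradient of 0.5*||a[group]||^2 wrt x
    return x + lr * grad / (np.abs(grad).max() + 1e-8)

# Step 4: feed the snapshot back in as the next input (in real #deepdream one
# can also apply a small transform, like zooming, between iterations).
group = [3, 7, 11]
dream = snapshot(image, group)

a0 = float(np.sum((W @ image)[group] ** 2))
a1 = float(np.sum((W @ dream)[group] ** 2))
print(a1 > a0)   # the chosen neurons now fire more strongly
```

Repeating step 4 with the new image as input is what produces the runaway amplification described below.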
The poetry is blowing my mind at every step of the process…
1. Recognition
When an artificial neural network receives an input such as an image, it tries to make sense of it based on what it already knows. The image data flows through the network, ‘activating’ neurons. Effectively the image is ripped apart and scanned for features that the network recognizes. This can be thought of as asking the network “Based on what you already know, can you see anything here that you recognize?”.
This of course is how we make sense of the world. It’s no different to asking us to recognize objects in clouds or ink-blot / Rorschach tests. But it’s not just visual. We try to frame everything we see, hear, learn within the context of what we already know, and we build on top of that. This can be purely visual, like seeing faces in clouds. Or it can be more critical, as it affects how we learn, make decisions, construct theories or develop prejudices based on the limited knowledge that we have. If we don’t have sufficient information, the assumptions we make are likely to be incorrect, as are the decisions we make as a result of them.
2. Confirmation Bias
When the network is processing the image, some of these recognitions might be weak firings within the network. These weak neural firings can be thought of as an almost sub-conscious-level “I think I see a little bit of a lizard-like texture over here, perhaps something that resembles a bridge over there”. But if these are very weak signals in the deep layers, they’ll dissipate within the network and won’t elevate to higher layers, or influence the final output.
But in the case of #deepdream, we choose a particular group of neurons dedicated to detecting particular features — e.g. those which respond to lizard like features — and we take a snapshot image, from inside the network, inside the AI’s brain. Whatever features a particular group of neurons respond to, will be exaggerated in the snapshot of those neurons. (NB. Technically speaking, by ‘take a snapshot’ I mean we choose a group of neurons and we modify the input image such that it amplifies the activity in that neuron group. See my other article for more non-technical technical info).
This snapshot shows what that particular group of neurons is imagining.
When we feed that snapshot back into the network as a new input, the network recognizes those features with more confidence, because those patterns in the new image are now stronger, so those same neurons fire more strongly. And when we take another snapshot of the same neurons, and feed that in, it becomes even stronger. What was an initial “maybe I see inklings of little lizard-like features over here” on a deep sub-conscious level, starts to become “yea, I think they might be lizard-like features”, to “oh definitely, that’s a lizard-skin puppy-slug” at a well-defined, visible high level.
This creates a positive feedback loop, reinforcing the bias in the system, building confidence with each iteration, and transforming what were subtle, unnoticeable trends deep within the network into strong, visible, defining biases.
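This feedback loop can be sketched minimally, again assuming a hypothetical one-layer “network” (a fixed random projection) in place of a real trained CNN: each iteration nudges the image toward whatever the chosen neuron group responds to, and the group’s response grows with every pass.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for one layer of a trained network (real #deepdream
# amplifies activations deep inside a trained convolutional network).
W = rng.standard_normal((16, 32))

group = [2, 5]                   # the "lizard detector" neurons we pick
x = rng.standard_normal(32)      # the input image

confidence = []                  # how strongly the chosen neurons fire
for _ in range(20):
    a = W @ x                    # the network processes the current image
    # "snapshot": nudge the image toward what these neurons respond to
    grad = W[group].T @ a[group]
    x = x + 0.05 * grad / (np.abs(grad).max() + 1e-8)
    confidence.append(float(np.sum(a[group] ** 2)))

# each pass reinforces the previous one: a positive feedback loop
print(confidence[0] < confidence[-1])
```

With each iteration the objective (the squared activation of the chosen group) can only grow, which is exactly the runaway “maybe a lizard… definitely a lizard” escalation described above.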
This is almost like asking you to draw what you think you see in the clouds, and then asking you to look at your drawing and then draw a new image of what you think you are seeing in your drawing. And repeating this.
But that last sentence was not even fully accurate. It would be accurate, if instead of asking you to draw what you think you saw in the clouds, we scanned your brain, looked at a particular group of neurons which we know responds to a particular pattern, then we reconstructed an image based on the firing patterns of those neurons, and gave that image to you to look at. And then we scanned the same neurons again to produce a new image and showed you that etc.
The critical difference is, if we’d asked you what you saw in the clouds, we’d be representing the final conscious decision you made regarding what you saw. Whereas by scanning and extracting the snapshot from a group of neurons, we’re preying on and amplifying a particular thread of thought to create a strong bias. Like an indoctrination on a neurological level.
This of course is analogous to so many aspects of how our mind functions already. We see the world through the filter of a biased mind. A product of our upbringing, everything we've ever seen or learnt, the culture in which we live or come from. We project this bias onto everything we perceive, and if we’re not very careful, everything we perceive will in turn reinforce the very bias that shaped it.
The face in the clouds looks more and more like a face the more we think it’s a face. The shadow in the alley looks more and more like a mugger the more afraid we become. The image of the Virgin Mary on a piece of toast is more tangible the more we want to believe in it. The more convinced we are of a certain hypothesis, the more inclined we are, subconsciously or not, to find that every piece of evidence confirms that hypothesis.
3. Pareidolia
Interestingly, even if you don’t agree with my previous points, you've probably already confirmed them. If you see a human- or ape-like face in the image below, or [bird, slug, reptile, worm, puppy, sloth]-like creatures, then you have just demonstrated it. Perhaps you see something else? Something I can’t see? Then you've confirmed my point even more strongly.
There are no faces, birds, slugs, reptiles, worms, puppies, sloths in the image above.
Your mind is projecting those meanings, trying to recognise patterns based on what it already knows, what it’s been trained on. Neurons in your brain stimulated by different features of these abstract shapes are trying to make sense of what you’re seeing and frame it in context of something familiar. (NB. Also see apophenia and pareidolia.)
Just like the #deepdream neural network.
You’re looking into a mirror of your own mind.
4. Completion of the cycle
Even more interestingly, remember that these images generated by the #deepdream process are not what the network is seeing on a high level. These images are extracts from inside the network. Abstract representations from the depths of its memory. Snapshots from inside the AI’s brain.
These were weak neural firings that we artificially amplified. Without us interfering, these firings might not have even elevated to higher levels, and would have remained latent in the network. On a higher level the AI might not even be aware of these features.
But then you find patterns in these images. You project meaning onto this noise, the abstract representations extracted from inside the hidden depths of the network. In recognizing these forms, you are confirming what the #deepdream neural network thinks but doesn't know that it’s seeing.
You are completing this cycle of recognition in your mind.
5. Acknowledgement of Self-fulfillment
A final word. I might be wrong. All of the above might be complete rubbish and you might think I'm reading too much into it. But in my mind this is what I see. It’s what I believe. It makes complete sense to me.
But maybe I am wrong. Maybe I'm making some incorrect assumptions somewhere, maybe I'm ignorant, or maybe I'm making downright logic errors, and due to something weird in my brain it’s just not computing the information correctly. Maybe there’s some kind of bias that’s distorting my view. A bias that’s there because of everything that I’ve learnt and been subject to in my life so far. So maybe I am wrong, and maybe because this is what I already believe, I'm projecting it onto what I see here. Then this discussion is just reinforcing this very bias to further confirm everything that I just stated, in an infinite positive feedback loop where I can feel every step of the process being mirrored in my own mind as the neurons in my brain spike like crazy as they recognize the confirmation bias of the process reinforcing the hypothesis and projecting back onto my speculations which match exactly the so-called predictions to complete the cycle and when I think about how that relates to everything that I've just said…
…the poetry is blowing my mind.
Have a nice day.