Visualizing Neural Networks In Virtual Space

“You guys may be surprised to hear this, but I didn’t really get into tech until college.”

The voice belonged to Cassie Tarakajian, Research Resident at DBRS Innovation Labs. We were walking with Labs director Amelia Winger-Bearskin underneath the Brooklyn-Queens Expressway. The roar of the traffic above commingled with the chalky sound of wheels on asphalt as a pack of middle-aged skaters cut in front of us on their boards.

I was a little bit surprised to hear that Tarakajian was not born a technophile. She is, after all, an accomplished software artist who holds a degree in electrical engineering from Johns Hopkins and speaks regularly at various venues and events in the New York City arts and tech community. (I first heard her speak at Pioneer Works for Software for Artists Day, where she was a resident.) Her current project at the Innovation Labs is to produce a visualization of a convolutional neural network using a virtual reality headset. Not exactly kids’ stuff.

But the reason this memory stuck with me so clearly has less to do with its content than it does with its context. Thanks to the enclosure created by the highway overpass and the sudden awareness of our surroundings that sprang into being when we had to dodge full-grown men on skateboards, I have a clearly delineated frame of reference for that experience. The memory lives within the boundaries of its own world, and when I reach back to retrieve it I have a spatial framework where I can find it.

This is called spatial memory, and it has everything to do with Tarakajian’s project at the Innovation Labs. By setting out to visualize a neural network in virtual space, she is exploiting some of the same brain functions that allow us to navigate new spaces and form mental models of them.

“A lot of my research for this project has centered around UX design and how people understand and interact with space,” she told me. “Humans exist in three-dimensional space, and therefore we think in three-dimensional space. It is also a form of external memory. When we order items in space, we don’t have to remember which object comes first in the sequence. This is why people are more productive when they have multiple monitors, for instance. This is also why a lot of applications move from window to window…”

Cassie Tarakajian tests her virtual environment (Image: Fletcher Bach)

Anatomically, that reason is the hippocampus, the part of the brain responsible for spatial navigation as well as certain types of emotional regulation. The hippocampus governs the brain functions that allow us to form abstractions about a space’s configuration and remember events (as we know from cases in which a patient’s hippocampus had to be removed, leaving him unable to form new memories for facts and events).

Virtual reality would be impossible without these abstractions in which space and memory are fused. As a medium, VR depends on the user’s ability to connect the implied contours of an imaginary environment to her memories of how space really functions.

I was not surprised to learn what Tarakajian has been reading lately: Italo Calvino, Jorge Luis Borges, Haruki Murakami, Magical Realism — all literature that delights in creating imaginary spaces and blurring the distinction between memory and fantasy. She told me that she had been “considering VR as a storytelling medium… designing stories as experiences, and what sort of impact that could have.” I imagined Calvino’s Marco Polo strapping an Oculus headset onto Kublai Khan and booting up a Unity sketch of his travels.

Even though her initial interest in VR came from the world of fiction and storytelling, Tarakajian is experimenting with the medium’s relationship to narrative to imagine alternative use-cases for it. “I’m looking at existing VR projects and asking myself if VR can be used for anything other than films and games… All VR is is an interface to 3D space. There’s no reason it’s constrained to [narrative] storytelling.”

Tarakajian’s current project is an exploration of VR technology as an educational tool. “What if we could visualize neural networks in a different way? We have a number of graphs and so forth that make sense to people who already understand how neural networks work.” For example, Adam Harley’s project is another 3D visualization of a neural network. Tarakajian cites Harley’s project as “the foundation” of her own, but worries that it might be a bit inaccessible to neural net novices. “What if you wanted to explain [a neural network] to someone who knows nothing about them?”

For those readers who know nothing about how neural networks function: they are computational algorithms modeled after the human brain that are able to “learn” with minimal human intervention. “Neural networks are beautiful because they don’t require a human-engineered feature set,” says Jamis Johnson, Machine Learning Scientist at the DBRS Innovation Labs. “They have layers that will find higher and higher levels of abstraction.”

A convolutional neural network identifying the content in a picture (Image: Fletcher Bach)

In the context of neural networks, the word “abstraction” refers to a hierarchy of matrices that the network generates in the hidden layers between the data that humans feed in and the result that it returns. A network trained for image recognition (like the one that Tarakajian is using) takes in information about the pixels of an image file and identifies patterns in small clusters of pixels at a time. The set of patterns that the computer finds constitutes its own layer, which is then subjected to the same kind of analysis as the layer beneath it. In this way the network builds its own working definitions: contrasting pixels become edges, edges become lines, and lines gradually assemble into higher- and higher-level features until the computer is able to recognize the contours of the human face, a trend in the stock market, or, in Tarakajian’s example, the shapes of numerals.

“Neural networks are the new hotness in Machine Learning,” Johnson told me later. They are already in wide use doing work like image recognition, language processing, and pattern analysis on an enterprise scale in a wide variety of industries. And engineers are finding more and more uses for them every day.

Machine Learning techniques such as neural networks are poised to automate a huge number of tasks in the near future. A study out of Oxford University estimated that 47% of jobs in the United States are at high risk of automation within a generation, although other experts predict this number to be much higher.

(Image: John Farrell)

As machines develop increasingly sophisticated capabilities, it becomes even more critical for humans to be aware of how they operate. “It’s important for the public to understand how all these different algorithms work,” Tarakajian said. “They’re not that complicated. You can have a high level understanding without being a developer or a scientist.”

Still, the exact way they function is a little bit difficult to explain. Personally, I had to speak to a half dozen machine learning scientists and consult scores of online sources before I arrived at my own highly provisional understanding of how neural networks operate. I’m sure you are smarter than I am, but I bet you wouldn’t mind if someone like Tarakajian were able to make the process more like taking a factory tour and less like pulling an all-nighter in Borges’ Library of Babel.

Tarakajian’s project works with a convolutional neural network called LeNet. A convolutional neural net is a type of feed-forward network modeled after the visual cortex of an animal. It is defined by a set of layers, with the outputs of one layer connected to the inputs of the next, each of which performs a different step in the image classification process.
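To make the idea of chained layers a little more concrete, here is a minimal Python sketch that tracks how the size of an image changes as it is handed from one layer to the next. The layer list is an assumption: it borrows the classic LeNet-5 layout published by Yann LeCun and colleagues (a 32x32 input, 5x5 filters, 2x2 pooling), and the text above does not say which variant the project actually uses. The helper names `conv_out`, `pool_out`, and `walk` are hypothetical and exist only for this illustration.

```python
# Hypothetical shape walk through a LeNet-5-style stack of layers.
# The layer sizes are an assumption taken from the classic published
# layout, not from Tarakajian's project.

def conv_out(side, kernel):
    """Side length after sliding a kernel x kernel filter (stride 1, no padding)."""
    return side - kernel + 1

def pool_out(side, window):
    """Side length after non-overlapping window x window pooling."""
    return side // window

# (kind, filter or window size, number of output feature maps)
LAYERS = [
    ("conv", 5, 6),
    ("pool", 2, None),
    ("conv", 5, 16),
    ("pool", 2, None),
]

def walk(side, channels, layers):
    """Return the (channels, height, width) shape after every layer."""
    shapes = [(channels, side, side)]
    for kind, size, out_channels in layers:
        if kind == "conv":
            side = conv_out(side, size)
            channels = out_channels  # each filter produces one feature map
        else:
            side = pool_out(side, size)  # pooling keeps the channel count
        shapes.append((channels, side, side))
    return shapes

shapes = walk(32, 1, LAYERS)
for shape in shapes:
    print(shape)
```

Each layer's output becomes the next layer's input, and the image shrinks as the number of feature maps grows. In the published LeNet-5 design, the final 16 maps of 5x5 values are then passed to fully connected layers that end in ten outputs, one per digit.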

An illustration of the process by which convolutional neural networks form layers of abstraction. (Image: Fletcher Bach)

The network that Tarakajian has been using in her project has been trained on the MNIST dataset, a collection of handwritten digits. The network is exposed to thousands of examples of numerals in various handwriting and is gradually able to abstract an “understanding” of the defining characteristics of each number.

“You have some input images — ” Tarakajian explained, using her hands to invoke a spatial metaphor on the surface of the table, “ — in this case the inputs are images of the numbers — and then in the convolutional layer, you have some smaller matrix that you’re sliding across. You’re taking out a portion of the image, and you’re transforming it with another matrix, which is called a filter, to extract features. The next layers pool these features, and figure out which features the image contains the most of. According to a CNN, a number is classified by a set of features.”
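The sliding-and-pooling process Tarakajian describes can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not her implementation: the 5x5 "image" and the `vertical_edge` filter are hand-picked (a real network learns its filter values during training), max pooling is assumed because the description does not name a pooling method, and the function names `convolve2d` and `max_pool` are hypothetical.

```python
# Toy convolution and pooling, assuming stride 1, no padding, and max pooling.

def convolve2d(image, kernel):
    """Slide the kernel across the image; at each position, multiply the
    overlapping values and add them up. (Strictly this is cross-correlation,
    which is what CNN libraries usually call convolution.)"""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Keep only the strongest response in each non-overlapping size x size block."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# A crude 5x5 drawing of the numeral 1: a single vertical stroke.
image = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

# Hand-picked filter that responds where a dark pixel sits left of a bright one.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)  # 4x4 map of filter responses
pooled = max_pool(feature_map)                  # 2x2 summary of the strongest responses
print(feature_map)
print(pooled)
```

The feature map is positive on the left side of the stroke and negative on the right, and pooling keeps only the strongest response in each region. A real network runs many such filters in parallel and classifies the numeral by which combination of features responds.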

An illustration of the LeNet convolutional neural network (image from the website of machine learning pioneer Yann LeCun)

Users of Tarakajian’s VR visualization are able to draw their own numbers with their hands (the movement of their fingers is captured by a Leap Motion sensor) and then watch in real time how the algorithm analyzes their writing. Users can view the neural network from different angles and change the view to see all of the network’s layers at once or zoom in on a specific layer.

“All of these neurons activate in the same way they do in your brain. If there is a high enough electric potential across the neuron it will fire.” I am straining to keep a mental model of the neurons in my brain. I try to imagine my own visual cortex, and keep my eyes from glazing over.

“In the case of artificial neural networks, the output of a neuron is zero until an input threshold is reached, and then the output is positive. While it may be hard to think about how this simple building block can be made into something as complicated as an image classifier, think about silicon transistors. Their function at a low level is also simple: they act as gates, either open or closed, allowing the flow of current or blocking it, based on some input. We interact with the complex systems they create on a daily basis, on our phones, computers, and other devices.”
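The threshold behavior described here fits in one small function. The sketch below assumes a rectified linear unit (ReLU), which is one common way to get an output that stays at zero until the input crosses a threshold; the quote does not say which activation the project's network uses, and the weights, bias, and the name `neuron` are made up for illustration.

```python
# A single artificial neuron, assuming a ReLU-style activation.

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, clipped at zero.
    The output is 0 until the weighted sum exceeds -bias, then rises with it."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, total)

# Illustrative values only: with a bias of -1.0, the weighted sum must exceed 1.0
# before the neuron produces any output.
weights = [0.5, 0.5]
bias = -1.0

quiet = neuron([0.5, 0.5], weights, bias)   # weighted sum 0.5, below the threshold
firing = neuron([2.0, 2.0], weights, bias)  # weighted sum 2.0, above the threshold
print(quiet, firing)
```

With these numbers the first call returns 0.0 and the second returns 1.0. A layer is simply many of these units reading the same inputs with different weights.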

I understand this in the most abstract possible way, but I want to see it, interact with it. By this point I am thoroughly convinced of the pressing need for the kind of work that Tarakajian is doing. “It’s a way of translating abstract computer concepts into an interface that feels more natural, sort of like how you would naturally inspect an object.”

Tarakajian’s project in action (Image: Cassie Tarakajian)

At this point the project is more useful as an educational tool, but in the future it might develop into a new way for businesses to supervise the training of the algorithms that power their products. The Leap Motion sensor allows the user to interact with the visualization as if it were a 3D touch screen showing which neurons are firing at what points of the network. One possible use would be adjusting the weight of different parts of the algorithm in real time.

When I spoke with the rest of the DBRS Innovation Labs team about Tarakajian’s project, Winger-Bearskin brought up an interesting point I had not considered: “Machine Learning will become a common methodology for doing all sorts of things, from diagnosing illnesses to financial analysis. We hope in time projects like Cassie’s will help to bridge the gap between those who understand how neural networks work and those who are authorized to license those methodologies as valid in various fields.”

“In the past we would have the algorithms as human-engineered features,” Johnson explained, “but now they are machine learned features. A convolutional neural network automatically extracts its most important features.”

“These are quickly becoming the most powerful models,” added Jen Rubinovitz, Machine Learning Scientist at the Labs. “But they’re unsupervised. With visualizations like Cassie’s you can start abstracting those rules and be able to intervene in the algorithms.”