Machine Learning for Visualization

Let’s Explore the Cutest Big Dataset

Ian Johnson
Sep 28, 2018

This is a transcript of the talk I gave at OpenVisConf 2018:


Data visualization is about exposing patterns to the eye.

We are always seeking ways to tap into deeper patterns.

Patterns that feel distinctly human


Patterns we humans can recognize but can’t articulate to a computer


And patterns we didn’t even think to look for

[image: computer, show me only cute owls]

When exploring a new dataset we have various tools in our analysis and visualization toolkit: averages and summary statistics, line charts and histograms, and an ever-expanding catalog of custom visualizations.

[image: some bl.ocks made with d3.js and other tools]

Now I want to direct your attention to a relatively new set of tools that can change the way we explore large datasets.

[image: t-SNE all the things!]

These tools use Machine Learning to pull out patterns for us and give us new ways to navigate our data.

I’d like to demonstrate these techniques on my favorite dataset, Quick, Draw!

[image: Quick, Draw!]

If you haven’t had a chance to play the game, the rules of Quick, Draw! are pretty simple. The game asks you to draw a word, and you try to get an AI to guess the word from your drawing.


When the Google Creative Lab built Quick, Draw!, they had the foresight to save anonymized copies of the drawings, altering the course of my life forever. At this point millions of people across the globe have played, and Google has open-sourced 50 million of the drawings they created. This means we have, on average, more than 100,000 drawings for each of the 300 words in the game to explore.

[image: some of the 300+ words in the dataset]

The Data

Let’s take a close look at the data, what it is and isn’t.

[image: what, when and where]

This dataset makes for a great demonstration because it's fun, but it is also representative of many serious datasets. It has categorical data, such as which word is being drawn and which country the drawing originated from. We also have a few time-related dimensions, such as how long the drawing took to draw (duration) and a timestamp of when the drawing was made.
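For reference, the open-sourced drawings ship as newline-delimited JSON. A minimal sketch of pulling out the fields above might look like this; the sample record is simplified, and the field names follow the public export:

```python
import json

# One record, inlined here so the snippet runs without downloading anything.
# In the export, each drawing is a list of strokes; each stroke is a pair of
# [x...], [y...] coordinate lists.
sample = json.dumps({
    "word": "owl",
    "countrycode": "US",
    "timestamp": "2017-03-08 21:02:50 UTC",
    "recognized": True,
    "drawing": [[[10, 20, 30], [5, 15, 25]], [[40, 50], [5, 10]]],
})

def parse_drawing(line):
    """Pull out the fields we care about: the word, the country, the strokes."""
    record = json.loads(line)
    return {
        "word": record["word"],
        "country": record["countrycode"],
        "strokes": record["drawing"],
        "n_strokes": len(record["drawing"]),
    }

d = parse_drawing(sample)
print(d["word"], d["n_strokes"])  # owl 2
```

In practice you would iterate over every line of a per-word `.ndjson` file the same way.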

[image: how]

We also have the sequence of points that make up the drawing. It is this sequence of pen strokes that carries most of the meaning in this dataset; the strokes capture the way we as humans represent abstract concepts across the globe.

They are also the most difficult to dissect with traditional data visualization techniques.

Data Visualization

Just because something is difficult doesn't mean it can't be done. Since the dataset was released, several amazing projects have applied various techniques to surface interesting patterns in the data.

How Long Does it Take to (Quick) Draw a Dog? by Jim Vallandingham

[image: breaking down the data by complexity]

This project explores questions around complexity and quality by utilizing stroke counts and drawing durations. Some quite interesting observations can be surfaced by interactively browsing the summary statistics of these attributes.

[image: On average ducks take longer to draw than flamingos. also, owls are always cute]

Note that the dimensions visualized here are the count of strokes and the duration of the drawing, both of which reduce the sequence of strokes to a single number. These numbers give us a manageable way to peek into the data, but they can't capture all of the features of the drawings by themselves.
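Reducing the strokes to those two scalars is simple enough to sketch. Here's a minimal version, assuming the raw export's stroke format of `[x, y, t]` arrays with per-point millisecond timestamps:

```python
# Two scalar summaries of a drawing, as used in Vallandingham's piece:
# stroke count and total drawing duration.

def stroke_count(drawing):
    """A drawing is a list of strokes, so its complexity proxy is just len()."""
    return len(drawing)

def duration_ms(drawing):
    """Time from the first point of the first stroke to the last point of
    the last stroke (assumes each stroke is [xs, ys, ts] in milliseconds)."""
    start = drawing[0][2][0]
    end = drawing[-1][2][-1]
    return end - start

drawing = [
    [[0, 10, 20], [0, 5, 5], [0, 30, 60]],  # stroke 1: three points
    [[20, 30], [10, 10], [400, 450]],        # stroke 2: two points
]
print(stroke_count(drawing), duration_ms(drawing))  # 2 450
```

Everything about the shape of the strokes is thrown away, which is exactly the limitation noted above.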

How Do You Draw a Circle? by Nikhil Sonnad

[image: We can look not just at what we draw, but how we draw]

This article takes a deep dive into simple shapes: the authors determine whether each circle is drawn clockwise or counterclockwise, which lets them count this feature across the dataset.
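The article doesn't publish its code, but one plausible way to compute that clockwise/counterclockwise feature is the shoelace formula over the circle's points:

```python
# Drawing direction via the shoelace formula. With y pointing up, a positive
# signed area means the points run counter-clockwise; note that screen
# coordinates (y pointing down) flip the sign.

def signed_area(xs, ys):
    """Twice the signed area of the polygon through the points."""
    n = len(xs)
    return sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
               for i in range(n))

def is_counter_clockwise(xs, ys):
    return signed_area(xs, ys) > 0  # y-up convention

# a very rough "circle": a square traversed counter-clockwise (y up)
xs, ys = [0, 1, 1, 0], [0, 0, 1, 1]
print(is_counter_clockwise(xs, ys))              # True
print(is_counter_clockwise(xs[::-1], ys[::-1]))  # False
```

Counting this boolean per country is then a simple aggregation.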

[image: Highlighting cultural phenomena]

They can then visualize this feature to communicate some understanding about cultural differences across the world.

“There are countless ways that we subtly, unconsciously carry our cultures with us: the way we draw, count on our fingers, and imitate real-world sounds, to name a few. That’s the delight at the heart of this massive dataset.”

Forma Fluens by Visual AI Lab @ IBM Research

[image: A is for Average]

This project takes a number of interesting approaches to visualize the data. One in particular is the use of visual averages to highlight cultural patterns.

[image: oh crap I forgot my converter]

Visual averages work by drawing thousands of faint, transparent drawings on top of each other, surfacing the dominant patterns. This works especially well when we filter drawings by country, where cultural patterns can emerge.
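A visual average is essentially accumulation. Here's a toy sketch of the idea, counting how often stroke points land in each cell of a coarse grid; a real renderer would instead draw translucent strokes with a low alpha, but the density it builds up is the same:

```python
# Accumulate many drawings into one shared grid; overlapping strokes build
# up density, which is what the faint-transparent-overdraw trick shows.

GRID = 8  # a coarse grid for illustration; the drawings are really 256x256

def accumulate(drawings, grid=GRID):
    counts = [[0] * grid for _ in range(grid)]
    for strokes in drawings:
        for xs, ys in strokes:
            for x, y in zip(xs, ys):
                # clamp so out-of-range points don't crash the toy example
                counts[min(y, grid - 1)][min(x, grid - 1)] += 1
    return counts

# two toy "drawings" that share the point (1, 1)
a = [[[1, 2], [1, 2]]]
b = [[[1, 3], [1, 3]]]
counts = accumulate([a, b])
print(counts[1][1])  # 2 -- the shared cell is twice as dense
```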

Visual Averages by Country by Kyle McDonald

Kyle McDonald took the concept of visual averages to an extreme in this epic tweetstorm.

[image: where is the soft-serve?]

He makes excellent use of small multiples to compare patterns across several categories.

[image: finding nemo]

These certainly give us some interesting points to reflect on, but it’s hard to dig deeper since all of the nuanced patterns get washed out by the averaging.

[image: we can’t really see anything when we average yoga. countries: USA, Korea, Germany, Brazil]

So what if we had a way to capture the nuance lost by averages, to automatically find the interesting features in the strokes and dissect the data by more than one dimension at a time?

Machine Learning

Enter Deep Neural Networks. These aren’t magic, but they do have some amazing capabilities, and as it turns out, we have just such a network trained on the Quick, Draw! dataset. It’s called sketch-rnn.

[image: You can play with this yourself over at the sketch-rnn demo page]

While it’s super fun to play with the network and come up with creative applications for a drawing machine, what’s exciting for us as data visualizers is the patterns it has had to encode in order to generate its drawings.

So how do we get at these patterns?

One way to do it is to ask the network how probable it thinks a given input drawing is, as Colin Morris did in his Bad Flamingos article.

[image: The Treachery of Machine Learning. Flamingos? ¯\_(ツ)_/¯]

On top we see flamingos the network thinks are highly likely, and on the bottom are flamingos it thinks are very unlikely. This gives us an interesting lens to look at the data through, but it still reduces all of the data to a single dimension. That’s a problem, because some of the most interesting depictions of flamingos are mixed in with drawings that are clearly not flamingos.

[image: what if we wanted to find bad-ass flamingos?]

We would like a broader view of the data, and we can get one once we understand a little more about how the network operates. Sketch-rnn belongs to a family of neural networks called auto-encoders, which find ways to “compress” input data into a smaller representation that can then be used to generate new output.

[image: The encoder takes in a drawing and compresses it into a latent vector]

The network is composed of two parts, an encoder network that tries to find a way to represent the data in much fewer dimensions than the input, and a decoder network that tries to accurately reconstruct the original data using only the encoded data.

[image: The decoder takes the latent vector as input and outputs a new (very similar) drawing]

We call the encoded data a latent vector, and it’s the key to unlocking our technique.

[image: latent vectors]

We can extract the latent vector for each drawing from the network, which gives us a way to compare the drawings numerically.

[image: similar faces have similar latent vectors]

When we compare them, we see that similar latent vectors mean similar drawings. In our network, a latent vector is 128 numbers, which is still kind of a lot to deal with. So we need a way to compare many high-dimensional data points with each other.
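To make “similar latent vectors mean similar drawings” concrete, here’s a tiny sketch of comparing vectors with cosine similarity. The three vectors are made up for illustration and much shorter than sketch-rnn’s 128 dimensions:

```python
import math

# Cosine similarity: 1.0 means the vectors point the same way, so a
# nearest-neighbour search in latent space finds look-alike drawings.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# hypothetical latent vectors for two smiley faces and one frowny face
smiley_1 = [0.9, 0.1, 0.3]
smiley_2 = [0.8, 0.2, 0.3]
frowny   = [-0.7, 0.9, 0.1]

# the two smileys should sit closer together than smiley and frowny
print(cosine_similarity(smiley_1, smiley_2) > cosine_similarity(smiley_1, frowny))  # True
```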

[image: A map of all our faces]

Luckily there is a wonderful algorithm called t-SNE which is very helpful for visualizing similarities in high-dimensional data. It isn’t a silver bullet, but it offers us a very interesting way to explore our data. Here each drawing is represented by a small translucent yellow dot; the algorithm places similar drawings close to each other to create this two-dimensional map.
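Running t-SNE itself is nearly a one-liner with scikit-learn. The random matrix below just stands in for real 128-dimensional latent vectors so the snippet runs on its own:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 128))  # stand-in for 100 sketch-rnn latent vectors

# perplexity roughly controls the neighbourhood size t-SNE tries to preserve;
# it usually needs tuning per dataset
xy = TSNE(n_components=2, perplexity=30, init="pca",
          random_state=0).fit_transform(X)

print(xy.shape)  # (100, 2) -- one (x, y) map position per drawing
```

Each row of `xy` becomes one translucent dot on the map.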


We can zoom in on a small piece of this map, and see a group of similar drawings.


To our human eyes, the patterns here are fairly clear: namely, the eyes and smile.


Let’s look at an entirely different cluster.


We can see that this cluster highlights a pretty different pattern, which is kind of sad.

[image: the goofy dimension]

Let’s get back to the idea of studying the complexity of drawings. Instead of using proxies for complexity like number of strokes or duration of drawing we can examine complexity directly.

[image: cat map]

Here is one way to draw a simple cat, all of these have a single stroke:


Here is another way to capture the essence of catness with simplicity, though this time with 3 strokes. In both cases you could put these in front of a young child and they would say that’s a cat!


Now we depart from metrics and get into the humanities:


Here are some approximately equally complex cats, but clearly we are looking at whiskers and not smiles:


We don’t need to stop there, we can have it all!


So now we are navigating through a space that is much richer than a single dimension.

Let’s revisit the problem of averaging a concept like yoga.

[image: yoga map]

The problem with averages is that they assume a normal distribution with a single mode. What we will see is that there are several modes when representing yoga, starting with the different poses.

[image: don’t forget to breathe]

The way people draw even a single pose varies:


And the way people give up:


At this point I want to pause and step back from our particular dataset of drawings and make sure we’re clear on the two things going on here.

The first is that t-SNE is a general technique for visualizing high-dimensional data.

[image: Visualizing Data using t-SNE by Laurens van der Maaten]

The second thing is that Neural Networks can work on all kinds of data. In this figure, taken from Chris Olah’s amazing article Visualizing Representations: Deep Learning and Human Beings, paragraph vectors are visualized with t-SNE to surface topics in Wikipedia articles.


So using Neural Networks to find patterns and t-SNE to visualize them is a good idea in general.


A more mathematical term for the high-dimensional landscape created by our neural network’s internal representation is the “latent space”. We can think of t-SNE as helping us draw a map of this space.


Much like a 2D map can never truly represent our 3D globe, a 2D t-SNE map won’t be able to show us everything that’s happening in higher dimensions.


But it can still be a very helpful way to navigate as we explore.


Here we’ve sampled a single drawing from each grid cell and opacity indicates the number of drawings in that cell.
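That grid view can be sketched as a binning step: assign each 2D point to a cell, sample one drawing per cell, and let the cell count drive opacity. The coordinates below are synthetic stand-ins for t-SNE output:

```python
import random

def grid_sample(points, grid=4, seed=0):
    """Bin 2D points (assumed in [0, 1)) into grid cells; return one sampled
    point index per cell, plus the cell's count to drive opacity."""
    random.seed(seed)
    cells = {}
    for i, (x, y) in enumerate(points):
        cell = (int(x * grid), int(y * grid))
        cells.setdefault(cell, []).append(i)
    return {cell: (random.choice(ids), len(ids)) for cell, ids in cells.items()}

# three toy map positions: two neighbours and one loner
points = [(0.1, 0.1), (0.15, 0.12), (0.9, 0.9)]
sampled = grid_sample(points)
print(sampled[(0, 0)][1], sampled[(3, 3)][1])  # 2 1
```

The denser cell would be drawn more opaque, which is how crowded regions of the map stay visible.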

[image: simpler smiley faces]
[image: faces with long hair, short hair or no top]

Let’s briefly bring back the idea of averaging by country codes.


With this view we can instead filter down by country code. We can take a quick look at Japanese power outlets:


If we zoom in, we see the predominant representation of a “Type A” outlet with two vertical holes. Unlike with the averages, we also get to see some fun outliers, which seem to have understood “power lifting” instead of “power outlet”.

[image: left: standard “type A” plugs, right: power lifting]

Let’s look at another word, octopus, and revisit the idea of complexity.


We can filter our map to only octopi with one stroke, and sample from those areas. It’s probably easy to imagine drawing an octopus in one go like these.


If we instead filter to all of the complicated octopi, defined by having more than 14 strokes:


We find a really fun cluster.


Conclusion

The different ways in which people draw are like different notes, the harmonics of a word and the clusters we’ve explored are the result of thousands of strangers harmonizing together.

[image: namaste]

Thanks

[image: tiger drawings randomly assigned to thankees]
