Understanding Deep Learning with TensorFlow Playground

Andrew T
3 min read · May 10, 2018


(Check out my machine learning web app: Observation.ai)

The TensorFlow Playground can be used to illustrate that deep learning uses multiple layers of abstraction.

Or, more precisely, from a Nature review of deep learning:

Deep-learning methods are representation-learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level.

First, notice that blue represents +1, orange represents -1, and white represents 0.

Let’s start with the default classification example. There are 4 datasets.

The four datasets: circular, 4 quadrants, 2 clusters, and a swirl

The datasets all have 2 input features and 1 output label. The 2 input features, X1 and X2, are represented by the coordinates. X1 is the horizontal axis and X2 is the vertical axis. You can infer that from the feature inputs below.

Graph of input features: X1 and X2

The output label is the color of the dots, blue (+1) or orange (-1).

Features: X1 and X2, the horizontal and vertical axes. The label: blue (+1) or orange (-1) dots.
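To make that layout concrete, here is a minimal NumPy sketch of a circle-style dataset; the radii and variable names are my own choices, not the playground's exact generator:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Sample points at random angles and distances from the origin.
radius = rng.uniform(0.0, 5.0, size=n)
angle = rng.uniform(0.0, 2.0 * np.pi, size=n)
X1 = radius * np.cos(angle)     # horizontal axis
X2 = radius * np.sin(angle)     # vertical axis
X = np.stack([X1, X2], axis=1)  # shape (n, 2): the two input features

# The label is the dot color: blue (+1) inside, orange (-1) on the outer ring.
y = np.where(radius < 2.5, 1.0, -1.0)
```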

The first 3 datasets can be solved with the default settings of 2 hidden layers. However, the 4th, the swirl dataset, cannot. When you click the play button, a neural network is actually trained right in your browser. The background color of the output changes from light shades (representing 0) to blue and orange patterns that illustrate what the network will predict for new input. A rough code sketch of this default setup follows the results below.

Circle dataset solved in 100 epochs
4-quadrant dataset converges on an accurate prediction in 111 epochs
2-cluster dataset converges on a solution
Swirl dataset doesn't converge on a solution with the default settings; test loss is 0.416 after 3,509 epochs
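Here is a rough Keras sketch of that default setup, not the playground's actual in-browser code; I'm assuming its defaults of tanh activations, hidden layers of 4 and 2 neurons, and a 0.03 learning rate:

```python
import tensorflow as tf

# Approximation of the playground's default network: two tanh hidden
# layers (4 and 2 neurons) on the raw X1, X2 inputs, tanh output in [-1, 1].
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation="tanh", input_shape=(2,)),
    tf.keras.layers.Dense(2, activation="tanh"),
    tf.keras.layers.Dense(1, activation="tanh"),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.03), loss="mse")

# X and y come from the dataset sketch above; the first 3 datasets
# separate within roughly 100 epochs.
model.fit(X, y, epochs=100, verbose=0)
```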

So how can we get the swirl dataset to converge on a solution?

One way is to do feature engineering: add new input features. You take the raw inputs and square them, multiply them together, and take their sines, then feed the expanded features into a shallow neural network. This represents classical machine learning with hand-crafted features; a sketch of the expansion follows the figure below.

Multiple features
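As a minimal NumPy sketch (the helper name engineer_features is my own, hypothetical), this is the expansion; the seven columns match the feature checkboxes the playground exposes:

```python
import numpy as np

def engineer_features(X):
    """Expand raw (X1, X2) columns into the playground's candidate features."""
    X1, X2 = X[:, 0], X[:, 1]
    return np.stack([
        X1, X2,             # raw inputs
        X1 ** 2, X2 ** 2,   # squared inputs
        X1 * X2,            # interaction term
        np.sin(X1),         # sine of each input
        np.sin(X2),
    ], axis=1)

# Feed the richer features into a shallow network instead of raw X.
X_rich = engineer_features(X)
```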

Another approach is to use deep learning. You don't cleverly manipulate the input features (a process that gets much more difficult on less trivial examples). You use only the raw inputs, X1 and X2, but add more layers.

Deep 6-layer network converges on the swirl dataset with only the X1 and X2 features
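A Keras sketch of the deep approach, under the same assumptions as before; the six hidden-layer widths below are illustrative, not the playground's exact configuration:

```python
import tensorflow as tf

# Same raw inputs X1 and X2, but six tanh hidden layers instead of two.
deep_model = tf.keras.Sequential(
    [tf.keras.layers.Dense(8, activation="tanh", input_shape=(2,))]
    + [tf.keras.layers.Dense(8, activation="tanh") for _ in range(5)]
    + [tf.keras.layers.Dense(1, activation="tanh")]
)
deep_model.compile(optimizer="adam", loss="mse")

# With enough epochs, the extra depth lets the network carve out the
# spiral from the raw coordinates alone.
deep_model.fit(X, y, epochs=1000, verbose=0)
```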

Notice that in this example we start with the raw input and build multiple layers of increasing abstraction. The first layer of neurons learns simple patterns that detect boundaries; they mostly cut the background in half at various angles of rotation. Moving to the right, the patterns become more complicated: the first layers detect edges, the middle layers detect circles from the edges, and the last layers detect swirls from the circles. It creates…

a representation at a higher, slightly more abstract level

Inspired by a comment on Hacker News.
