Understanding How Machines Learn, Through Prototyping
by Big Tomorrow | January 23, 2017
Machine Learning — This is the second article in a larger series exploring the intersection of design and existing artificial intelligence technology through experiments, prototypes and concepts. We believe this is a critically important topic for the design community and beyond, so we’re sharing what we learn along the way.
Let’s start by getting something out of the way: we’re not machine learning experts — we don’t publish research about new algorithmic breakthroughs and we’re not especially good at math. But we’re curious about what to do with all the machine learning capability already floating around out in the world, and we’re bullish about how far a ‘good enough’ understanding can often take you.
So how might non-experts begin to play with machine learning? Is there a way for designers to develop an intuition about its opportunities and constraints through direct experience? Shortly before the New Year we resolved to sort this out. We made a quick video about what we found —
If you watched the video, you’re now decently familiar with The Brain, a prototype by designer Mike Matas that can learn to recognize drawings and associate them with emojis. Here’s a demo of it on YouTube.
Mike’s prototype, a great example of how machine learning can be simply articulated, reveals some core machine learning principles without descending into jargon, and it’s built in a familiar prototyping tool called Quartz Composer. All encouraging, but he doesn’t provide the full picture of how it works. Inspired by his work, we wanted to better understand what he’d done and set out to see how far we could get reverse engineering it.
To recreate The Brain, we first had to tease apart its three key components:
- Drawing Tool — Mike provides a drawing tool so users can teach The Brain how to recognize different types of drawings.
- Brain — The Brain itself is powered by a neural network, a machine learning approach that roughly mimics how our brains work.
- Perception — In order for The Brain to learn what smiles or frowns look like, it needs to be able to see drawings.
We then had to figure out how exactly these components function and problem solve ways to rebuild each one. Framer was our go-to choice for prototyping because its accessibility, flexibility, and supportive community allow you to quickly experiment and borrow others’ work.
Drawing software usually works by capturing the x,y position of a cursor, stylus, or finger during a pressed state and adding those coordinates to the screen as individual dots or points in a path.
Since Framer prototypes are web-based, we thought perhaps we could accomplish this by generating SVG paths. We weren’t sure exactly how to do it though, so we looked for existing work in Framer’s Facebook Group that might help. Thankfully we stumbled upon a drawing tool by Callil Capuozzo that we were able to modify to serve our purposes — more on how we did that when we get into the code below.
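As a rough sketch of the idea (the function and variable names here are our own illustrations, not Callil’s actual code): each pointer event appends an x,y pair to a stroke, and the points are joined into an SVG path’s d attribute.

```javascript
// Sketch: turn captured pointer coordinates into an SVG path string.
// Names are illustrative stand-ins, not taken from Callil's drawing tool.
function pointsToPath(points) {
  if (points.length === 0) return "";
  const [first, ...rest] = points;
  // "M" moves to the first point; each "L" draws a line to the next one.
  return `M ${first.x} ${first.y} ` + rest.map(p => `L ${p.x} ${p.y}`).join(" ");
}

const stroke = [{ x: 10, y: 20 }, { x: 12, y: 24 }, { x: 15, y: 26 }];
pointsToPath(stroke); // "M 10 20 L 12 24 L 15 26"
```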
After examining a number of candidates, we settled on the awesome Brain.js library by Heather Arthur — though it’s probably the least flexible of all the ones we looked at, it has the easiest-to-understand interface:
- First you structure and label input data (i.e. the things you want to train the neural network to understand).
- Next you do some light configuration of your neural network’s inner workings and feed the data into its input layer so that nodes within hidden layer(s) can learn to recognize patterns — a deep neural network is where you use multiple hidden layers to look for patterns in patterns.
- Lastly, you run test data (i.e. the things you want the neural network to interpret) back through the trained network and the hidden layer(s) determine the probability the data resembles something in the possibility space of patterns it learned via an output layer.
We were able to easily pull Brain.js into our prototype with the Node Package Manager (NPM). Before we could start training it, though, we needed to figure out how to get the Brain to see drawings…
The overarching mechanics of the Drawing Tool and Brain components of Mike’s prototype were fairly self-evident, but how the Brain sees drawings was less obvious. We deduced two possible ways:
- Pixel Investigation — Placing a fixed-size boundary box around each drawing allows you to iterate through every pixel within that box and assign values based on the color found within each one. You then add these values sequentially to an array giving you drawing data to start manipulating.
- Path Investigation — Since x,y coordinates of every drawing are already captured while we draw, that path data might be sufficient for the neural network to find meaningful patterns.
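A minimal sketch of the Pixel Investigation idea — the grid values here are stand-ins for sampled colors, with 1 marking an “inked” pixel and 0 an empty one:

```javascript
// Sketch of Pixel Investigation: walk every cell of a fixed-size box
// row by row and flatten the values into one array the network can read.
const box = [
  [0, 1, 0],
  [1, 1, 1],
  [0, 1, 0],
];
const pixelData = [];
for (const row of box) {
  for (const value of row) pixelData.push(value);
}
// pixelData is now [0, 1, 0, 1, 1, 1, 0, 1, 0]
```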
After the drawing data is captured either through Pixel or Path Investigation, it then needs to be normalized so the neural network can make sense of it. We wrote some custom functionality to address a few constraints that come with Brain.js:
- Mapping Values — Pixel data is captured either as RGB values that look something like [r: 0, g: 0, b: 0], or x,y coordinates that look something like [x: 200, y: 133], [x: 201, y: 135], … But the neural network only understands values between 0 and 1, so we had to map ranges and flatten values.
- Pruning Data — Since drawings are user-generated, they can be lots of different sizes. This is problematic for the neural network, because it can only understand arrays of the same length. We reconciled this issue by taking the length of the shortest drawing and pruning data from the other drawings to match. The challenging part here was to prune data that wouldn’t corrupt the neural net’s understanding of a drawing’s form (i.e. instead of just cutting a bunch of data off the end of arrays, we selectively removed data from different parts of each one).
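Mapping a value from one range onto 0–1 is a one-liner. Here’s a sketch of the idea — our own helper, not Mike’s code, with the canvas size below picked purely for illustration:

```javascript
// Map a value from [min, max] onto [0, 1] — the network only accepts 0–1 inputs.
function normalizeValue(value, min, max) {
  return (value - min) / (max - min);
}

// RGB channels range 0–255...
normalizeValue(51, 0, 255);  // 0.2
// ...and x,y coordinates range over the canvas size, e.g. 0–400.
normalizeValue(200, 0, 400); // 0.5
```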
We tried both Pixel Investigation and Path Investigation, and decided to proceed with Paths after they gave us better results. Once the Drawing Tool, Brain, and Perception components were worked out, it was time to stitch everything together.
Making It Work
Download the prototype to follow along. To run it in Framer Studio, you’ll need to import Brain.js using NPM — instructions below. From here on out, the word ‘layer’ refers to the Framer ‘layer’ construct, not to be confused with neural network layers discussed earlier.
We’re going to walk through our prototype step-by-step as though creating it from scratch, but let’s zoom out for a moment to get a high-level view of its structure:
- Import Brain.js Library (Brain)
- Drawing Tool Setup (Drawing Tool)
- Data Parsing Setup (Perception)
- Capturing Teach Data (Perception)
- Capturing Play Data (Perception)
- Neural Net Setup (Brain)
- Training Neural Net with Teach Data (Brain)
- Running Neural Net with Play Data (Brain)
Ok, let’s dig in.
1. Import Brain.js Library
Start by creating a new Framer project and saving it as brainproto.framer on your desktop. Then open Terminal (fear not, we’re basically just copying & pasting) and follow the Brain.js install documentation to retrieve the library with NPM — if you’ve downloaded the project, you’ll need to do this as well.
npm install brain
Note: Don’t have NPM installed? Easy to do with Homebrew. Instructions here. You may need to have the most current version of Xcode to install Node.js on OSX (even though we won’t be using Xcode at all — go figure).
Once that’s straight, go into the project’s folder structure and use a text editor to create a file called npm.coffee in the Modules folder. Framer’s documentation instructs you to add a single line of code, save, and close it.
The project’s folder structure should now look like this —
Return to the Framer prototype and import Brain.js after a refresh.
We’ll finish configuring the neural network and start training it once the Drawing Tool and Perception components are set up.
2. Drawing Tool Setup
A few parts of the Drawing Tool have starting states to plug in.
Even though more data is generally better with machine learning, through trial and error we discovered our neural network only needs about four training drawings to learn what a smile, frown, or tear looks like. Better programmers would probably create a function to let the user train as many drawings as they want, but we’ll just hard-code a few arrays to capture the bare minimum data needed to complete our task.
Add some visual elements to build out the graphical user interface.
The prototype should now look something like this —
This is the part where we borrow some of Callil’s Drawing Tool code.
Almost ready to start scribbling. Just need to make sure the data captured while drawing is converted into something the neural network can understand.
3. Data Parsing Setup
We already mapped the values. Let’s make sure the arrays we’re feeding the neural net are all of equal length. To accomplish this, we’re going to wait until the user has finished all the training drawings and then find the one with the shortest array length — that’s our standard. We’ll then prune select points from all the other drawing arrays until they match the standard. Solution’s a little ugly, but it gets the job done!
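Here’s a sketch of that pruning step — our own implementation of the idea, not the exact prototype code. Rather than chopping points off the end, it samples evenly across each array so the drawing’s form survives:

```javascript
// Prune a drawing's points down to targetLength by sampling evenly
// across the whole array, preserving the overall shape of the stroke.
function prune(points, targetLength) {
  const step = points.length / targetLength;
  const pruned = [];
  for (let i = 0; i < targetLength; i++) {
    pruned.push(points[Math.floor(i * step)]);
  }
  return pruned;
}

// The shortest drawing sets the standard; all others get pruned to match.
const drawings = [[1, 2, 3, 4, 5, 6, 7, 8], [9, 8, 7, 6]];
const shortest = Math.min(...drawings.map(d => d.length)); // 4
drawings.map(d => prune(d, shortest)); // [[1, 3, 5, 7], [9, 8, 7, 6]]
```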
4. Capturing Teach Data
Now to use all the nice functions we set up. Tell teachDraw to listen for the user to start drawing in the bottom half of the screen. Then we’ll simultaneously draw the path as an SVG and parse captured x,y coordinates in preparation for neural net training.
When a drawing’s finished, we check where it is in our training flow.
How do we do this? Each time a new training drawing finishes, we increment a counter so data from the user’s next training drawing will land in the next empty array in faces. This allows the prototype to track how many of the 12 training drawings the user has finished and change the guide text and emoji accordingly. When data for all 12 training drawings has been captured, we execute our normalize() function to prune training drawing arrays to the same length.
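The bookkeeping amounts to something like this — variable names are our own stand-ins, and the structure of faces is simplified for illustration:

```javascript
// Sketch of the teach-flow bookkeeping: 3 emotions × 4 drawings each = 12.
const faces = { happy: [], sad: [], cry: [] };
const labels = ["happy", "happy", "happy", "happy",
                "sad", "sad", "sad", "sad",
                "cry", "cry", "cry", "cry"];
let drawingCount = 0;

function onDrawingFinished(points) {
  faces[labels[drawingCount]].push(points); // fill the next empty slot
  drawingCount += 1;                        // advance the training flow
  if (drawingCount === labels.length) {
    // all 12 drawings captured — time to prune arrays to equal length,
    // e.g. by calling normalize() here
  }
}
```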
5. Capturing Play Data
Once we train the neural network, we want the user to be able to create test drawings and see if it can recognize them. Mike’s prototype gives the user its most confident result in the form of an emoji that pops up after the drawing’s finished.
To recreate this effect, process the test data using the same functions we applied to the training data. Then set up an animation to fire near the position where drawing press is released — the neural network will tell the animation what emoji to show later.
6. Neural Net Setup
We finally have all our training and test data — let’s configure the neural network, train it, and run/test it. We have to feed all our training data to Brain.js in one go, so take all the populated and cleaned training arrays from faces and label them. This allows the neural net to interpret frown drawings as ‘sad’, smile drawings as ‘happy’, and tear drawings as ‘cry’.
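Concretely, flattening faces into the labeled structure Brain.js expects looks something like this — the point values below are placeholders for normalized coordinates:

```javascript
// Sketch: turn the pruned drawings in `faces` into labeled training examples.
const faces = {
  happy: [[0.1, 0.2], [0.12, 0.19]],
  sad:   [[0.8, 0.9], [0.82, 0.88]],
  cry:   [[0.4, 0.6], [0.41, 0.58]],
};

const data = [];
for (const label of Object.keys(faces)) {
  for (const drawing of faces[label]) {
    // Each example pairs one drawing's points with its emotion label.
    data.push({ input: drawing, output: { [label]: 1 } });
  }
}
// data[0] is { input: [0.1, 0.2], output: { happy: 1 } }
```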
Now that all our training data is labeled and contained in data, we’ll follow the Brain.js documentation to create our neural network and set up a function to train it. To be perfectly honest, we’re not entirely sure how exactly all the parameters impact training — you can play with learningRate to adjust your neural network’s performance.
Note: Training can be computationally expensive, so tweak parameters conservatively. For example, if you set iterations to be > 100, training time may take more than a minute — serious neural networks can take days to train. When we tried training pixel data, the input datasets were huge and it locked up our CPU. With the path data that’s currently configured, though, we’re training pretty small amounts of data, so there shouldn’t be cause for concern.
7. Training Neural Net with Teach Data
Create a button that will appear when the user’s training drawings are finished — clicking it will initiate training. The user will get text feedback when the neural network has finished training and then be prompted to create test drawings.
8. Running Neural Net with Play Data
Make a function that will pass the data captured from test drawings into the neural net so it can compare this to what it’s learned. The net will return a probability that each new drawing resembles a smile, frown, or tear. Populate emojLyr with the emoji that corresponds to the highest-confidence drawing type before it pops up.
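Picking the winner is just a max over the returned probabilities. The result object below is a made-up example of what a trained net might return:

```javascript
// Sketch: the trained net returns a probability per label; pick the largest.
const result = { happy: 0.91, sad: 0.05, cry: 0.04 }; // made-up example output
const emojiFor = { happy: "😊", sad: "☹️", cry: "😢" };

const best = Object.keys(result).reduce(
  (a, b) => (result[a] >= result[b] ? a : b)
);
const emoji = emojiFor[best]; // the emoji to pop up near the drawing
```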
The playIt() function gets executed when a test drawing’s press is released — we already set that up, which means we’re pretty much done!
For best results, we suggest drawing fairly slowly using a desktop browser. Be aware that you only get a single stroke for each drawing. Also, try packing as much variation as possible into your four training drawings — the more variation the neural network sees, the larger the possibility space of what it understands.
Lastly, we didn’t normalize for absolute position of x,y coordinates (admittedly kinda dumb, but we’ve only got so much time to noodle), so if all your test drawings are done in the same place you might not get great results. Instead, vary the position of test drawings left-to-right like so —
As mentioned, this article is part of an ongoing series we’re doing on artificial intelligence and its impact on design practice. Check out our first installment —
We’re looking for collaborators with interest and background in the application of artificial intelligence to human-centered design, and we’re happy to facilitate any conversations on the topic. You can reach us at firstname.lastname@example.org. If you’re having any trouble with the prototype feel free to get in touch.
Video by Matt Herald, Drew Stock, and Christian Mulligan. Prototype and tutorial by Drew Stock. Thanks to Mike Matas for his original prototype. Thanks also to Koen Bok, Kenneth Adorisio, and Nicolas Arcolano for their feedback.
Big Tomorrow is a design consultancy based in Austin and San Francisco.
We help organizations solve complex challenges by building experiences that improve how people live, work, learn, and play. We’re design thinkers and doers who uncover opportunities, accelerate growth, and deliver meaningful results. Think we can help? Get in touch.