Explain like I’m 5 — Machine learning 😎

Burak Aslan
6 min read · Jun 24, 2016


Photo by Hannah Wei https://unsplash.com/photos/aso6SYJZGps

How exactly does a machine learn?

We hear the term “machine learning” everywhere. Whether it’s on the news or from friends, it truly has become a buzzword for our generation. But what exactly is machine learning and why should you care? How complicated is it?

Tech companies like Twitter and Facebook use machine learning to create better experiences for their users (such as showing more content from your closest friends). Yet many college students of our generation are confused about what this term actually means in the computer science community.

Why should I care?

Machine learning is a subfield of artificial intelligence that builds algorithms that allow computers to learn to perform tasks from data instead of being explicitly programmed. Well, that’s a mouthful. Let’s think about it this way. I want to make a website that sells clothes. I have programmed it to show women’s clothing first for female users and men’s clothing first for male users. I also want to program it to show summer clothing first for people currently living in a warmer climate. As you can see, the number of rules I have to write by hand grows quickly with every new case. Let’s look at our options so far.

Decision tree of a machine learning algorithm

Obviously this is not the best approach, because as we cater more and more to our user base, the number of decisions the website has to hard-code keeps growing.
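To make this concrete, here is a rough sketch in Python of the hand-coded rule approach; the user fields and category names are made up for illustration, and every new thing we learn about a user means another branch to write and maintain:

```python
# A sketch of the hand-coded rule approach (the fields and category names are
# made up): every new thing we know about a user is another branch to maintain.
def pick_featured_category(user):
    if user.get("gender") == "female":
        return "women_summer" if user.get("climate") == "warm" else "women_all"
    if user.get("gender") == "male":
        return "men_summer" if user.get("climate") == "warm" else "men_all"
    # ...and we haven't even touched age, budget, past purchases, or trends yet.
    return "default"

print(pick_featured_category({"gender": "female", "climate": "warm"}))  # women_summer
```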

This is where machine learning comes in. We can train a computer to make these choices on its own! The ability to learn from examples is what makes this possible, and translating that ability to machines is a huge step towards making them more intelligent. In fact, machine learning is the area making most of the progress in artificial intelligence today, which is why it’s such a trendy topic right now.

Is there anything machine learning can’t do?

Alright, so it’s not as easy and amazing as it sounds. Machine learning still has its limits. We can’t build intelligent machines like Skynet from Terminator or HAL 9000 from 2001: A Space Odyssey. However, there are plenty of real applications where machine learning works well and keeps improving as more data is collected.

Let’s talk about the real applications of such an abstract idea. It’s actually being used all the time in popular software.

Take Facebook for example (I hope you know what that is): people upload tons of photos every day. The algorithm that automatically detects your face or the faces of your friends is a machine learning algorithm. It learns from the photos you manually tag and learns to recognize features. You give the algorithm a picture and it tells you whose faces it recognizes in it.

Most machine learning problems take the following form: given some input X, what is the correct output Y? There are two broad families, supervised learning and unsupervised learning. I am only going to talk about supervised learning, where we show the program a bunch of example inputs along with the correct answer to each of them (e.g. the Facebook image tagging).

The process of showing these examples to the program is called “training”. You show them to the program over and over again until it becomes good enough at predicting the right answer for inputs it hasn’t seen yet. The process of checking whether the program has become good enough is called “testing” or “evaluation”. And we call the program a “model”.
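To give you a feel for what that loop looks like in practice, here is a minimal sketch using Python and scikit-learn; the built-in digits dataset just stands in for whatever examples you have, so treat it as an illustration rather than a recipe:

```python
# A minimal sketch of the train / test / model loop described above, using
# scikit-learn; the digits dataset just stands in for "your examples".
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)            # inputs X and the correct answers y

# Hold some examples back so we can test on things the model has never seen.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000)      # the "model"
model.fit(X_train, y_train)                    # "training": examples + correct answers

print("accuracy on unseen examples:", model.score(X_test, y_test))  # "testing"
```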

The biggest challenges developers face in this process are how to obtain accurate training data and how to design the inputs, outputs, and internal structure of the model so that it forms a good solution to the problem we are trying to solve.

Photo by AP/Jeff Christensen http://bit.ly/1uzsHBN

Here is an example model training scenario. Let’s say I want to make a program that can tell if Guy Fieri is in a photo. The first step for us as humans would be to define what features distinguish a photo of Guy Fieri. These same features are what the algorithm will look for when assessing an image.

Some questions could be “Does it have frosted tips and/or bleached hair?”, “Does it have orange skin?”, or even “Does it have a goatee?”. These are all visual features that indicate Guy Fieri is in a photo. The issue with a traditional machine learning model is this: it’s really hard for us as humans to define the right set of features. If you asked me what a “photo with Guy Fieri” looks like, I would say it probably has bleached hair, orange skin, etc… until a photo like this comes up.

Photo by rodan44 http://chzb.gr/28UMxWq

Obviously this is not Guy Fieri on the right, but how could we still tell it’s not? What “features” did our brain use to distinguish Guy Fieri from that possum? If we can’t even figure out what features are useful, how can we define good features for the model to use?
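Just to show what the traditional route forces us to write, here is a hypothetical sketch in Python; every detector in it is a stub standing in for image-analysis code a human would have to design and implement by hand:

```python
# The traditional route: a human invents the features. Each detector below is a
# hypothetical stub standing in for real image-analysis code we'd have to write.
def has_frosted_tips(photo):   # stub; a real version would analyze hair pixels
    return photo.get("frosted_tips", False)

def has_orange_skin(photo):    # stub; a real version would analyze skin tone
    return photo.get("orange_skin", False)

def has_goatee(photo):         # stub; a real version would look for facial hair
    return photo.get("goatee", False)

def extract_features(photo):
    """Turn a photo into the hand-picked features a classifier would learn from."""
    return [has_frosted_tips(photo), has_orange_skin(photo), has_goatee(photo)]

# The possum problem: it can pass the features we thought of, yet it isn't Guy Fieri.
print(extract_features({"frosted_tips": True, "orange_skin": True, "goatee": False}))
```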

This is where traditional machine learning approaches start to struggle, and it is exactly where deep learning (a subset of machine learning itself) shines.

The magic of deep learning is that we don’t have to define these features anymore. We just need to build an empty “brain” that’s structurally “smart” enough that it can automatically figure out which features are useful and how to use them to make predictions.
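As a rough illustration, that “empty brain” might look like the sketch below, written with Keras; the image size, layer widths, and training data are all made up. Raw pixels go in, and the layers learn their own features during training instead of us hand-coding them:

```python
# A rough sketch (Keras; the image size, layer widths, and data are all made up)
# of an "empty brain": raw pixels in, learned features and a prediction out.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(128, 128, 3)),           # raw pixels, no hand-made features
    layers.Conv2D(16, 3, activation="relu"),    # these layers learn their own features
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),      # probability that Guy Fieri is in the photo
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(photos, labels, epochs=10)          # hypothetical labeled photos
```

Notice that nothing in the sketch mentions frosted tips or orange skin; the structure is generic, and training is what specializes it.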

What else can machine learning do?

Another application of machine learning is Natural Language Processing, or NLP. Natural languages are the languages humans use to communicate with each other. Everything from the suggestions Google shows as you start typing to the search results themselves uses machine learning. It understands language by converting text into vectors (sorry for bringing you back to linear algebra 😭).

Think of a word vector as a list of N numbers. N is a size chosen when the model is built; popular English word embeddings typically use a few hundred dimensions (often around 300), and each number captures some aspect of how the word is used. These vectors carry semantic meaning; semantic here refers to the contexts in which a word tends to be used.

Consider the three following sentences:

  1. Messi scored a goal
  2. Ronaldo misses a penalty
  3. Burak misses his algorithms class

The explicit, almost brute-force, method would be to check which sentences have the most words in common. The word “misses” appears in 2 and 3, so those sentences must have more similar context than the others, right? Wrong.

Our brains automatically understand that the first two sentences are about soccer and the last one is about me being stupid. That is where NLP vectors shine. The vector for Ronaldo will be much closer to the vector for Messi than to the vector for Burak. So when we measure the similarity between the sentences using their vectors, we find that 1 and 2 are the most contextually similar.
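Here is a toy illustration in Python: the three-number “vectors” below are made up (real word vectors have hundreds of dimensions), but they show how a standard similarity measure like cosine similarity puts Messi and Ronaldo close together and Burak far away:

```python
# A toy illustration: the three-number "vectors" are made up (real word vectors
# have hundreds of dimensions), but they show how similarity falls out of them.
import numpy as np

vectors = {
    "Messi":   np.array([0.90, 0.80, 0.10]),    # hypothetical values
    "Ronaldo": np.array([0.85, 0.75, 0.20]),
    "Burak":   np.array([0.10, 0.20, 0.90]),
}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors["Messi"], vectors["Ronaldo"]))  # close to 1.0
print(cosine_similarity(vectors["Messi"], vectors["Burak"]))    # much lower
```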

I get it now!

Hopefully machine learning is less scary to you now. There are plenty of other real-world applications not covered here, and plenty of other machine learning algorithms and concepts to talk about, but I’ll leave you to do your own research on those.

Machine learning is powerful but difficult, and the topics described in this post are just the tip of the iceberg. A background in computer science, and in machine learning in particular, is usually required to obtain decent results. But the results can be well worth the effort, and as big data becomes more accessible, the range of problems we can solve with these models keeps growing.
