Machine Learning For Grannies

Ethan Mayer Bloom
Published in The Startup · Jun 25, 2019

A straightforward introduction for people who want to know more.

Picture this:

It’s a cozy Friday night, and you’re in the middle of dinner with your Grandma. She just finished over-feeding you with her delicious food, and now she wants you to fix her Skype account. “It’s not working,” she complains. It turns out she somehow managed to pick up a trojan horse virus. You restore her computer, create a new Skype account, and everything is fine.

“Tell me what you’re doing in school!” she asks. You know there is no way of explaining things; “umm computer-related stuff” you respond. “But what type of stuff?” she asks again. “It’s called machine learning..?” you say, hoping she’ll move on. She doesn’t. “What’s machine learning?” she asks enthusiastically.

If you’re currently in this situation, DO NOT WORRY! This article serves that exact purpose — how do you explain such a concept to someone who has hardly any knowledge of the field? Well, you start slow and simple.

(This article is meant for any beginner who wants to learn more about Machine Learning, not only grandmothers 😅)

What is Machine Learning?

Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use in order to perform a specific task effectively without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence.

In simple words, a machine that learns! Machine Learning is a subfield of Artificial Intelligence (AI), so we first must understand this concept.

Artificial Intelligence

I’m sure you’ve heard this term used in many different movies — “Ex Machina”, “The Matrix”, “Tron” and so on. It usually comes up in a negative context, such as “robots taking over the world and becoming smarter than humans,” but in reality, the industry is not at all like this (so far). We see AI around us every day: self-driving Tesla cars roaming the streets, ‘Alexa’ — Amazon’s voice-controlled home assistant that you play music from — and many others. Ever wondered how Netflix knows exactly which show or movie you should watch next? That’s artificial intelligence! It has impacted our world in so many ways, and this is just the beginning.

Artificial intelligence (AI) is the simulation of human intelligence processes by machines, especially computer systems. These processes include learning (the acquisition of information and rules for using the information), reasoning (using rules to reach approximate or definite conclusions) and self-correction.

This industry has been growing rapidly over the past few years, and it will shape our future more than any other innovation in recent times.

Essentially artificial intelligence is the capability of machines to carry out a task that is considered “smart”, and machine learning is an application of artificial intelligence based on the idea that a machine should be able to process data and learn on its own (after we teach it how to).

Let’s dig deeper into machine learning and see a bit of what happens behind the scenes.

Types of Machine Learning


Machine learning can be divided into three different categories —

  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning

These categories differ in how data is passed to the program we are trying to build: supervised learning uses algorithms trained on data labeled with human-defined examples, while unsupervised learning provides no labels, and the program must find clusters in the data on its own. Reinforcement learning is a slightly different concept that borrows ideas from both.

Supervised Learning

This method of machine learning is the most popular and the easiest to understand. As said earlier, with supervised learning we provide a set of labels (examples) to the machine which is used to process new input and give the desired output.

For example, let’s take a process we all know of — depositing checks. Of course, we’re in the 21st century and there’s no need to drive to the bank; all that’s needed is a photo of the check on your phone and voilà, the money has been deposited into your account. In this situation, a supervised machine learning algorithm had to process the handwriting on the check, convert it into digital characters, and then carry out the desired task.

How can a machine process handwritten numbers and letters? Well, the algorithm first takes in thousands of pictures of handwritten characters along with human-made labels saying what every character is. A relationship between characters and labels is found (how depends on the algorithm), and now the program can classify a new image using that same relationship. With supervised learning, we teach the machine to find a certain pattern in existing data, which it can then use to recognize new data. One obvious downside of this method is the need for lots of data to create a learning algorithm. With handwritten characters, a human needs to label thousands of different images, which requires some work.
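To make this concrete, here is a minimal sketch of how such a digit classifier might look in code. The article doesn’t name a library, so scikit-learn is my own choice here, and the data is its small built-in set of labeled handwritten digits:

```python
# A minimal supervised-learning sketch: labeled digit images in, predictions out.
# scikit-learn is an assumed library choice, not one named in the article.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()            # ~1,800 8x8 images of handwritten digits, each with a human label
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0)

model = SVC()                     # any classifier would do; a support vector machine is one option
model.fit(X_train, y_train)       # "learn" the relationship between pixel values and labels

print("Accuracy on unseen digits:", model.score(X_test, y_test))
```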

Supervised learning can be divided into two subgroups: regression and classification.

Regression

Regression actually derives from statistics — it’s the technique of predicting values of a desired target quantity when the target quantity is continuous. Essentially, a regression model can find the value of something based on the values of similar things.

For example, we want to predict what the annual income of someone will be. We take in all the important information of that individual: years of education, age, location, and whatever else is relevant; these attributes are called features, and can have different values based on their category (numerical, binary — “yes”/”no”).

Supervised learning is based on known data, so we first need to input many different people with their features and annual income; “David has 3 years of education, is 24 years old, and lives in New York. His annual income is 120k”. With many more of these, the regression model will build a relationship between these features and an annual income.

Once a pattern is found, we’re able to predict the income of someone new based on the relationship the model has learned. Pretty neat. This sounds relatively simple and easy, but the harder part is calculating the relationship between target values and features. To do so we have a few well-known supervised learning algorithms: linear regression, polynomial regression, support vector regression, decision tree regression, and more.
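As a rough sketch of what a regression model looks like in code — the features and incomes below are made up purely for illustration, and scikit-learn is again an assumed library choice:

```python
# A minimal regression sketch with made-up data: features in, a continuous value out.
from sklearn.linear_model import LinearRegression

# Each row: [years of education, age] -- toy features, invented for illustration.
features = [[3, 24], [5, 30], [8, 45], [2, 22], [6, 38]]
incomes  = [120_000, 150_000, 200_000, 90_000, 170_000]   # made-up annual incomes

model = LinearRegression()
model.fit(features, incomes)        # find a linear relationship between features and income

# Predict the income of someone new: 4 years of education, 28 years old.
print(model.predict([[4, 28]]))
```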

Classification

Classification models don’t try to predict a numeric value; instead, they draw a conclusion from observed values and predict a label: “is this email spam or not?”, “will the user click on this ad?”. In order to decide whether an email belongs to the “spam” class, the model first has to find the pattern shared by other spam emails.

These models can never be 100% accurate, as machines can’t make decisions the way our brains can. It all depends on the classification model and on which features are used to separate the groups. For example, if we were trying to build a model that recognizes whether an animal is a reptile, we must decide which features make an animal a reptile. Let’s say we use “cold-blooded?” and “lays eggs?”; if our input data meets these two features — we can categorize it as a reptile. Crocodile? Works. Lizard? Yup. Salmon? Wait… a salmon is cold-blooded and lays eggs, but it is not a reptile. Our classification model has produced a false positive. In this case, we would add more features to our model to make it more accurate.
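Here is a toy sketch of that reptile classifier — the animals and features are invented for illustration, and a decision tree is just one of many classification models we could have picked:

```python
# A toy classification sketch: two binary features, one label (reptile or not).
from sklearn.tree import DecisionTreeClassifier

# Features: [cold-blooded?, lays eggs?] -- 1 means yes, 0 means no.
animals = [
    [1, 1],  # crocodile
    [1, 1],  # lizard
    [0, 0],  # dog
    [0, 1],  # chicken
    [1, 1],  # salmon
]
labels = [1, 1, 0, 0, 0]  # 1 = reptile, 0 = not a reptile

model = DecisionTreeClassifier()
model.fit(animals, labels)

# With only these two features, a salmon looks identical to a crocodile,
# so the model predicts "reptile" for it -- the false positive described above.
print(model.predict([[1, 1]]))   # [1]
```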

Different models place different weight on false positives and false negatives. In the case of a cancer-detection model, we would want as few false negatives as possible. We would rather our model produce a positive and find out it’s actually negative with further checking.
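A minimal sketch of counting those errors by hand — the labels below are made up, and in practice a library helper such as scikit-learn’s confusion_matrix does this for you:

```python
# Counting false positives and false negatives for a made-up screening example.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = has the disease, 0 = healthy (invented data)
predicted = [1, 0, 0, 1, 1, 0, 1, 0]   # what our hypothetical model said

false_negatives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
false_positives = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)

# For a cancer-screening model, false negatives are the ones we most want to minimize.
print("False negatives:", false_negatives)   # 1
print("False positives:", false_positives)   # 1
```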

We have several different models used for classification, each suited to the type of data we want to predict.

KNN — K-Nearest Neighbors

Let’s go a bit deeper into one of the many supervised learning algorithms. This is one of the simplest; some don’t even consider it machine learning because of its simplicity. The idea is very basic — label a data point based on the class of its k nearest neighbors. This algorithm can be used in regression or classification models, although it’s mostly used for classification.

Picture a scatter plot with two existing classes, blue squares and red triangles, and a new green circle dropped somewhere among them — we need to categorize the circle as one of the two classes. Instead of calculating a complex relationship over the existing data and analyzing the features of the new input, we simply look at the closest items.

For k=3 (i.e. the 3 nearest neighbors), we find two red triangles and one blue square. In this case, we would predict that the mysterious green circle is a red triangle, since triangles make up most of its closest neighbors. If we use k=5, we find 3 blue squares and 2 red triangles, so we label our green circle a blue square instead. That is the entire idea behind k-nearest neighbors.

The choice of k changes the output, and it has a major impact on the algorithm’s errors: a k that is too small over-fits to a few noisy neighbors, while a k that is too large smooths over real distinctions in the data.

The steps for this algorithm are the following:

  • Load the existing data. Create some graph dividing points by their different features.
  • Pick a value for k, based on the data we currently have.
  • Find the most frequent class of the surrounding k neighbors of the new data point.

Most of the time data can’t be laid out so neatly on a 2D graph, and we must find another way to calculate the distance between a new data point and all the existing ones. For instance, if we’re trying to predict the price of a house, its many features can’t all be plotted on a single graph, so we measure distance directly in feature space using metrics such as Euclidean or Manhattan distance.
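Here is a from-scratch sketch of those three steps, using Euclidean distance and a majority vote — the 2D data points are invented for illustration:

```python
# A minimal k-nearest-neighbors sketch: Euclidean distance plus a majority vote.
import math
from collections import Counter

def knn_predict(training_data, new_point, k):
    """training_data: list of (features, label) pairs; new_point: list of feature values."""
    # 1. Compute the distance from the new point to every existing point.
    distances = [
        (math.dist(features, new_point), label)   # Euclidean distance (Python 3.8+)
        for features, label in training_data
    ]
    # 2. Take the k closest neighbors.
    nearest = sorted(distances)[:k]
    # 3. Return the most frequent class among them.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Invented 2-D data: "square" vs "triangle", like the example above.
data = [([1, 1], "square"), ([1, 2], "square"), ([2, 2], "square"),
        ([5, 5], "triangle"), ([5, 6], "triangle"), ([6, 5], "triangle")]

print(knn_predict(data, [2, 3], k=3))   # the 3 closest neighbors are squares -> "square"
```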

Unsupervised Learning

In certain situations, we won’t have labeled data to begin with. If a new product comes out and we need to predict something about it, we have no past information to analyze. We refer to these cases as unsupervised learning (because the data is unlabeled), in which the model learns to find the underlying structure of a dataset. There are a few common tasks in unsupervised learning, with the most common being “clustering” the data into groups by similarity.


Unsupervised learning is much more prone to false predictions, as we have no idea what the output is supposed to be.

There are various clustering methods — k-means clustering, hierarchical clustering, and several others. Check this out if you want to learn more about unsupervised learning.
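As a small taste of clustering in practice, here is a minimal k-means sketch — scikit-learn assumed, points made up — where the model is handed unlabeled points and groups them purely by how close they sit to one another:

```python
# A minimal unsupervised-learning sketch: no labels, just raw points grouped by similarity.
from sklearn.cluster import KMeans

# Made-up 2-D points with no labels attached.
points = [[1, 2], [1, 1], [2, 2],        # one natural cluster
          [8, 8], [9, 8], [8, 9]]        # another natural cluster

model = KMeans(n_clusters=2, n_init=10, random_state=0)
model.fit(points)

print(model.labels_)                     # e.g. [0 0 0 1 1 1] -- the groups it discovered on its own
print(model.predict([[2, 1], [9, 9]]))   # assign new points to the nearest cluster
```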

Reinforcement Learning

We won’t expand on this topic very much, as it’s less commonly used than the other two. Reinforcement learning is the computational approach of learning through trial and error — learning from action.

Think of how humans learned things through interaction with the environment. For example, we see a fireplace and it looks nice. We get closer and it warms us — a positive experience. We get a bit closer and it burns — a negative experience. We understand that fire feels good from a distance. This interaction system is implemented in reinforcement learning, with the goal of maximizing long-term reward.

A machine example would be a robot trying to walk. It takes its first step but falls. This becomes the first data point in our reinforcement learning program. Since the result was negative, the system adjusts itself to try to produce a positive one, such as by taking a smaller step.

There are different approaches to reinforcement learning, such as Markov Decision Processes (MDP), Q-Learning, Policy Learning and more.
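To give a flavor of that trial-and-error loop, here is a heavily simplified tabular Q-learning sketch. The tiny “fireplace” environment and its rewards are invented for illustration:

```python
# A tiny tabular Q-learning sketch for the fireplace example: states are distances from
# the fire (0 = touching it, 4 = far away), actions are "step closer" or "step back".
import random

STATES, ACTIONS = 5, [-1, +1]            # action 0 moves closer, action 1 moves away
Q = [[0.0, 0.0] for _ in range(STATES)]  # Q-value table: Q[state][action index]
alpha, gamma, epsilon = 0.5, 0.9, 0.2    # learning rate, discount factor, exploration rate

def reward(state):
    return -10 if state == 0 else (1 if state == 1 else 0)   # burned, warm, neutral

for episode in range(500):
    state = 4                                                 # start far from the fire
    for _ in range(20):
        a = random.randrange(2) if random.random() < epsilon else Q[state].index(max(Q[state]))
        next_state = min(max(state + ACTIONS[a], 0), STATES - 1)
        r = reward(next_state)
        # The core Q-learning update: nudge the estimate toward reward + discounted future value.
        Q[state][a] += alpha * (r + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

# After training, the agent has learned to move toward the warmth but back off before it gets burned.
print([q.index(max(q)) for q in Q])   # best action per state (0 = step closer, 1 = step back)
```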

Neural Networks and Deep Learning

Machine Learning contains a recently popular subfield — deep learning. This technique tries to teach computers to do what comes naturally to humans: learning by example. It’s been getting a lot of attention lately, and for good reason; things have been achieved that weren’t possible before.

A deep learning model learns how to classify tasks directly from images, text, sound or other forms of input. Well-built models can achieve extreme accuracy, sometimes even exceeding human-level performance.

Neural networks are a set of algorithms usually used in deep learning; they try to mimic processes of the human brain by recognizing patterns in data through multiple layers. They are often referred to as ‘deep neural networks’, since in deep learning data can pass through many hidden layers (sometimes as many as 150) before producing any output.
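As a taste of what a neural network looks like in code, here is a bare-bones forward pass through one hidden layer. The weights are random and untrained, and NumPy is an assumed library choice; a real deep network would have far more layers and learned weights:

```python
# A bare-bones neural network forward pass: input -> one hidden layer -> output.
import numpy as np

def relu(x):
    return np.maximum(0, x)                      # a common activation function

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)    # weights from 4 inputs to 8 hidden neurons
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)    # weights from 8 hidden neurons to 1 output

x = np.array([0.2, -1.0, 0.5, 0.3])              # one made-up input example
hidden = relu(x @ W1 + b1)                       # hidden layer: weighted sum + activation
output = hidden @ W2 + b2                        # output layer

print(output)   # a single (untrained, meaningless) prediction; training would adjust the weights
```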


This is a very basic introduction; if you would like to learn more and understand how it all relates to the human brain — I recommend starting here.

In Summary

When robots become smarter than us and take over the world, at least you’ll know why! Machine learning is a very broad concept that contains even more subconcepts and is itself part of an even broader concept (AI). In this article, we briefly went over the basics of machine learning, and if you found it interesting, I would recommend digging deeper.

If you found this article helpful or have some suggestions to improve it, feel free to reach out on Twitter!
