What is machine learning?

Charles Fisher
Unlearn.AI
Mar 21, 2018 · 5 min read

In the last few years, computers have made tremendous strides in their ability to learn and think. With a bit of effort, and lots of data, a computer can be taught to recognize objects in images, to understand analogies, or even to play games like Go at a world class level of skill. It is no surprise that these improvements to artificial intelligence have spawned a new industry of companies applying machine learning to, well, everything.

The explosion of interest in artificial intelligence has also led to an explosion of confusion. Just what exactly is machine learning, anyway?

Let’s start with a few of the common buzzwords:

Artificial Intelligence (AI): the ability of machines to perform tasks that usually require human intelligence.

Machine Learning: algorithms that enable computers to learn about the world without being explicitly programmed for individual tasks.

Deep Learning: an approach to machine learning using models that build representations of objects from combinations of pieces that can be re-used.[¹]

Machine learning is a prerequisite to artificial intelligence — intelligent computers need to be able to learn from new observations as they interact with the world. Deep Learning is just one approach to this problem that is loosely inspired by the way the human brain processes information. There are many approaches to machine learning and AI that do not use deep learning, but deep learning has proven to be an especially powerful approach in recent years.

How Machine Learning Works

At the most basic level, machine learning is about memory. To illustrate this, let’s go through an example of what may be the simplest possible machine learning algorithm — a nearest neighbor classifier.

We would like to teach a computer to recognize when an image is a picture of a car. In this case, a nearest neighbor classifier would look like this:

1. We collect a training set of labeled images; maybe 1 million pictures of cars and 1 million pictures that do not have cars in them.

2. We decide on a way to measure the similarity between two pictures. The simplest way is just to compare the values of all of the pixels.

3. We tell the computer how to make a guess about images it has never seen before: find the most similar image in memory (i.e., in the training set).

- If this nearest neighbor image in the training set is a picture of a car, then guess that the new image is also a picture of a car.

- Otherwise, guess that the new image is not a car.

That’s it! It is a very simple algorithm.
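The three steps above can be sketched in a few lines of code. This is an illustrative toy (the 4-pixel "images" and labels are invented), but the logic is exactly the nearest neighbor rule described: compare pixels, find the closest stored image, copy its label.

```python
import numpy as np

def nearest_neighbor_predict(train_images, train_labels, new_image):
    """Guess the label of new_image by finding the most similar
    training image (step 3) and copying its label."""
    # Step 2: measure similarity by comparing all pixel values
    # (squared Euclidean distance between the pixel arrays).
    distances = np.sum((train_images - new_image) ** 2, axis=1)
    # The nearest neighbor's label becomes our guess.
    return train_labels[np.argmin(distances)]

# Step 1: a tiny "training set" of labeled 4-pixel images.
# Label 1 = car, label 0 = not a car.
train_images = np.array([
    [0.9, 0.8, 0.9, 0.7],  # car
    [0.8, 0.9, 0.8, 0.8],  # car
    [0.1, 0.2, 0.1, 0.0],  # not a car
    [0.0, 0.1, 0.2, 0.1],  # not a car
])
train_labels = np.array([1, 1, 0, 0])

# A new image the computer has never seen before.
new_image = np.array([0.85, 0.80, 0.90, 0.75])
print(nearest_neighbor_predict(train_images, train_labels, new_image))  # 1 (car)
```

Note that all of the "learning" here is memorization: the model is nothing but the stored training set, which is exactly the weakness discussed next.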

Basically, we told the computer the answer for 2 million examples and had it memorize them. Although it works, this is not a very efficient way of learning something. What if you wanted the computer to also be able to recognize the make and model of the car in the image, or the face of the person driving it? Looking back through every image of every thing we have ever encountered each time we need to recognize something would basically take forever (not to mention a huge amount of space). And what if we gave the computer an image unlike anything it has seen before, such as a picture of a tree when there were no training images of trees? This algorithm would not fare well.

It is much more efficient to teach the computer to recognize pieces of objects because we can build many types of things using various combinations of the pieces. This is the basic idea behind deep learning.

Deep Neural Networks

A deep neural network is a mathematical model capable of learning to build representations of objects from their component pieces. For example, a neural network designed to recognize images processes the data through a series of layers. The first layer learns to look for simple features like lines or blobs of color. Deeper layers combine these low-level features to form shapes, and the final layers combine these shapes into objects.

This is an extremely efficient way to store information. Instead of memorizing the label of every single image, the computer learns how to transform the input image into an abstract representation and only has to memorize the labels of a few of these abstract classes. For example, the network can learn to recognize a car as a sum of low-level features such as the patterns and textures that make up wheels, windows, and body shapes. Moreover, there are efficient ways to train these types of models, so much so that it has become pretty easy to train a good deep neural network as long as you have a really big dataset with reliable class labels.
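The layer-by-layer re-representation can be sketched as a sequence of matrix multiplications and nonlinearities. This is a toy with random (untrained) weights, so it recognizes nothing; the point is only to show the structure, where each layer's output is a new representation of its input.

```python
import numpy as np

def relu(x):
    # A common nonlinearity: keep positive values, zero out the rest.
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)

# A fake 16-pixel "image" and three layers of made-up weights.
# In a real network these weights are learned from data.
x = rng.random(16)
W1 = rng.standard_normal((8, 16))  # pixels  -> low-level features
W2 = rng.standard_normal((4, 8))   # features -> shape-like parts
W3 = rng.standard_normal((2, 4))   # parts    -> class scores

h1 = relu(W1 @ x)   # first layer: simple features (lines, blobs of color)
h2 = relu(W2 @ h1)  # deeper layer: combinations of features (wheels, windows)
scores = W3 @ h2    # final layer: scores for "car" vs "not car"

print(scores.shape)  # (2,)
```

Because the intermediate features (`h1`, `h2`) are shared across classes, recognizing a new kind of object mostly means learning a new combination of existing pieces rather than memorizing everything from scratch.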

The Problems with Memorization

Deep learning works well for classification because similar observations get transformed into the same abstract representations. However, these models (and, in fact, most machine learning models) can fail spectacularly when they are presented with an observation that is very different from anything they have seen before.

In theory, it is possible to design a machine learning algorithm that is capable of continually learning from new observations. The algorithm needs to remember all of the things it has already learned that allowed it to perform well on previous tasks, find common components of different tasks so that it can re-use its knowledge, and learn any new skills it needs as it goes along. Unfortunately, current approaches to machine learning do not do this well — they specialize in one task at a time and forget how to do it when taught a new one, a problem known as catastrophic forgetting.

The Future of Machine Learning

Most of the successes in machine learning during the last few years have been on problems with lots of data (e.g., millions of images) where it is possible to provide the computer with feedback about the correct answer — these types of tasks belong to a field known as _supervised learning_. There are many tasks where datasets are small (e.g., hundreds of patients in a clinical trial) or where the correct answer is unknown — these types of tasks belong to a field known as _unsupervised or semi-supervised learning_. We can think of the following analogy:

- Supervised learning techniques are like school. The computer is presented with many question/answer examples, and it will eventually be tested on some new questions it hasn’t seen before but that are similar to the ones it studied.

- Unsupervised learning techniques try to replicate the intuitive learning that we do every day as we observe the world. For example, infants learn from experience that dropped objects will fall to the ground.
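The contrast between the two bullets above can be sketched on a toy dataset (the numbers and the simple 2-means clustering loop here are invented for illustration). In the supervised case, a teacher supplies an answer for every example; in the unsupervised case, the computer must discover the two groups purely from the structure of the data.

```python
import numpy as np

# Six one-dimensional observations that happen to form two clusters.
data = np.array([1.0, 1.2, 0.9, 5.0, 5.3, 4.8])

# Supervised: a teacher labels every example (0 or 1), and the
# computer learns a decision rule (here, a midpoint threshold).
labels = np.array([0, 0, 0, 1, 1, 1])
threshold = (data[labels == 0].mean() + data[labels == 1].mean()) / 2
supervised_guess = (data > threshold).astype(int)

# Unsupervised: no labels at all. A simple 2-means clustering loop
# finds the same two groups from the data alone.
centers = np.array([data.min(), data.max()])
for _ in range(10):
    # Assign each point to its nearest center, then move the centers.
    assign = np.abs(data[:, None] - centers[None, :]).argmin(axis=1)
    centers = np.array([data[assign == k].mean() for k in (0, 1)])

print(supervised_guess)  # [0 0 0 1 1 1]
print(assign)            # [0 0 0 1 1 1]
```

Both approaches recover the same grouping here, but only the supervised version needed a teacher to label all six examples first — which is exactly the cost the next paragraph describes.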

In order to send a computer to school, we need to provide a teacher. We need a lot of data in the form of questions and answers. But creating these data can be very costly — both in terms of time and money. Therefore, developing computers that have the ability to learn on their own using unsupervised techniques is a crucial step on the road to more general artificial intelligence.

[¹]: I used a pretty imprecise definition of deep learning for the sake of accessibility. To be more precise, deep learning refers to the use of neural networks that process the data through many layers of transformations. This allows the model to learn an appropriate feature space, instead of requiring human designed features. In practice, the shallow layers learn local features, whereas the deeper layers combine these local features into global features.
