An introduction to Machine Learning and Artificial Intelligence for everybody
People who work or study in Science, Technology, Engineering or Mathematics are often confronted with the same problem: it’s hard for their relatives to understand what they really do all day, unless of course those relatives studied the same fields. If the terms “Software Engineering”, “Artificial Intelligence” or “Machine Learning” sound like Mandarin to you, or if you happen to be my Mom, this article is for you!
Today, we’re going to talk about a very hot topic and try to explain it in plain terms: Machine Learning. Machine Learning is a complex mix of Maths, Statistics, Computer Science and Engineering. But you don’t need to be a mathematician or a genius mad scientist to understand the general idea behind it. Let’s dive in!
Machine Learning is actually a form of Artificial Intelligence. When they hear those two words, people often picture advanced robots gifted with consciousness that are likely to replace humans and take over the world. We can’t blame these people, because that’s how popular culture describes it! But no, it’s not really like that in real life.
Artificial Intelligence, often referred to as AI, describes machines capable of basic cognitive functions, like learning, problem-solving or pattern recognition. It has some traits of human intelligence, but it’s mostly limited to rather simple operations. Computers don’t have feelings, sensations or consciousness (yet…). The robots we have already created are a good example: they can learn your name, your age and how to bring you coffee, but they cannot feel empathy for you.
There are two types of AI: general and narrow. General AI handles a broader and more complex range of tasks, much like a human brain, whereas narrow AI focuses on one particular task (for example: making coffee). Machine Learning is considered to be a narrow type of AI.
Machine Learning is an expression that is quite self-explanatory: machines, or more accurately models, learning by themselves. But learning what, and for which purposes? We’ll focus on how Machine Learning is used in Computer Science, even though it is important to note that Machine Learning is not necessarily a matter of Computer Science alone: the mathematical concepts behind it (for example linear algebra, but we won’t get into that) existed long before the machines we have today.
Machine Learning algorithms build mathematical models in order to achieve the task at hand. Thanks to this, computer programs are now able to recognize faces in pictures, or to recommend which film we should watch next. This is because Machine Learning trains computers to recognize patterns in data and to make predictions or decisions based on the knowledge they have accumulated.
As an example, we can think of a child learning shapes and colors. At first, we tell this child that a circle is “a circle”, and that the color red is called “red”. We label the information and give it to the child. Later on, after accumulating this knowledge, the child will be able to call a circle “a circle” and red “red”. The equivalent of this method in Machine Learning is Supervised Learning. Like a teacher, we feed the computer a lot of information that we have labeled, and create a program that makes it learn the information and later do something with it (identification, sorting, etc.). One common use of this method is filtering spam emails. After seeing several emails from the same address labeled as “spam”, a program can send future emails from this address to the trash folder in our mailbox.
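For the curious, here is a minimal sketch of that spam example in Python. It is only an illustration of the idea, not how a real mail service works: the email addresses, the labels and the function names train and predict are all made up for this article, and real spam filters look at far more than the sender.

```python
from collections import Counter

# Made-up labeled examples: (sender address, label given by a human).
labeled_emails = [
    ("deals@cheap-pills.example", "spam"),
    ("deals@cheap-pills.example", "spam"),
    ("mom@family.example", "not spam"),
    ("deals@cheap-pills.example", "spam"),
    ("mom@family.example", "not spam"),
]

def train(examples):
    """Count how often each sender was labeled 'spam' or 'not spam'."""
    counts = {}
    for sender, label in examples:
        counts.setdefault(sender, Counter())[label] += 1
    return counts

def predict(model, sender):
    """Return the label seen most often for this sender."""
    if sender not in model:
        return "not spam"  # assumption: unknown senders go to the inbox
    return model[sender].most_common(1)[0][0]

model = train(labeled_emails)
print(predict(model, "deals@cheap-pills.example"))
print(predict(model, "mom@family.example"))
```

The “learning” here is just counting labels, but the shape is the same as in bigger systems: labeled examples go in, and the program uses them to decide what to do with new emails.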
Another method used in Machine Learning is Unsupervised Learning. This method doesn’t label the data. Let’s go back to our teacher/child example. Unlike with Supervised Learning, the child has to figure out shapes and colors alone. The best thing the child can do is find similarities and differences between the shapes and colors. With Machine Learning, it’s the same. We don’t label the data we feed to the computer, so the computer has to figure out how to sort the data by itself. The advantage of this method is that the program can analyze large amounts of data that nobody has labeled, and find patterns we didn’t know to look for. An example of Unsupervised Learning we often see is targeted online marketing. We don’t necessarily know which type of person buys which type of product, but when a computer program analyzes a lot of purchases from a lot of different online stores, it can find out that perhaps sixty-year-old mothers who like dogs also like buying DVDs, for example. Then it will be able to send DVD advertisements to the group “sixty-year-old mothers who like dogs”, in an attempt to sell more DVDs.
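Here is a small sketch of that “sorting without labels” idea in Python. It uses a tiny version of a well-known grouping method called k-means, chosen here only for illustration. The list of shopper ages and the function name two_groups are invented for this example; real marketing systems look at many more characteristics than age.

```python
def two_groups(values, rounds=10):
    """Split numbers into two groups around two moving centers (a tiny k-means)."""
    # Assumption: start with the smallest and largest values as the two centers.
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(rounds):
        groups = [[], []]
        for v in values:
            # Put each value in the group whose center is nearest.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the average of its group.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

# Made-up ages of shoppers; nobody told the program who is "young" or "old".
ages = [22, 25, 61, 19, 63, 58, 24, 60]
younger, older = two_groups(ages)
print(sorted(younger))
print(sorted(older))
```

Notice that we never gave the program any labels. It found the two groups only by looking at which ages are close to each other.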
Now we know a very general definition of Machine Learning, great! But how did we achieve so much automation? How are humans able to “teach” machines how to learn things? Well, in part we can thank Neural Networks. We generally associate Neural Networks with Machine Learning, but they are a separate concept, one that is often (but not always) used in Machine Learning. This notion shouldn’t be too complicated for those who can remember their basic biology lessons from school.
Usually, a computer program is just a group of statements like “if this happens, do that”, or “keep doing this while that condition is true”, etc. But if we want to make computer programs learn things, those simple structures are sometimes not enough. That’s why engineers and mathematicians imitated the best natural tool for learning: a neural network! As a quick reminder, humans and other animals have neurons in their brains that receive electrical stimulation and then pass that stimulation on to other neurons, and so on. That’s what we commonly call “making connections” in our brain. Although they are too fast and too small for us to feel, these connections happen all the time. For example, if it’s raining, we decide it’s safer to take an umbrella when going out. The complex neural network sitting in our brain is responsible for that connection.
In Machine Learning, the model often relies on an artificial neural network too. Not a literal biological neural network of course, but the code is structured in a similar way.
Here is a good representation of a simple artificial neural network. Each circle is a neuron, and each column is a layer. There’s the input layer, which represents the data given to the program to analyze, then there’s the hidden layer, which breaks the data down according to different characteristics (for example color or shape), and finally there’s the output layer, which will of course differ depending on the application of the program and the type of output we’re looking for. This is an over-simplified version of what a Machine Learning program really looks like, but it’s important to understand those notions before diving deeper into the subject.
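To make the three layers a bit more concrete, here is a minimal Python sketch of data passing through such a network. All the numbers (the inputs, the “weights” and the “biases”) are made up for illustration, and the squashing function used, called a sigmoid, is just one common choice. In a real program these numbers would be adjusted automatically during learning, which this sketch does not show.

```python
import math

def sigmoid(x):
    """Squash any number into the range between 0 and 1."""
    return 1 / (1 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each neuron sums its weighted inputs, adds a bias, and squashes the result."""
    return [
        sigmoid(sum(w * i for w, i in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Input layer: two made-up measurements about the thing we want to analyze.
inputs = [0.9, 0.1]

# Hidden layer: two neurons, each with one weight per input (made-up values).
hidden_weights = [[0.5, -0.4], [0.3, 0.8]]
hidden_biases = [0.0, -0.1]

# Output layer: one neuron that reads the two hidden neurons (made-up values).
output_weights = [[1.2, -0.7]]
output_biases = [0.05]

hidden = layer(inputs, hidden_weights, hidden_biases)
output = layer(hidden, output_weights, output_biases)
print(hidden)
print(output)
```

Each call to layer is one column of circles in the picture: the numbers flow from the input, through the hidden layer, to the output.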
Speaking of diving deeper, one type or extension of Machine Learning that is very popular and widely used is Deep Learning. This term designates a type of AI that uses a neural network composed of many hidden layers. The more hidden layers there are, the deeper the learning. That’s because each added hidden layer adds another level at which the data is broken down. This is probably the form of Machine Learning that is the closest to how a human brain works, and the most impressive. With this type of learning, a computer can achieve voice or face recognition, just like our ears and our eyes can recognize sounds and shapes.
Deep Learning programs receive a lot of input data, have a lot of hidden layers and are very complex. Therefore, the engineers working on those programs need to be highly skilled in mathematics and computer science. Its promising results also make Deep Learning one of the most interesting scientific advances of our time. It is making AI much stronger than ever before and science more advanced. Its applications are endless: from simple voice recognition to analysis of extraterrestrial data, from robotics to self-driving cars, and so on…
One last subject that is interesting to bring up when talking about Machine Learning is Reinforcement Learning. This type of AI is rooted in Adaptive Control paradigms and reward systems. An adaptive controller is a controller that will adapt to its environment. It’s pretty much the same thing in Reinforcement Learning.
A Reinforcement Learning program looks for the maximum possible reward. We can think of a person playing a game or taking part in a sports competition. The person will look for the maximum reward they can get: the best score, the best prize, the first position, etc. Well, for computer programs using Reinforcement Learning, it’s the same. The program makes decisions and gets a reward. Based on that reward, it learns what it needs to do differently to get a better reward, and so on until it reaches the optimal result.
In programming, this concept is a balance between two notions: exploitation and exploration. Exploitation means using what the program already knows, a bit like in Supervised Learning, where a program knows what data is what and can therefore use that knowledge to achieve the goal we set for it. Exploration is closer to Unsupervised Learning: taking risks by trying new things, and learning from successes and failures. By mixing the two, the program works its way toward the maximum reward possible.
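Here is a small Python sketch of that mix of exploration and exploitation. It uses a simple, well-known strategy often called “epsilon-greedy”, picked here only as an illustration: most of the time the program repeats the action it currently believes is best, and once in a while it tries a random one. The three payouts, the function name run_agent and the values for steps and explore_rate are all invented for this example.

```python
import random

def run_agent(payouts, steps=500, explore_rate=0.1, seed=0):
    """Learn by trial and error which action gives the best reward."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    estimates = [0.0] * len(payouts)     # what the program believes each action is worth
    tries = [0] * len(payouts)           # how many times each action was tried
    for _ in range(steps):
        if rng.random() < explore_rate:
            # Exploration: take a risk and try an action at random.
            action = rng.randrange(len(payouts))
        else:
            # Exploitation: use the action that looks best so far.
            action = estimates.index(max(estimates))
        reward = payouts[action]
        tries[action] += 1
        # Update the running average reward for this action.
        estimates[action] += (reward - estimates[action]) / tries[action]
    return estimates

# Made-up game: three possible actions, each always paying a fixed reward.
payouts = [1.0, 5.0, 2.0]
estimates = run_agent(payouts)
best_action = estimates.index(max(estimates))
print(best_action)
```

The program is never told which action pays best. Without the occasional random try, it would settle for the first action that gave it any reward at all and never discover the better one.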
Reinforcement Learning programs can use deep neural networks (that is, networks with many hidden layers) to get the maximum reward possible. One use of this type of Machine Learning could be a computer player in a video game. For example, one of the first uses of Deep Reinforcement Learning was building an autonomous player for Atari games. This was achieved by the company DeepMind in 2013. Since then, a lot of very famous tech companies like Google or Facebook have started implementing Deep Reinforcement Learning.
As fascinating and complex as Machine Learning may seem from the outside, its idea is pretty simple: build models that are able to learn by themselves. These models can lead to computers that are more and more autonomous and powerful, serving more and more needs in a large number of fields. Now you’ll never have to say you don’t understand what a relative does when they tell you they do “Machine Learning”!