Learning Machine Learning — Introduction

A guide to learning Machine Learning by implementing Machine Learning

One of the biggest challenges of Machine Learning is actually learning Machine Learning. Understanding even the most common ML use cases requires many skills, tools and scientific fields. This is what makes it an exciting but challenging topic. Because there is no other way to learn than to practice, the following series of articles aims to provide you with illustrated examples and code samples to build the analytical approach required to understand and master Machine Learning.

Implementing Machine Learning is based on a very structured series of steps that should be followed every time to build a successful Machine Learning model. Because they are the pillars of the Data Science Process, let’s name them: Data Preparation, Training, Scoring, Evaluation and Prediction. Throughout these steps, the tools and skills mentioned above will be introduced when they are needed, so that each notion appears where it makes the most sense. This should let you progress gradually into the bits and bytes of Machine Learning and expand your skills with the appropriate knowledge along the way.
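As a rough preview, the five steps could be sketched as follows with scikit-learn. This is only an illustration under assumptions of mine: the Iris dataset, the logistic regression model, the scaling step and the 25% test split are placeholder choices, and I read "Scoring" as applying the trained model to held-out data and "Evaluation" as measuring how good those outputs are.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 1. Data Preparation: load the data, hold out a test set, scale the features.
#    (Dataset and split ratio are illustrative assumptions.)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
scaler = StandardScaler().fit(X_train)  # fit the scaler on training data only
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 2. Training: fit a model on the prepared training data.
model = LogisticRegression(max_iter=200)
model.fit(X_train_scaled, y_train)

# 3. Scoring: apply the trained model to data it has not seen.
y_scored = model.predict(X_test_scaled)

# 4. Evaluation: compare the scored output with the known labels.
accuracy = accuracy_score(y_test, y_scored)
print(f"Accuracy on held-out data: {accuracy:.2f}")

# 5. Prediction: use the model on a brand-new, unlabeled observation.
new_flower = [[5.1, 3.5, 1.4, 0.2]]  # sepal/petal measurements in cm
predicted_class = model.predict(scaler.transform(new_flower))
print("Predicted class index:", predicted_class[0])
```

Each numbered comment maps to one step of the process. The later articles may split or order the work differently.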

First off, I guess that if you are reading this, there is a simple but powerful question burning on your lips: What is Machine Learning?

We’ll go through all of it extensively, but as a warm-up, here is a short answer: Train a Model with Data to make Predictions.

Now, before starting any activity, there are two fundamental Machine Learning aspects every reader should be aware of:

  • Supervised & Unsupervised Learning: this notion relates to the nature of the Dataset and of the Training model the Data Scientist will be using. In short, Supervised Learning leverages a Label in the Data for Training, whereas Unsupervised Learning has no Label to rely on during the Training phase.
  • ML Problem Categories: although Machine Learning can apply to a lot of different use cases, the most common scenarios you will hear about fall into the following categories: Classification, Regression, Anomaly Detection, Natural Language Processing (NLP), Recommender, Clustering.
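To make the Supervised versus Unsupervised distinction concrete, here is a minimal sketch using scikit-learn. The synthetic dataset from make_blobs and the two models (LogisticRegression for Classification, KMeans for Clustering) are assumptions chosen for illustration, not the only options. The point to notice is which call receives the Label y.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Synthetic data: 300 points in 3 groups. X holds the features, y the Labels.
X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised Learning (Classification): Training uses both X and the Label y.
classifier = LogisticRegression(max_iter=500)
classifier.fit(X, y)
supervised_output = classifier.predict(X)

# Unsupervised Learning (Clustering): Training only ever sees X, never y.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(X)
cluster_ids = kmeans.labels_

print("Classes predicted for the first 5 points:", supervised_output[:5])
print("Clusters assigned to the first 5 points:", cluster_ids[:5])
```

Note that the cluster numbers found by KMeans are arbitrary identifiers. They group similar points together but do not have to match the numbering of the original Labels.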

Because we want to learn, we’ll start simple with Supervised Learning and common Machine Learning problems such as Classification and Regression.

After this short introduction, let’s go into the wild with the first step of Learning Machine Learning, a guide to learning Machine Learning by programming. The next articles will use Python and scikit-learn, which together make an excellent environment to start learning Machine Learning. A set of other common modules such as matplotlib, numpy and pandas will be used as well, as they are at the foundation of the Data Science Process steps. To make these tutorials simple and successful for everyone to follow, a short list of prerequisites will be given before any programming activity.

Let’s go!