Day One in Machine Learning

Published in The Startup

5 min read · Jan 5, 2021

The core idea of machine learning is to find ways to teach a computer to perform a task without giving it explicit instructions.

Fig. 1 Classify Wine Brand using Machine Learning.

For example, we might want to teach a machine to recognise the brand of a bottle of wine. The first step would be to walk into a wine shop with some lab tools and record the characteristics of every bottle. At the end of this process, we will have an annotated record for each bottle that includes its details (features, such as colour or alcohol content) and its brand (the label), as in figure 1.

After collecting the details of the wines and their labels, we analyse the collected data. Do we have enough examples for each category of wine? Have we collected information that is useful for telling one wine from another? Even without the help of an expert, graphs often help us understand whether we need more data, or let us draw new knowledge from what we already have (fig. 2).
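The first of these questions, whether there are enough examples for each category, can be checked with a simple count before any graph is drawn. The snippet below is a minimal sketch: the bottle records, the brand names and the helper functions `examples_per_label` and `under_represented` are all invented for illustration and do not come from the figures.

```python
from collections import Counter

# Hypothetical annotated bottles: (colour, alcohol content in % vol, brand label).
# Every value and brand name here is made up for illustration.
bottles = [
    ("red", 13.5, "Brand A"),
    ("red", 14.0, "Brand A"),
    ("red", 13.0, "Brand A"),
    ("white", 11.5, "Brand B"),
    ("white", 12.0, "Brand B"),
    ("rose", 12.5, "Brand C"),
]


def examples_per_label(rows):
    """Count how many annotated examples exist for each label."""
    return Counter(label for _, _, label in rows)


def under_represented(rows, minimum):
    """Return, in sorted order, the labels with fewer than `minimum` examples."""
    counts = examples_per_label(rows)
    return sorted(label for label, n in counts.items() if n < minimum)


counts = examples_per_label(bottles)
print(counts)
print(under_represented(bottles, minimum=2))
```

A label that shows up in the second list is a hint that we should go back to the shop and collect more bottles of that brand before training anything.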

To verify that the machine has learned to catalogue wines autonomously, we must divide the data into a training set and a testing set (fig. 3). During the training phase, the model calculates and memorises the rules for establishing the type of wine, using a subset of the data that contains both the wines’ details and the correct labels. Next, we have to test that the memorised rules work in general. We feed the model the wines’ details from the other subset, which the machine has not seen during training. We can then compare the model’s predictions with the correct labels and measure the accuracy of the model on unseen data.
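The training and testing procedure can be sketched in plain Python. This is only an illustration under stated assumptions: the data is invented, and the "model" is a nearest-centroid classifier chosen here because it fits in a few lines, not because the article prescribes it. The function names (`split_train_test`, `train`, `predict`, `accuracy`) are likewise hypothetical.

```python
import random

# Hypothetical data: (colour_intensity, alcohol_percent) -> brand label.
# All numbers are invented for illustration.
data = [
    ((2.0, 11.0), "Brand A"), ((2.2, 11.4), "Brand A"), ((1.8, 11.2), "Brand A"),
    ((2.1, 10.9), "Brand A"), ((1.9, 11.3), "Brand A"),
    ((6.0, 14.0), "Brand B"), ((6.3, 13.8), "Brand B"), ((5.8, 14.2), "Brand B"),
    ((6.1, 13.9), "Brand B"), ((5.9, 14.1), "Brand B"),
]


def split_train_test(rows, test_fraction, seed):
    """Shuffle a copy of the rows and hold out a fraction of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train(rows):
    """Learn one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the correct label."""
    correct = sum(predict(model, features) == label for features, label in rows)
    return correct / len(rows)


train_rows, test_rows = split_train_test(data, test_fraction=0.3, seed=0)
model = train(train_rows)  # the model only ever sees the training rows
print(f"test accuracy: {accuracy(model, test_rows):.2f}")
```

The important design point is that `test_rows` never reaches `train`, so the accuracy printed at the end reflects bottles the model has not seen.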

Assuming good accuracy during the test, the model can generate predictions for new wine bottles and catalogue them automatically. In practice, we feed the model the wine characteristics (for instance, colour type and alcohol content, fig. 4) and get back the wine brand name.

Fig. 5 Machine Learning vs Traditional programming

So a machine learning model like this helps us classify a wine bottle, and it offers an alternative to classification by traditional programming. In traditional programming, we would need to come up with the rules ourselves and apply them to the data to get the desired answer. In machine learning, the process is reversed. We collect a dataset of data points, each with its associated correct answer, and the machine learning model learns by itself the rules that generate the correct answer (fig. 5).
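The contrast can be made concrete with a deliberately tiny sketch. In the first function a person chooses the threshold. In the second, the threshold is picked from labelled examples. The data, the brand names and the single-threshold "model" are assumptions made for illustration only.

```python
# Hypothetical labelled examples: alcohol content (% vol) -> brand.
examples = [
    (11.0, "Brand A"), (11.5, "Brand A"), (12.0, "Brand A"),
    (13.5, "Brand B"), (14.0, "Brand B"), (14.5, "Brand B"),
]


def handwritten_rule(alcohol):
    """Traditional programming: a person decides the rule and its threshold."""
    return "Brand B" if alcohol > 13.0 else "Brand A"


def learn_threshold(rows, low_label, high_label):
    """Machine learning, heavily simplified: choose the threshold from the data.

    Candidate thresholds are the midpoints between consecutive sorted values.
    The candidate that classifies the most training rows correctly wins.
    """
    values = sorted(value for value, _ in rows)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def n_correct(threshold):
        return sum(
            (high_label if value > threshold else low_label) == label
            for value, label in rows
        )

    return max(candidates, key=n_correct)


threshold = learn_threshold(examples, "Brand A", "Brand B")


def learned_rule(alcohol):
    """Same shape as the handwritten rule, but the threshold came from the data."""
    return "Brand B" if alcohol > threshold else "Brand A"


print(threshold, learned_rule(14.2), learned_rule(11.2))
```

Both functions end up with the same form. What differs is where the rule came from: the programmer in one case, the labelled examples in the other.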

Let’s examine the question in figure 6. Here we want to classify a data point as female or male given the name and the customer code. It is a simple example, but it reveals some pros and cons of a machine learning solution compared to a traditional programming implementation. If we perform a bit of exploratory data analysis (EDA), we find that in the training data each female target is often associated with a customer code of even length. So a simple solution might be to write a programme that checks whether the length of the customer code is even or odd and predicts female or male accordingly. But what if the testing dataset doesn’t present the same pattern? It may be that we have too few training samples and the pattern is not representative (this relates to the notion of garbage in, garbage out in machine learning). We can also extract other features from the data. For instance, we can consider the number of letters in each name and whether the name ends with a vowel. A traditional programming approach can soon become difficult to handle, and a machine learning model may be a better way to capture both the data and the noise in it.
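The hand-written rule based on the length of the customer code, and the risk that it does not carry over to the testing data, can be sketched as follows. The rows are invented and are not the ones shown in figure 6. They were chosen so that the pattern holds in the training rows and partly breaks in the testing rows.

```python
# Hypothetical rows: (name, customer_code, label). All values are invented.
train_rows = [
    ("Anna", "AB12", "female"),
    ("Maria", "ZX9081", "female"),
    ("Paul", "Q7K", "male"),
    ("John", "PLM45", "male"),
]
test_rows = [
    ("Elena", "RT5", "female"),
    ("Mark", "GH22", "male"),
    ("Sara", "KK10", "female"),
]


def predict_by_code_length(customer_code):
    """Hand-written rule: even code length -> female, odd code length -> male."""
    return "female" if len(customer_code) % 2 == 0 else "male"


def accuracy(rows):
    """Fraction of rows for which the hand-written rule gives the correct label."""
    hits = sum(predict_by_code_length(code) == label for _, code, label in rows)
    return hits / len(rows)


print(f"training accuracy: {accuracy(train_rows):.2f}")
print(f"testing accuracy:  {accuracy(test_rows):.2f}")
```

A rule that looks perfect on a handful of training rows can still fail on new rows, which is why the check on held-out data matters even for a rule we wrote ourselves.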

Even though the features we created during the EDA might not be useful for a traditional programming approach, we can still use them to improve the performance of the machine learning model, a practice known as feature engineering (see fig. 7). A machine learning algorithm could in this case correctly label “Cod” as male even though the length of its customer code is odd, because it uses the other features (the name does not end with a vowel and has a small number of letters) to generalise over the noise in the data.
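As a rough sketch of feature engineering, the function below turns a raw name and customer code into the three features discussed above. The function name `extract_features`, the feature names and the customer code `"A1B"` are hypothetical. The actual code for “Cod” appears only in the figure and is not reproduced here.

```python
VOWELS = set("aeiou")


def extract_features(name, customer_code):
    """Turn one raw data point into features a model can learn from."""
    return {
        "code_length_is_even": len(customer_code) % 2 == 0,
        "name_length": len(name),
        # Slicing with [-1:] keeps this safe for an empty name.
        "name_ends_with_vowel": name[-1:].lower() in VOWELS,
    }


# "A1B" is a placeholder code of odd length, not the real one from the figure.
features = extract_features("Cod", "A1B")
print(features)
```

A model trained on all three features can weigh them against each other, instead of relying on the code length alone.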

There are issues with a traditional programming solution that are often solved by a machine learning model. The logic required to make a decision is specific to a single domain and task, so changing the task even slightly might require a rewrite of the whole system. Designing rules also requires a deep understanding of how a human expert makes the decision, which can be challenging to implement. With machine learning, we can potentially use the data to get the prediction without a complex hand-coded solution. Still, understanding the data and the domain lets the data scientist build a better machine learning model. Day two in Machine Learning here.

Written by Guido Salimbeni