Getting Familiar with the World of Machine Learning

Bird's eye view of Machine Learning

Published in CodeX · 8 min read · Jan 7, 2022

Note: This is the first blog in the “Complete Machine Learning and Deep Learning for Beginners” series. The series is meant for anyone who wants to learn more about Machine Learning and Deep Learning or refine their ML/DL skills.

This Blog’s outline

1. Introduction

Short History of Machine Learning
Future of AI

2. Types of Machine Learning

Supervised
Unsupervised
Reinforcement

3. Importance of Relevance of Data

4. Steps every Machine Learning model goes through

Data preprocessing
Exploratory Data Analysis
Model Training
Model Evaluation
Model Deployment

5. Conclusion

This is just an introductory blog. If you find it difficult to understand at first, stay with us; it will become clearer as we move forward in this series.

Introduction

Regression, Data Preprocessing, RandomForest, Naive Bayes Classifiers: all these terms are a bit intimidating when we first hear them, and they leave some people wondering whether they will ever be able to grasp them. As you move further through this series, they will start becoming clearer and easier. This series is here to help you get rid of that intimidation and move ahead on your Machine Learning journey with us.

Short History of Machine Learning

It really surprised me when I learned that the term “Machine Learning” was first coined in the year 1959! Work in the field has been growing ever since, with continuous developments in algorithms for both Machine Learning and Deep Learning. For a long time, however, many of these algorithms were impractical to run because of the lack of computational power. Over the last decade, as computers advanced, they opened many doors to test, evaluate and discover new findings in this exciting field.

Future of AI

Apart from the debatable topic of how AI might turn machines against humans, there are a lot of exciting things we will see in the future that will be powered by AI: self-driving cars that are almost here among us, amazing robots that make complex industrial tasks really easy, restoring old photographs, using AI to create amazing art, automated healthcare, empowering people who have difficulty speaking, and more. All of this also opens up great career opportunities for practitioners.

Types of Machine Learning

The best way to cover a big topic is to break it into smaller pieces and learn each one of them. In the world of Machine Learning, those pieces are:

Supervised Machine Learning
Unsupervised Machine Learning
Reinforcement Learning

Let's dive into a brief introduction and an example of each.

1. Supervised Machine Learning

A bookish definition: “Supervised Machine Learning is a type of Machine Learning where we train machines using a well-labelled dataset.”

Ummmm... Okay... Let us try to understand this with an example.

Suppose you have many cats, dogs and rabbits in a pet shop, and you have a dataset describing them. Each data point (or data instance) is one animal, and each column holds a feature of that animal, such as its height, colour or type of tail. There will be one specific column of interest that we want to train our machine to predict. This column’s values serve as the labels for our data.

Check out the table above. In this table, we have 3 data points and four informative columns. We select our label column based on what we are trying to do with the dataset. Suppose we want to predict which category a data point belongs to: then “Category” becomes our label column, and the other columns work as features that help us predict the Category.

Similarly, if you want to predict the tail type of a data point, the column named “Tail type” will serve as the label column, which we will try to predict using the other columns as features.

So when we train a machine to predict something in this way, it falls under the category of Supervised Machine Learning. There are different ways to train a model, but those will be discussed later in this series.
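To make the idea of features and a label column concrete, here is a minimal sketch in plain Python. The pet records, the column names (height_cm, weight_kg, tail_type, category) and the nearest-neighbour approach are all assumptions made for illustration; they are not the table from this blog, and this is only one of many possible ways to train a model.

```python
# Hypothetical pet-shop records, made up for illustration only.
pets = [
    {"height_cm": 25, "weight_kg": 4.0, "tail_type": "long", "category": "cat"},
    {"height_cm": 23, "weight_kg": 3.5, "tail_type": "long", "category": "cat"},
    {"height_cm": 55, "weight_kg": 22.0, "tail_type": "long", "category": "dog"},
    {"height_cm": 60, "weight_kg": 28.0, "tail_type": "long", "category": "dog"},
    {"height_cm": 18, "weight_kg": 1.8, "tail_type": "short", "category": "rabbit"},
    {"height_cm": 20, "weight_kg": 2.2, "tail_type": "short", "category": "rabbit"},
]

FEATURES = ["height_cm", "weight_kg"]  # columns the machine learns from
LABEL = "category"                     # the column of interest we want to predict


def distance(a, b):
    """Straight-line distance between two animals, using only the feature columns."""
    return sum((a[f] - b[f]) ** 2 for f in FEATURES) ** 0.5


def predict(new_pet, labelled_data):
    """Return the label of the most similar labelled animal (1-nearest-neighbour)."""
    nearest = min(labelled_data, key=lambda row: distance(row, new_pet))
    return nearest[LABEL]


new_pet = {"height_cm": 58, "weight_kg": 25.0}  # an animal with no label yet
prediction = predict(new_pet, pets)
print(prediction)
```

If we wanted to predict the tail type instead, we would only change LABEL to "tail_type" in this sketch; the rest of the data stays the same, which is exactly the point about choosing a label column.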

2. Unsupervised Machine Learning

According to books: “A type of Machine Learning in which we use an unlabelled dataset for training, and the machine is allowed to act on the data without any supervision.”

Okay... So what? Let's learn this our way again.

Suppose you want to train the machine on a dataset with no column of interest; we do not want to predict anything. It is just data, and we want to learn how the data behaves on its own and what we can learn from it. In this sense, unsupervised learning is quite similar to the way humans learn from what they observe, which is fascinating.

Suppose we have a dataset that looks something like this.

This is data about categories of flowers, and each data point belongs to one of these categories. The columns hold values such as the sepal width and petal length of the flower. However, we do not know which data point belongs to which category; how would we find out?

One good thing to consider is how the data points relate to each other and which data instances form a cluster. So, while performing unsupervised learning on this dataset, we train our machine to learn these groupings from the data; here is the result.

So, when we perform clustering on this data, we can see 3 cluster centres, one for each of the flower categories, and we can say that yes, there are three categories of flowers covered in this dataset. Now, suppose we get a new data instance and want to know which flower category it most closely resembles. The machine will find the cluster whose centre is closest to the new data point, and that cluster will be the category the data point belongs to.
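The clustering idea can be sketched with k-means, one common clustering algorithm (the blog does not say which algorithm produced its result, so treat this choice as an assumption). The measurements below are made up, and the number of clusters (3) and the starting centres are picked by hand to keep the run reproducible.

```python
import math

# Hypothetical (petal_length_cm, petal_width_cm) measurements, made up for
# illustration. Notice that there is no label column at all.
flowers = [
    (1.4, 0.2), (1.5, 0.2), (1.3, 0.3),
    (4.5, 1.5), (4.7, 1.4), (4.4, 1.3),
    (6.0, 2.5), (5.9, 2.1), (6.3, 2.4),
]


def nearest_centre(point, centres):
    """Index of the cluster centre closest to `point`."""
    return min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))


def k_means(points, centres, iterations=10):
    """Plain k-means: assign points to the nearest centre, then move each centre."""
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for p in points:
            clusters[nearest_centre(p, centres)].append(p)
        # Each centre moves to the average of its points (or stays put if empty).
        centres = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centre
            for cluster, centre in zip(clusters, centres)
        ]
    return centres


# Assumption: we ask for 3 clusters and start from three hand-picked points.
initial_centres = [flowers[0], flowers[3], flowers[6]]
centres = k_means(flowers, initial_centres)

# A new, unseen flower is placed in the cluster whose centre is closest.
new_flower = (4.6, 1.5)
cluster_id = nearest_centre(new_flower, centres)
print(centres, cluster_id)
```

Note that the machine only returns cluster numbers; deciding which flower category each cluster corresponds to is still up to us.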

3. Reinforcement Learning

Suppose you meet a monkey, and you decide to give him a banana every time he does a flip. Guess what he will do the next time he sees you? Yup, you guessed it right: he will flip and expect to get that banana. Now think of a cat drinking milk from a glass that you keep in front of her. If the cat pushes the glass, you snatch the milk away, and you do this a few times. Guess what the cat will do the next time a glass is kept in front of her? Yup, the cat will not push the glass, because that results in the milk being snatched away from her.

The above two examples show the reward or the punishment one gets when certain actions are performed in a certain environment. For the cat, the environment was the presence of milk and glass. For the monkey, the environment was you and the banana. Reward and punishment train the animal to perform certain actions or avoid certain actions in a specific environment. This is known as reinforcement learning, and similar things can be done with a machine.
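Here is a minimal sketch of the cat example as code. It is a heavily simplified, single-situation version of reinforcement learning, and the action names, the reward values (+1 and -1), the learning rate and the exploration probability are all assumptions chosen for illustration.

```python
import random

ACTIONS = ["push_glass", "leave_glass"]


def reward_for(action):
    """The environment: pushing the glass loses the milk, leaving it keeps the milk."""
    return -1.0 if action == "push_glass" else 1.0


def train_cat(episodes=200, learning_rate=0.1, explore_prob=0.2, seed=0):
    rng = random.Random(seed)                    # fixed seed so the run is repeatable
    value = {action: 0.0 for action in ACTIONS}  # estimated reward of each action
    for _ in range(episodes):
        if rng.random() < explore_prob:
            action = rng.choice(ACTIONS)          # explore: try something at random
        else:
            action = max(ACTIONS, key=value.get)  # exploit: best action known so far
        reward = reward_for(action)
        # Nudge the estimate towards the reward that was just observed.
        value[action] += learning_rate * (reward - value[action])
    return value


value = train_cat()
best_action = max(ACTIONS, key=value.get)
print(value, best_action)
```

After enough tries, the estimated value of pushing the glass is negative and the value of leaving it is positive, so our simulated cat prefers to leave the glass alone.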

Look at this famous Pacman example

source -> https://towardsdatascience.com/deep-q-network-combining-deep-reinforcement-learning-a5616bcfc207

The importance of relevance of Data

In the world of Machine Learning, everything depends on the data. Your data will influence the robustness and validity of your trained machine learning model like nothing else. If you go wrong in picking, selecting or collecting data, you can be sure that the final result of whatever you do will not be up to the mark. So the suitability of the data is the first and most important thing you need to take care of. There is no such thing as absolutely right data; it is the relevance of the data to your case that matters.

Suppose you want to predict the price of a house and you can choose between two datasets. The first has the feature columns (house area, no. of rooms, locality, surroundings, electricity), and the second has the feature columns (house area, no. of rooms, locality, surroundings, electricity, availability of water). All of these things contribute to the price of a house, but the second dataset includes one more relevant feature than the first. Hence, it is safe to say that dataset 2 is the better one to move forward with for our use case.

Steps every Machine Learning model takes:

Data Preprocessing
Exploratory Data Analysis
Model training
Model evaluation
Model Deployment

These are just brief introductions to the topics. They all will be covered in upcoming blogs.

1. Data Preprocessing

Once you have the right data, you can proceed with Data Preprocessing. Data Preprocessing deals with making the data more suitable for the later steps, so that the machine can learn from cleaner data. For example, it takes care of data points where column values are missing or the data point is otherwise incomplete.
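As a small taste of preprocessing, the sketch below fills in missing values with the average of the column, which is one common strategy among several. The house records and column names are made up for illustration.

```python
from statistics import mean

# Hypothetical house records; None marks a missing value.
houses = [
    {"area_sqft": 1200, "rooms": 3},
    {"area_sqft": None, "rooms": 2},
    {"area_sqft": 1800, "rooms": None},
    {"area_sqft": 1500, "rooms": 4},
]


def fill_missing_with_mean(rows, column):
    """Return copies of `rows` where missing values in `column` become the column mean."""
    known = [row[column] for row in rows if row[column] is not None]
    column_mean = mean(known)
    return [
        {**row, column: column_mean if row[column] is None else row[column]}
        for row in rows
    ]


cleaned = fill_missing_with_mean(houses, "area_sqft")
cleaned = fill_missing_with_mean(cleaned, "rooms")
print(cleaned)
```

Another option would be to drop the incomplete rows entirely; which one is better depends on how much data you have and why the values are missing.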

2. Exploratory Data Analysis

We try to understand the data visually: we look for patterns and for how the different columns are related to each other. Based on that understanding, we prioritize different columns.
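Plots are the usual tool here, but a quick numeric check can complement them. The sketch below computes the Pearson correlation between each feature column and the price on made-up house data; the column names and values are assumptions for illustration only.

```python
from statistics import mean

# Hypothetical house data, made up for illustration.
area_sqft = [800, 1000, 1200, 1500, 1800]
rooms = [2, 2, 3, 4, 4]
price = [40, 48, 60, 75, 88]


def pearson(xs, ys):
    """Pearson correlation: close to +1 or -1 means a strong linear relationship."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


for name, column in [("area_sqft", area_sqft), ("rooms", rooms)]:
    print(name, round(pearson(column, price), 3))
```

Columns that are strongly correlated with the value we care about are good candidates to prioritize, although correlation only captures straight-line relationships.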

3. Model Training

In this step, we try out different methods of training our machine on our data and select the model that gives us the best performance.

4. Model Evaluation

To evaluate which model is best for our use case, we test each model on different metrics, and the model with the best results is selected for final use.
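Training and evaluation can be sketched together: we train two candidate models on one part of the data and compare them on data they have never seen. The data, the two candidate models (a "majority" baseline and a nearest-neighbour model) and the use of accuracy as the metric are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical labelled data: (height_cm, weight_kg) -> category.
train = [
    ((25, 4.0), "cat"), ((23, 3.5), "cat"), ((26, 4.5), "cat"),
    ((55, 22.0), "dog"), ((60, 28.0), "dog"),
]
# Held-out data the models never see during training.
test = [((24, 3.8), "cat"), ((58, 25.0), "dog"), ((62, 30.0), "dog")]


def majority_model(train_rows):
    """Baseline: always predict the most frequent label in the training data."""
    most_common = Counter(label for _, label in train_rows).most_common(1)[0][0]
    return lambda features: most_common


def nearest_neighbour_model(train_rows):
    """Predict the label of the most similar training example."""
    def predict(features):
        def sq_dist(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], features))
        return min(train_rows, key=sq_dist)[1]
    return predict


def accuracy(model, rows):
    """Fraction of rows for which the model predicts the correct label."""
    correct = sum(model(features) == label for features, label in rows)
    return correct / len(rows)


candidates = {
    "majority": majority_model(train),
    "nearest_neighbour": nearest_neighbour_model(train),
}
scores = {name: accuracy(model, test) for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
print(scores, best_name)
```

Accuracy is only one possible metric; depending on the use case, other metrics may matter more.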

5. Model Deployment

Now that we have trained our machine on our data, we need to deploy it to make it usable. That means keeping it somewhere (generally the cloud) so that users can access it effectively and easily.

Note: It is not necessary to perform all 5 of these steps sequentially every time. We can switch back and forth between them depending on the use case, the situation and our place in the project lifecycle.
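Real deployments involve servers and cloud services, but the core idea can be sketched locally: save the trained model somewhere, then load it wherever predictions are needed. In this sketch a temporary folder stands in for cloud storage, and the "model" is a hypothetical set of cluster centres; the file name and function names are assumptions for illustration.

```python
import json
import os
import tempfile

# Hypothetical "trained model": cluster centres learned during training.
model = {"centres": [[1.4, 0.23], [4.53, 1.4], [6.07, 2.33]]}


def save_model(model, path):
    """Write the model to a file so that it outlives the training program."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model, f)


def load_model(path):
    """Read a previously saved model back from its file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def predict(model, point):
    """Return the index of the stored centre closest to `point`."""
    def sq_dist(centre):
        return sum((a - b) ** 2 for a, b in zip(centre, point))
    centres = model["centres"]
    return min(range(len(centres)), key=lambda i: sq_dist(centres[i]))


# Training side: store the model (a temp folder stands in for cloud storage).
model_path = os.path.join(tempfile.mkdtemp(), "model.json")
save_model(model, model_path)

# Serving side: load the stored model and answer a user's request.
served_model = load_model(model_path)
print(predict(served_model, [4.6, 1.5]))
```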

Conclusion

This blog introduced you to the world of Machine Learning: how Machine Learning is divided into different subcategories, why the relevance of data is so important, and what steps need to be taken to build a good Machine Learning model. This is just an introductory blog, the first of its series. In the upcoming blogs, we will dive deeper into each of these topics, and you will learn how to build Machine Learning models and improve on them.

In the next blog, we will be learning about the prerequisite maths required for Machine Learning.

Next Blog in the Series : Foundation of Machine learning — 1