Machine Learning Guide for Everyone: Introduction

Vaishnavi Ajmera
Published in VLearn Together
5 min read · Jun 11, 2020

We are entering a new age of technology. Machine learning, deep learning, and artificial intelligence are advancing rapidly, and within a few years we may have artificial intelligence assistants helping us in many aspects of our lives.

With this in mind, I am presenting a guide to machine learning for everyone. The main objectives of this series are:

  1. Creating a guide that helps readers easily understand the core concepts of machine learning.
  2. Leaving a digital footprint of my learnings and experiences in this subject, which I hope will inspire others, as well as me, to learn more about machine learning.

This is the first article of the series. In it, I discuss what machine learning is and give a brief introduction to the basic machine learning algorithms.

What is Machine Learning?

Machine Learning is the field of study concerned with giving computers the ability to learn without being explicitly programmed. In practice, it is a methodology for making predictions from collections of examples (i.e., data), which in turn helps us make better decisions.

Machine learning is now a widely available technology. It has given us the power to predict many things with the help of computers.

Examples of machine learning problems include: “what is the price of this house?”, “is this email spam or not?”, “will this person buy this item?”, “which of these is cheese?”, “what did you say?”, “is this person sick or not?”, and “what will the temperature be tomorrow?”. Many such problems are already solved by applying machine learning algorithms, and work continues on many more.

Classical machine learning algorithms are mainly classified into two types:

  1. Supervised Learning
  2. Unsupervised Learning

The key difference between the two is that one has a teacher and the other does not. In supervised learning, the teacher provides the observed result along with each example, while in unsupervised learning there is no teacher and only the examples are provided.

Supervised Learning

Supervised Learning refers to learning in which the dataset is a collection of labelled examples. The data is first processed into a machine-readable form, and the features relevant to the task are selected. The model then learns from these examples with already known outcomes, adjusting its inner parameters to fit the input dataset.
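To make the idea of a model adjusting its inner parameters concrete, here is a minimal sketch. It assumes a toy model with a single parameter `w` that predicts `w * x`, and it uses gradient descent on the mean squared error as the learning rule. The data, the learning rate, and the function name `train` are illustrative assumptions, not something prescribed by any particular library.

```python
# Labelled examples as (input, known outcome) pairs.
# The values are made up for illustration; they follow y = 2 * x.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def train(examples, learning_rate=0.01, epochs=200):
    """Fit the toy model y = w * x by gradient descent on mean squared error."""
    w = 0.0  # the model's single inner parameter, starting from a guess
    for _ in range(epochs):
        # Gradient of mean((w * x - y) ** 2) with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        # Move w a small step in the direction that reduces the error.
        w -= learning_rate * gradient
    return w


w = train(examples)
print("learned parameter:", round(w, 3))
```

With each pass over the labelled examples, `w` moves closer to the value that best explains the known outcomes, which is the sense in which the model "learns".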

Based on the type of result, values or categories, supervised learning algorithms are classified into two main categories: Regression and Classification.

1. Regression

Regression is the type of supervised learning in which we predict a number: given a set of predictor variables, the model estimates a continuous response variable. Quantitative variables can be continuous (taking any value within an interval) or discrete (taking countable values, such as whole numbers). Examples of regression include predicting the price of a car from its mileage, the price of a house from its number of rooms and plot size, or the temperature of the next day. Regression is also a common choice when a quantity depends on time, as in forecasting.

In regression, the machine tries to fit a line or smooth curve that captures the relationship between the variables. It is called Linear Regression when the fitted line is straight and Non-Linear (for example, Polynomial) Regression when it is curved.
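For the straight-line case, the fit can be computed directly with the ordinary least squares formulas. The sketch below assumes a single predictor, and the room counts and prices are hypothetical numbers chosen only to illustrate the house price example mentioned above.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: number of rooms and house price (in thousands).
rooms = [1, 2, 3, 4, 5]
prices = [110, 150, 205, 240, 295]

slope, intercept = fit_line(rooms, prices)
predicted_price = slope * 6 + intercept  # estimate for a 6-room house
print("slope:", slope, "intercept:", intercept, "prediction:", predicted_price)
```

The slope says how much the predicted price changes per extra room, and the fitted line can then be used to estimate a price for a room count that was not in the data.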

2. Classification

Classification is the type of supervised learning in which the goal is to predict categorical class labels. Class labels, or categorical variables, can be ordinal (having some order) or nominal (names only, with no ranking). A classifier separates objects into classes based on their attributes. It can be used for tasks such as classifying music by genre, spam detection, fraud detection, species identification, sentiment analysis, and language detection.

Classification can be further divided into Binary classification (for example, spam or not spam) and Multiclass classification (for example, classifying handwritten digits 0 to 9, i.e., 10 classes).

Popular classification algorithms include Logistic Regression, Naive Bayes, Decision Trees, K-Nearest Neighbours, and Support Vector Machines.
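As an illustration of one of these, here is a minimal K-Nearest Neighbours sketch for the binary spam example. The two features (number of links, number of exclamation marks), the sample emails, and the function names are assumptions made up for this example; a real spam filter would use far richer features.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(training_data, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest examples."""
    nearest = sorted(training_data, key=lambda item: squared_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled emails: (links, exclamation marks) and their class.
emails = [
    ((0, 0), "not spam"), ((1, 0), "not spam"), ((0, 1), "not spam"),
    ((7, 5), "spam"), ((8, 6), "spam"), ((6, 7), "spam"),
]

print(knn_predict(emails, (7, 6)))
print(knn_predict(emails, (1, 1)))
```

The classifier stores the labelled examples and, for a new email, looks at the labels of the most similar ones. Using an odd `k` avoids tied votes when there are only two classes.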

Unsupervised Learning

Unsupervised Learning refers to learning in which the dataset is a collection of unlabelled examples. Its main goal is exploratory data analysis, i.e., exploring the structure of the data and extracting something meaningful from it without reference to any known result. Because there are no labels to check against, unsupervised results are often harder to evaluate than supervised ones.

Unsupervised Learning has three main categories: Clustering, Dimensionality Reduction, and Association Rule Mining.

1. Clustering

Clustering is one of the most popular and widely used unsupervised learning techniques. It groups data points or objects that are similar to one another according to the given features. In other words, clustering is classification without predefined classes. Its applications include market segmentation, clustering genetic markers to identify family ties, and analyzing and labelling new data.

Why clustering? Common reasons include exploratory data analysis, summary generation, outlier detection, finding duplicates, and use as a pre-processing step.

Types of clustering algorithms include K-means Clustering, Hierarchical Clustering, Mean-Shift, and DBSCAN.
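To show how grouping without labels can work, here is a minimal K-means sketch in plain Python. It assumes two-dimensional points and that the starting centroids are supplied by the caller; the data and helper names are hypothetical. Practical implementations usually choose starting centroids more carefully and handle many more edge cases.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(point, centroids[i]))
        clusters[nearest].append(point)
    return clusters


def mean_point(cluster):
    """Coordinate-wise average of the points in a cluster."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def kmeans(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points and moving centroids until stable."""
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        updated = [mean_point(c) if c else old for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical unlabelled data with two visible groups.
customer_points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(customer_points, initial_centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

No labels are given anywhere: the two groups emerge only from how close the points are to each other.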

2. Dimensionality Reduction

In practice, we often have to work with datasets that have a large number of features, in other words, high dimensionality. This increases computation time and can hurt the performance of the model. To deal with this issue, we use Dimensionality Reduction.

It works by finding correlations between features, removing redundant information, and combining the remaining features into a smaller set of higher-level ones. It also helps remove noise from the data. It is used in recommender systems, fake image analysis, etc.

Common algorithms include Principal Component Analysis, Non-negative Matrix Factorization, Latent Dirichlet Allocation, and Linear Discriminant Analysis (which, unlike the others, makes use of class labels).
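The idea behind Principal Component Analysis can be sketched on a tiny case: two correlated features reduced to one. The example below assumes two-dimensional data and finds the main direction of variation with power iteration on the covariance matrix. The measurements and the function name are made up for illustration, and this simplified approach is only one of several ways to compute principal components.

```python
import math


def first_principal_component(points, iterations=100):
    """Return the main direction of variation and each point projected onto it."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)

    # Power iteration: repeatedly multiply by the covariance matrix and normalise.
    # Assumes the starting vector is not orthogonal to the main direction.
    vx, vy = 1.0, 0.0
    for _ in range(iterations):
        nx = cxx * vx + cxy * vy
        ny = cxy * vx + cyy * vy
        norm = math.hypot(nx, ny)
        vx, vy = nx / norm, ny / norm

    # Each 2-D point becomes a single number along the main direction.
    projected = [x * vx + y * vy for x, y in centered]
    return (vx, vy), projected


# Hypothetical measurements where the two features are strongly correlated.
measurements = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.0)]
direction, reduced = first_principal_component(measurements)
print("direction:", direction)
print("reduced data:", reduced)
```

Because the two features carry nearly the same information, one number per point along the found direction preserves most of the variation in the original data.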

3. Association Rule Mining

Association Rule Mining covers the methods and techniques used to analyze customer purchase data and to automate marketing strategy.

Here, we try to find patterns in customers’ buying trends and then suggest the items they are most likely to buy.
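Two measures commonly used to score such patterns are support (how often a set of items appears together) and confidence (how often a rule holds when its left-hand side is present). The sketch below computes both on a handful of hypothetical shopping baskets; the baskets and function names are assumptions for illustration only.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


print("support of {bread, butter}:", round(support({"bread", "butter"}, transactions), 2))
print("confidence of bread -> butter:", round(confidence({"bread"}, {"butter"}, transactions), 2))
print("confidence of butter -> bread:", round(confidence({"butter"}, {"bread"}, transactions), 2))
```

A rule with high support and high confidence, such as "customers who buy butter also buy bread" in this toy data, is a candidate for a product suggestion.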

In this article, we learned what machine learning is and got a brief introduction to various machine learning algorithms. As stated above, this is the first article of the series. More will follow, in which we will explore deeper concepts and new topics related to Machine Learning.

Stay Tuned! Happy Learning!
