Introduction to Machine Learning

A beginner’s guide to ML

Thisuri Bandaranayake
LinkIT
May 21, 2020

Photo by geralt on Pixabay

Machine Learning is a very familiar term in the world of technology, and it is one of the most interesting subfields of computer science. To a beginner, though, the term may not mean much. So, to begin the journey into Machine Learning (ML), let’s first get to know what ML is and what can be done with it.

What is Machine Learning (ML)?

Artificial Intelligence (AI) can be broadly divided into two areas: Machine Learning and Artificial Cognitive Systems. Both are inspired by the behaviour of the human brain, which achieves its capabilities through practice and through symbolic reasoning. ML is inspired by the brain’s ability to learn through training, so ML is about learning from a large number of examples, even without knowing the theory behind them. In the same way, Artificial Cognitive Systems are inspired by the symbolic reasoning ability of the human brain. Two widely quoted definitions of machine learning are as follows.

“The field of study that gives computers the ability to learn without being explicitly programmed.” - Arthur Samuel

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” - Tom Mitchell

Of these two definitions, Tom Mitchell’s is a bit more complex, but it gives a more modern and precise formulation. In simple words, ML is about making a machine learn from a given data set, just as humans learn things through practice.

How does Machine Learning Work?

Let’s consider how students study for an exam. Before the exam, they read notes, watch videos and lectures, and work through many questions to prepare themselves for every type of question that could come up. In this way, the students feed their brains with a good amount of valid data. They train the brain with knowledge so that they can easily find the approach or logic needed to solve a given question. Models in machine learning are built in the same way. A large data set is used to determine the parameters of the model. Once the parameters have been determined through training, we consider the model a trained model. This trained model is then used to find the output for a given input.
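The idea of determining parameters from data can be sketched in a few lines of Python. This is only a minimal illustration, assuming a made-up data set (hours studied against exam score) and a model with a single parameter w, y = w * x, trained by gradient descent. The data, the learning rate, and the function names are assumptions chosen for the example, not something the article prescribes.

```python
# Toy training data (made up for illustration): hours studied -> exam score.
# The underlying pattern is score = 10 * hours.
training_data = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0)]


def train(data, learning_rate=0.01, epochs=500):
    """Determine the single parameter w of the model y = w * x by gradient descent."""
    w = 0.0  # start from an uninformed guess
    n = len(data)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in data) / n
        w -= learning_rate * gradient
    return w


def predict(w, x):
    """Use the trained parameter to compute the output for a new input."""
    return w * x


w = train(training_data)
print(round(w, 3))                # the learned parameter, very close to 10
print(round(predict(w, 5.0), 1))  # prediction for an input not in the data
```

Here the training loop plays the role of the student practising: each pass over the data nudges w towards the value that best explains the examples.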

Machine Learning Vs Traditional programming

In traditional programming, we develop the program, that is, the logic, ourselves. When an input is given, the program runs on the machine and produces the output according to that logic. In machine learning, we instead feed the machine with data (both inputs and outputs) during training, and the machine creates the logic or program on its own. That logic is then evaluated during testing.
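The contrast can be sketched with a small Python example. It assumes a hypothetical task (deciding whether an exam score is a pass) and made-up data. In the traditional version the programmer writes the rule, while in the learned version the threshold is picked from labelled examples and then checked on separate test examples. The threshold search used here is a deliberately simple stand-in for a real learning algorithm.

```python
def passes_traditional(score):
    """Traditional programming: the programmer writes the rule by hand."""
    return score >= 50


def learn_threshold(examples):
    """Machine learning: pick the threshold that misclassifies the fewest examples.

    examples is a list of (score, passed) pairs, i.e. inputs with known outputs.
    """
    scores = sorted(score for score, _ in examples)
    # Candidate thresholds: midpoints between neighbouring scores.
    candidates = [(a + b) / 2 for a, b in zip(scores, scores[1:])]

    def errors(threshold):
        return sum((score >= threshold) != passed for score, passed in examples)

    return min(candidates, key=errors)


# Hypothetical labelled examples (input and output together) used for training.
training_examples = [(20, False), (35, False), (45, False),
                     (55, True), (70, True), (90, True)]
threshold = learn_threshold(training_examples)


def passes_learned(score):
    """The logic the machine created from the data."""
    return score >= threshold


# Testing: evaluate the learned logic on examples it has not seen.
test_examples = [(30, False), (48, False), (60, True), (85, True)]
accuracy = sum(passes_learned(s) == p for s, p in test_examples) / len(test_examples)
print(threshold, accuracy)  # 50.0 1.0
```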

Some Applications of Machine learning

  1. Predictions
  2. Speech recognition
  3. Image recognition
  4. Medical diagnosis
  5. Video surveillance
  6. Social media services
  7. Malware filtering
  8. Online customer guidance
  9. Online recommendations
  10. Fraud detection, etc.

In general, ML problems can be divided into three categories: Supervised Learning, Unsupervised Learning, and Semi-Supervised Learning.

What is Supervised Learning?

When the machine is trained with labelled data (data that contains both the input and the output values), it is known as supervised learning. Because both the input and the output are known in the data set, the algorithm keeps learning from the training data until it reaches an acceptable level of performance. This is just like a teacher supervising a student during the learning process, which is where the name comes from. In supervised learning, we predict a specific quantity or a label. Regression and classification are supervised learning techniques. Some popular supervised learning algorithms are linear regression, random forest, and support vector machines.
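A minimal sketch of supervised learning in Python follows. It assumes a made-up labelled data set (height and weight of animals, with a label for each) and uses a 1-nearest-neighbour rule, which is not one of the algorithms named above but is short enough to show the idea. The known labels of a separate test set are used to check how well the predictions match.

```python
import math

# Hypothetical labelled data: (height_cm, weight_kg) -> label.
train_set = [
    ((20.0, 4.0), "cat"), ((25.0, 5.0), "cat"), ((23.0, 4.5), "cat"),
    ((60.0, 25.0), "dog"), ((55.0, 22.0), "dog"), ((65.0, 30.0), "dog"),
]
# Held-out labelled data, used only to check the predictions.
test_set = [((22.0, 4.2), "cat"), ((58.0, 24.0), "dog")]


def predict(features, labelled_examples):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(labelled_examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]


correct = sum(predict(x, train_set) == y for x, y in test_set)
accuracy = correct / len(test_set)
print(accuracy)  # 1.0
```

The labels in `test_set` act like the teacher’s answer key: the model never sees them while predicting, but they let us measure whether it has reached an acceptable level.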

What is Unsupervised Learning?

When the machine is trained with data that is not labelled (the training data set has only input data, with no information about the output), it is called unsupervised learning. The model has to act on the information without guidance. In this approach there is no correct answer to learn from, just as there is no teacher, which is why it is called unsupervised learning. Here the task of the machine is to group data based on similarities, differences, and patterns. Clustering and association are unsupervised learning techniques. Some examples of unsupervised learning algorithms are K-means clustering and the Apriori algorithm.
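As a small illustration of association, the sketch below counts which pairs of items are often bought together in a set of made-up shopping baskets. It shows only the support-counting idea that association rule mining builds on. It is not the full Apriori algorithm, which also prunes candidate item sets level by level. The basket contents and the `min_support` value are assumptions for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; there are no labels, only the items bought.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def frequent_pairs(baskets, min_support):
    """Return item pairs that appear in at least min_support (a fraction) of baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


pairs = frequent_pairs(transactions, min_support=0.4)
print(pairs)
```

With this data, bread and butter appear together in three of the five baskets, so that pair has the highest support.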

What is Semi-Supervised learning?

When a machine is trained with a data set in which some data are labelled and some are not (a training data set where some output values are missing), we use a semi-supervised learning approach. In this approach, unsupervised learning is applied first to separate the data into groups, and then supervised learning is used to assign labels to these groups.
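The two steps described above can be sketched in Python under some simplifying assumptions: the data is one-dimensional and made up, there are exactly two groups, the grouping step simply splits the values at the widest gap (a stand-in for a real clustering algorithm), and every group contains at least one labelled point.

```python
from collections import Counter

# Hypothetical data: a few points have labels, most have None.
points = [1.0, 1.2, 0.8, 1.1, 5.0, 5.3, 4.9, 5.1]
labels = ["small", None, None, None, "large", None, None, None]


def split_at_largest_gap(values):
    """Unsupervised step: split 1-D values into two groups at the widest gap."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    widest = gaps.index(max(gaps))
    boundary = (ordered[widest] + ordered[widest + 1]) / 2
    return [0 if v < boundary else 1 for v in values]


def label_clusters(cluster_ids, known_labels):
    """Supervised step: name each group after the majority label of its labelled members."""
    votes = {}
    for cluster, label in zip(cluster_ids, known_labels):
        if label is not None:
            votes.setdefault(cluster, Counter())[label] += 1
    return {cluster: counter.most_common(1)[0][0] for cluster, counter in votes.items()}


cluster_ids = split_at_largest_gap(points)
cluster_names = label_clusters(cluster_ids, labels)
# Every point, labelled or not, now receives the label of its group.
completed = [cluster_names[c] for c in cluster_ids]
print(completed)
```

Only two of the eight points were labelled at the start, yet all eight end up with a label because each one inherits the label of the group it falls into.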

What are the main ML techniques?

  1. Classification: The machine learns from the input data and outputs a category or class for each observation. This is a supervised learning technique. The data set can have two classes (binary) or more than two (multi-class). After training, the model assigns each new observation to one of the classes present in the training data. Examples include identifying an email as spam or not spam and classifying a person as male or female.

Figure: Classifying emails as spam and non-spam

  2. Regression: The relationship between the input variables and the output is expressed as a linear or non-linear equation. Here the output is a numeric value, not a label or a category. This is also a supervised learning technique, so it can only be applied to labelled data. Regression is also described as curve fitting. As with the other techniques, the goal is to find the best-fitting model, that is, the function that gives the most accurate output.

Figure: An example of a linear model with only one input variable

  3. Clustering: The data set given to the machine is grouped into a set of clusters that are not known beforehand. This is an unsupervised learning approach that deals with unlabelled data. Here the machine has to discover patterns in order to group the data into clusters. Data points belonging to the same cluster are similar to one another. An example is grouping customers by their purchasing patterns.

Figure: A simple example of clustering
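The three techniques above can be sketched side by side in Python. All data below is made up, and each part is a deliberately simplified stand-in: a word-voting rule for classification (not a production spam filter), a least-squares straight line for regression, and a one-dimensional K-means with hand-picked starting centres for clustering.

```python
from collections import Counter

# 1. Classification: the output is a class ("spam" or "not spam").
labelled_mails = [
    ("win a free prize now", "spam"),
    ("free money win big", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
]


def train_classifier(mails):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in mails:
        word_counts[label].update(text.split())
    return word_counts


def classify(word_counts, text):
    """Each word votes for the label it was seen with more often."""
    score = sum(word_counts["spam"][w] - word_counts["not spam"][w] for w in text.split())
    return "spam" if score > 0 else "not spam"


spam_model = train_classifier(labelled_mails)
print(classify(spam_model, "free prize inside"))  # spam
print(classify(spam_model, "agenda for lunch"))   # not spam


# 2. Regression: the output is a number, here from a fitted straight line.
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0


# 3. Clustering: group unlabelled values, here monthly spend per customer.
def kmeans_1d(values, centroids, iterations=10):
    """Assign each value to its nearest centre, then move each centre to its group's mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty group keeps its previous centre.
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


centroids, clusters = kmeans_1d([10, 12, 11, 50, 52, 49], centroids=[10, 50])
print(clusters)  # [[10, 12, 11], [50, 52, 49]]
```

Note the difference in what each part needs: the classifier and the regression line are trained on inputs paired with known outputs, while the clustering function receives only the inputs.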

To decide which technique is suitable for a particular problem, we need to consider carefully what the analysis of the data set is meant to achieve, and then choose the technique that gives the best solution to that problem.

This was a simple and brief introduction to machine learning. I hope it helped you get a basic idea of what machine learning is. This is just the beginning, and there is a long way to go.

Cheers!!!!

Thisuri Bandaranayake
LinkIT

Undergraduate at Faculty of IT University of Moratuwa