Machine Learning Fundamentals

Chaitali Thakkar
Published in CodeX
8 min read · Sep 10, 2021
(Source: https://images.squarespace-cdn.com/content/v1/5daddb33ee92bf44231c2fef/1586994429139-7FQY217XE8ZT7N4Z9QLH/bias-in-machine-learning.jpg?format=1000w)

Machine Learning is a term that attracts a great deal of hype and is written about extensively these days. This article explains the fundamentals of Machine Learning and its applications as plainly as possible.

How can we define Machine Learning?

Machine Learning is exactly what it sounds like: a machine learning something. The learning process begins with observing the data fed into the machine, which then looks for patterns in that data so that it can make decisions based on them.

Its central aim is to allow computers to learn automatically using certain algorithms, with little or no human intervention or assistance, and to adjust the model accordingly so that it fits the data.

In brief, Machine Learning is the study of computer algorithms that can improve automatically through the use of data.

Pre-requisite Terms you should know:

  1. Training Dataset: The set of data used to fit (train) the model. It lets the machine learn the patterns in the data and how they can be used further.
  2. Validation Dataset: The sample of data used to provide an unbiased evaluation of the model fit on the training dataset while tuning the model's hyperparameters (parameters whose values control the learning process). It is also called the development set.
  3. Testing Dataset: The set of data that is independent of the training set and is used to provide an unbiased evaluation of the final model fit on the training dataset.
  4. Labelled Data: A group of samples that have been tagged with one or more labels.
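
To make the three datasets above concrete, here is a minimal sketch of splitting a small labelled dataset. The function name `split_dataset`, the 60/20/20 proportions, and the toy even/odd data are assumptions chosen for illustration, not something prescribed by any library.

```python
import random

def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of the samples and cut it into train, validation and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever is left becomes the test set
    return train, val, test

# Labelled data: each sample is a (feature, label) pair.
data = [(x, "even" if x % 2 == 0 else "odd") for x in range(10)]
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))  # 6 2 2
```

The three parts never overlap, which is what lets the validation and test sets give an unbiased view of the model.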

Approaches of Machine Learning:

The Machine Learning approaches are broadly classified into four categories, which are as follows:

  1. Supervised Learning: One of the most common and easy-to-use approaches; the machines are trained using well-labelled training data.
  2. Unsupervised Learning: The type of machine learning in which models are trained using an unlabelled dataset and are allowed to act upon the data without any external supervision.
  3. Semi-supervised Learning: The type of machine learning that uses a small portion of labelled data and a large portion of unlabelled data, from which the model learns to make the required predictions.
  4. Reinforcement Learning: A type of machine learning that rewards desired outputs and penalises mistakes in order to train the machine.

Steps involved in implementing a Machine Learning Model:

  1. Determine the type of training dataset: Check whether it is numerical, categorical, time-series, or text-based data. Also check how many dependent and independent variables are present.
  2. Collect the labelled training data: Identify what your target is and which features reveal the patterns needed to predict that target.
  3. Split the dataset: Split the entire dataset into two or three parts based on the requirements: training set, validation set, and testing set.
  4. Determine the input features of the training dataset: Understand the features of the given dataset.
  5. Determine the suitable algorithm to train the model: After understanding the requirements, decide upon the most appropriate algorithm for the dataset: regression/classification, clustering/association, decision tree, etc.
  6. Execute the algorithm on the training dataset: Apply the chosen algorithm to the training data to train the model.
  7. Evaluate accuracy of the model: By running the model on the testing data, we can evaluate how accurate it is.
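
The steps above can be sketched end to end on a toy problem. Everything here is an assumption made for illustration: the height/weight data is hypothetical, and a nearest-centroid classifier stands in for whichever algorithm you would actually choose in step 5.

```python
# Steps 1-2: hypothetical labelled data, (height_cm, weight_kg) -> label.
data = [
    ((150, 50), "small"), ((152, 53), "small"), ((155, 55), "small"), ((158, 54), "small"),
    ((180, 85), "large"), ((183, 88), "large"), ((185, 90), "large"), ((188, 92), "large"),
]

# Step 3: hold out every fourth sample for testing and train on the rest.
test_set = data[3::4]
train_set = [s for i, s in enumerate(data) if i % 4 != 3]

def fit_centroids(samples):
    """Step 6: 'training' here just averages the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, samples):
    """Step 7: fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(centroids, f) == y for f, y in samples)
    return correct / len(samples)

model = fit_centroids(train_set)
test_accuracy = accuracy(model, test_set)
print(test_accuracy)  # 1.0 on this well-separated toy data
```

Real datasets are rarely this cleanly separated, so a perfect score here says more about the toy data than about the method.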

Supervised Learning:

(Source: https://www.tutorialandexample.com/wp-content/uploads/2020/11/Supervised-Machine-Learning-1.png)

Supervised Learning is a machine learning technique in which machines are trained using well-labelled data. Its main objective is to find a mapping function that maps the input variables to the output variable.

Supervised Learning can be divided into two types: Univariate Supervised Learning and Multivariate Supervised Learning.

Univariate Supervised Learning comprises one dependent variable and one independent variable.

Multivariate Supervised Learning comprises one dependent variable and more than one independent variable.

Types of Supervised Learning Algorithms:

Regression Analysis: Regression Analysis is a statistical method used to model the relationship between the input variables and the output variable when the output is continuous. There are multiple types of regression analysis: Linear Regression, Regression Trees, Non-linear Regression, Bayesian Linear Regression, and Polynomial Regression.
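
As a minimal sketch of the simplest case, the snippet below fits a univariate linear regression (one independent variable, one dependent variable) using the ordinary least-squares formulas. The function name `fit_line` and the sample points are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data generated from y = 2x + 1, so the fit should recover those values.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # predict the continuous output for a new input
print(slope, intercept, prediction)
```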

Classification: The classification method is used when the output variable is categorical, i.e., it takes one of a fixed set of classes (two classes such as yes/no, or more than two). The types of Classification algorithms include Logistic Regression, Support Vector Machines, K-Nearest Neighbours, Kernel SVM, Naïve Bayes, Decision Tree Classification, and Random Forest Classification.
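
To illustrate classification, here is a small sketch of K-Nearest Neighbours, one of the algorithms listed above, written in plain Python. The three-class toy data and the helper name `knn_predict` are assumptions for illustration; a real project would more likely use a library implementation.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k closest training points."""
    def sq_dist(sample):
        features, _ = sample
        return sum((a - b) ** 2 for a, b in zip(features, query))
    neighbours = sorted(train, key=sq_dist)[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical labelled training data with three classes.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
    ((9.0, 1.0), "C"), ((9.1, 0.8), "C"), ((8.9, 1.2), "C"),
]

print(knn_predict(train, (1.1, 1.0)))  # A
print(knn_predict(train, (5.1, 5.0)))  # B
print(knn_predict(train, (9.0, 0.9)))  # C
```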

Advantages of Supervised Learning:

  • The supervised learning model can predict outputs on the basis of what it learned from the training set.
  • We have an exact idea of the classes we are working with since they are labelled.
  • It helps us resolve real world issues.

Disadvantages of Supervised Learning:

  • It is often not well suited to very complex tasks.
  • It may not predict the correct output if the testing data differs substantially from the training data.
  • A lot of computation is required for training the model.

Real Life Applications of Supervised Learning:

  • Image Recognition: Image recognition is one of the most important examples of supervised machine learning. It detects patterns in images and makes suitable predictions. This can be used mainly for security and medical purposes.
  • Speech Recognition: Speech Recognition is the method where we can convert our spoken words into readable text. Our most famous voice assistants, such as Siri and Alexa, use this application.

Unsupervised Learning:

(Source: https://bigdata-madesimple.com/wp-content/uploads/2018/02/Machine-Learning-Explained2.png)

Unsupervised Machine Learning is a machine learning technique in which models are trained using an unlabelled dataset and are allowed to act upon it without any external human intervention or supervision.

Unsupervised learning is greatly helpful for finding useful insights from data. It is very similar to the way humans learn to think through their own experiences. This method works even when the input data has no corresponding output data.

Types of Unsupervised Learning Algorithms:

Clustering: Clustering is the method of grouping objects into clusters so that objects with the most resemblance remain in one group, while those with greater dissimilarities belong to other groups.
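
A minimal sketch of clustering with K-Means follows. The points, the choice of two clusters, and the hand-picked starting centroids are assumptions made so the result is reproducible; practical implementations usually pick starting centroids randomly or with a smarter scheme.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Unlabelled data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(clusters[0])  # the three points near the origin
print(clusters[1])  # the three points near (8.5, 8.8)
```

No labels were supplied at any point; the grouping comes purely from distances between the points.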

Association: Association is the method of finding relationships between variables in a large database. It shows how one variable can be directly or indirectly associated with another variable.
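
One common way to quantify an association is with support and confidence, two standard measures from association rule mining that this article does not itself name. The sketch below computes them for a hypothetical set of shopping baskets; the data and function names are assumptions for illustration.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)

# How often are bread and butter bought together, and how often does bread imply butter?
pair_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(pair_support, rule_confidence)
```

Here bread and butter appear together in three of the five baskets, and three of the four baskets containing bread also contain butter.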

The different types of algorithms include K-Means Clustering, Hierarchical Clustering, Anomaly Detection, Neural Networks, Principal Component Analysis, Independent Component Analysis, etc.

Advantages of Unsupervised Learning:

  • Unsupervised Learning can be used for solving complex tasks.
  • We get a greater freedom to explore the data present.

Disadvantages of Unsupervised Learning:

  • It is more difficult to work with, because there are no labels to check the results against.
  • The accuracy of the predictions may be lower.

Real Life Applications of Unsupervised Learning:

  • Audience Segmentation: This application segregates audiences based on their choices. OTT platforms give us recommendations based on our previous choices.
  • Inventory Management: Stores use this application to find connections between certain products through association.

Semi-supervised Learning:

(Source: https://cdn-images-1.medium.com/max/1600/1*XkhJYY6-oRYGq6s0Txfj1g.png )

Semi-supervised learning is the type of Machine Learning approach in which a small amount of labelled data and a large amount of unlabelled data are used to train the model.
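
One simple way to combine the two kinds of data is self-training, in which the model labels the unlabelled points it is most confident about and then retrains on them. The article does not name a specific method, so treat this as one possible sketch; the 1-D data, the nearest-centroid model, and the function names are all assumptions.

```python
def class_centroids(labelled):
    """Average the feature value of each class."""
    groups = {}
    for x, y in labelled:
        groups.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in groups.items()}

def self_train(labelled, unlabelled):
    """Repeatedly pseudo-label the unlabelled point closest to a class centroid."""
    labelled = list(labelled)
    remaining = list(unlabelled)
    while remaining:
        cents = class_centroids(labelled)
        # "Most confident" here means smallest distance to any class centroid.
        x, label = min(
            ((x, y) for x in remaining for y in cents),
            key=lambda pair: abs(pair[0] - cents[pair[1]]),
        )
        labelled.append((x, label))  # the pseudo-label joins the training data
        remaining.remove(x)
    return labelled

# A small labelled portion and a larger unlabelled portion.
labelled = [(1.0, "low"), (9.0, "high")]
unlabelled = [1.5, 2.0, 2.5, 3.0, 7.0, 7.5, 8.0, 8.5]
result = dict(self_train(labelled, unlabelled))
print(result)
```

Starting from only two labelled points, every unlabelled value ends up with a label, with the centroids drifting toward the data as pseudo-labels accumulate.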

Advantages of Semi-supervised Learning:

  • The algorithm used for semi-supervised learning is stable in nature.
  • It is highly efficient in nature.

Disadvantages of Semi-supervised Learning:

  • The accuracy may not be high.
  • The iteration results are not very stable.

Real Life Applications of Semi-supervised Learning:

  • Speech Analysis: Labelling audio files is a very labour-intensive task that requires a lot of human intervention, so often only a portion of the data is labelled.
  • Web Content Classification: Content present online needs to be classified based on our searches and the keywords present.

Reinforcement Learning:

(Source: https://cdn.datafloq.com/cms/2018/01/23/reinforcement-learning.png)

Reinforcement Learning is the type of Machine Learning approach that is about making decisions sequentially. The output depends on the state of the current input, and the next input depends on the output of the previous input. The decisions are dependent on one another, so labels are given to sequences of dependent decisions.

Types of Reinforcement Learning:

Positive Reinforcement: Positive reinforcement is when an event that occurs because of a particular behaviour increases the frequency of that behaviour. It also maximizes performance and sustains change for a prolonged period of time.

Negative Reinforcement: Negative reinforcement is defined as the strengthening of a behaviour because a negative condition is stopped or avoided. It increases the behaviour and helps maintain a minimum standard of performance.
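
To show how rewards and penalties shape sequential decisions, here is a minimal sketch using tabular Q-learning, a well-known reinforcement learning algorithm that the article does not itself name. The five-cell corridor, the reward values, and the fixed sweep over every state and action (used instead of a randomly exploring agent so the result is reproducible) are all assumptions for illustration.

```python
# Hypothetical corridor: cell 0 is a pit (penalty), cell 4 is the goal (reward).
N_STATES = 5
PIT, GOAL = 0, 4
ACTIONS = {"left": -1, "right": +1}

def step(state, action):
    """Return (next_state, reward, done) for one move in the corridor."""
    nxt = state + ACTIONS[action]
    if nxt == GOAL:
        return nxt, 1.0, True    # desired outcome is rewarded
    if nxt == PIT:
        return nxt, -1.0, True   # mistake is penalised
    return nxt, 0.0, False

def train(sweeps=200, alpha=0.5, gamma=0.9):
    """Apply the Q-learning update to every (state, action) pair, sweep after sweep."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(1, N_STATES - 1):  # non-terminal cells only
            for a in ACTIONS:
                nxt, reward, done = step(s, a)
                best_next = 0.0 if done else max(q[(nxt, b)] for b in ACTIONS)
                # Move the estimate toward reward + discounted best future value.
                q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(1, N_STATES - 1)}
print(policy)  # {1: 'right', 2: 'right', 3: 'right'}
```

The learned values make each decision depend on what follows it: moving right from cell 1 earns nothing immediately, yet it is preferred because it leads toward the goal.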

Advantages of Reinforcement Learning:

  • It can be used to solve very complex problems.
  • It achieves long-term results.

Disadvantages of Reinforcement Learning:

  • Too much reinforcement can lead to an overload of states and hence diminish results.
  • It is not preferable for solving simple problems.

Real Life Applications of Reinforcement Learning:

  • Gaming: The single-player games that we play require our device to make the next move, and that move is based on our previous move. Games such as Chess, Ludo, and UNO can use reinforcement learning.
  • Stock predictions: Reinforcement learning can be applied to stock predictions to model the market and its upcoming trends.
