Fundamentals of Machine Learning, Part 1

AKSHAY KUMAR RAY
Mar 10 · 4 min read

What is machine learning?

  • Online recommendations for movies on Netflix or products on Amazon? The answer is machine learning.
  • Fraud detection? The answer is machine learning.
  • Self-driving cars? The answer is machine learning.
  • Bank loan approval? The answer is machine learning.

We may not realize it, but we are living in a time when machine learning does much of the heavy lifting. We are steadily becoming heavy consumers of machine learning products, whether for shopping, movie recommendations, or health-related problems, and machine learning is becoming a crucial part of our day-to-day life. As the name suggests, machine learning is the ability of a machine to learn to mimic human tasks. In more technical terms, there are two commonly quoted definitions.

“A field of study that gives computers the ability to learn without being explicitly programmed.” (commonly attributed to Arthur Samuel)

“A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.” (Tom Mitchell)
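To make the second definition concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the definition: the task T is deciding whether a number is at least 50, the experience E is a growing set of labeled examples, and the performance measure P is accuracy on a fixed test set. The names `true_label`, `learn_threshold`, and `accuracy` are hypothetical.

```python
# Task T: predict whether a number is >= 50 (a hidden rule the learner must discover).
# Experience E: labeled examples. Performance P: accuracy on a fixed test set.

def true_label(x):
    """Hidden rule used only to label data; the learner never reads it directly."""
    return 1 if x >= 50 else 0

def learn_threshold(examples):
    """Place the decision threshold midway between the largest negative
    and the smallest positive example seen so far."""
    largest_negative = max(x for x, y in examples if y == 0)
    smallest_positive = min(x for x, y in examples if y == 1)
    return (largest_negative + smallest_positive) / 2

def accuracy(threshold, test_set):
    """Fraction of test points that the learned threshold classifies correctly."""
    correct = sum((1 if x >= threshold else 0) == y for x, y in test_set)
    return correct / len(test_set)

# Hypothetical training stream, revealed to the learner two examples at a time.
stream = [(x, true_label(x)) for x in [10, 70, 30, 60, 45, 52, 49, 50]]
test_set = [(x, true_label(x)) for x in range(100)]

scores = []
for n in (2, 4, 6, 8):
    threshold = learn_threshold(stream[:n])
    scores.append(accuracy(threshold, test_set))
    print(f"examples seen: {n}, threshold: {threshold}, accuracy: {scores[-1]:.2f}")
```

With this particular stream, the accuracy rises as more examples arrive, which is exactly what "performance on T, as measured by P, improves with experience E" means.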

Why do we rely on machine learning to do our work?

A single machine learning algorithm can outperform a program built from a lot of hand-tuning and long lists of rules. It can also find solutions to complex problems for which the traditional programming paradigm fails to give a good solution.

Machine learning systems can adapt to new data.

Machine learning systems are scalable and can work on terabytes of data.

Types of Machine Learning Methods:-

Machine learning systems can be classified by the kind of supervision they get during training. There are three major categories: supervised learning, unsupervised learning, and reinforcement learning.

Supervised Learning:-

In this type of learning, you feed the desired solutions, also known as labels, to the algorithm together with the inputs. In more technical terms, the training data is a set of (X, Y) pairs, where X is the “input” or “features” and Y is the “output” or “target variable”, and the algorithm learns the mapping function from the input to the output: Y = f(X). Real-life examples:-

  • Market prediction
  • Image classification

Supervised learning is typically used for:-

  • Regression Task:- The outputs, or values of Y, are continuous. Market prediction is an example of a regression task.
Regression model:- the x axis shows the feature and the y axis shows the continuous target value.

Algorithms for regression tasks:-

  1. Linear Regression
  2. Support Vector Machines (SVM), in the form of Support Vector Regression (SVR)
  3. Decision Tree
  4. Random Forest
  5. Lasso Regression
  6. K-Nearest Neighbors (KNN)
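As a rough illustration of a regression task, the sketch below fits a straight line to a handful of (X, Y) pairs using ordinary least squares, the idea behind simple linear regression. The data values and the helper name `fit_line` are assumptions made up for this example, and only the Python standard library is used.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: X is the feature, Y is a continuous target.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)

# The learned mapping Y = f(X) can now predict a value for an unseen input.
prediction = slope * 6 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction for x=6: {prediction:.2f}")
```

The output is a continuous number rather than a category, which is what separates regression from classification.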
  • Classification Task:- The outputs, or values of Y, are discrete. With two possible values the task is called “binary classification”; with more than two it is called “multi-class classification”. Spam filtering and sentiment classification are examples.
Binary class:- Class A and Class B are two classes separated by a line called the “decision boundary”.

Algorithms for classification tasks:-

  1. Logistic Regression
  2. Support Vector Machines (SVM), in the form of Support Vector Classification (SVC)
  3. Decision Tree
  4. Random Forest Classifier
  5. Naive Bayes Classifier
  6. K-Nearest Neighbors (KNN)
  7. Stochastic Gradient Descent (SGD) Classifier
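To show what a classification task looks like in code, here is a minimal K-Nearest Neighbors sketch for two classes, A and B. The training points and the function name `knn_predict` are hypothetical, and the example assumes Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled 2-D points: each entry is (features, label).
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

label_near_a = knn_predict(train, (2.0, 2.0))
label_near_b = knn_predict(train, (6.0, 7.0))
print(label_near_a, label_near_b)
```

Here the output is one of a fixed set of labels. KNN never draws the decision boundary explicitly; the boundary is implied by where the majority vote flips from A to B.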

Unsupervised Learning:-

As the name suggests, the machine is not supervised: the training data has no labels. The system tries to learn without a teacher, working on its own to discover patterns and information that were previously undetected.

Unsupervised learning algorithms let users take on more complex processing tasks than supervised learning, because no labeled data is required. However, the results can be more unpredictable than those of other learning methods and are harder to evaluate. Unsupervised learning tasks include clustering, anomaly detection, dimensionality reduction, and association rule learning.

Clustering is an important concept in unsupervised learning. It mainly deals with finding structure or patterns in a collection of uncategorized data. Clustering algorithms process your data and find natural clusters (groups) if they exist in the data.

Algorithms used for clustering:-

  1. K-means
  2. DBSCAN
  3. Hierarchical Cluster Analysis (HCA)
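The clustering idea can be sketched with a bare-bones K-means loop. This is a simplified illustration, not a production implementation: the points and the starting centroids are made up, the number of iterations is fixed, and the function name `kmeans` is an assumption. It needs Python 3.8 or later for `math.dist`.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain K-means: repeatedly assign each point to its nearest centroid,
    then move each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled data with two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]

# Deliberately poor starting centroids, both taken from the first group.
centroids, clusters = kmeans(points, centroids=[(1, 1), (2, 1)])
print(centroids)
print(clusters)
```

No labels are supplied anywhere. Even with both starting centroids inside the same group, the assign-and-update loop separates the two groups after a couple of iterations.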

Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection and feature extraction. Working with fewer, more informative features helps us build more robust models.

Algorithms used for dimensionality reduction:-

  1. Principal Component Analysis (PCA)
  2. Locally Linear Embedding (LLE)
  3. t-distributed Stochastic Neighbor Embedding (t-SNE)
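As a small feature-extraction example, the sketch below applies the idea behind PCA to two-dimensional data and reduces it to one dimension. It relies on a closed-form shortcut that only works for two features. General-purpose PCA implementations compute eigenvectors or a singular value decomposition instead. The data and the function names `pca_first_component` and `project` are assumptions for illustration.

```python
import math

def pca_first_component(points):
    """Return the mean and the unit direction of largest variance for 2-D data."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # For a symmetric 2x2 matrix, the leading eigenvector makes this angle with the x axis.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(angle), math.sin(angle))

def project(points, mean, direction):
    """Reduce each 2-D point to one number: its coordinate along `direction`."""
    return [(x - mean[0]) * direction[0] + (y - mean[1]) * direction[1] for x, y in points]

# Hypothetical data in which the second feature is exactly twice the first.
points = [(1, 2), (2, 4), (3, 6), (4, 8)]

mean, direction = pca_first_component(points)
reduced = project(points, mean, direction)
print(direction)
print(reduced)
```

Because the two features are perfectly correlated here, the single extracted feature keeps all of the variation in the data. With real data some information is lost, and the goal is to lose as little as possible.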

Association rule learning is an unsupervised learning technique that checks how one data item depends on another and maps those dependencies so that they can be put to profitable use, for example by identifying products that are often bought together. It looks for interesting relations or associations among the variables of a dataset and expresses them as rules.

Algorithms used for association rule learning:-

  1. Apriori
  2. Eclat
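The sketch below shows the two measures that association rule learning usually works with, support and confidence, on a tiny set of shopping baskets. It follows the pruning idea behind Apriori (only frequent single items are combined into candidate pairs) but stops at pairs, so it is a simplified illustration rather than a full implementation. The baskets, the `MIN_SUPPORT` threshold, and the function name `support` are assumptions.

```python
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]
MIN_SUPPORT = 0.4  # assumed threshold: an itemset must appear in at least 40% of baskets

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Pass 1: keep the single items that are frequent enough.
items = sorted({item for t in transactions for item in t})
frequent_items = [i for i in items if support({i}) >= MIN_SUPPORT]

# Pass 2: build candidate pairs only from frequent items, then filter again.
frequent_pairs = [
    set(pair) for pair in combinations(frequent_items, 2)
    if support(set(pair)) >= MIN_SUPPORT
]

# Turn each frequent pair into rules "antecedent -> consequent" scored by confidence.
rules = {}
for pair in frequent_pairs:
    for antecedent in sorted(pair):
        (consequent,) = pair - {antecedent}
        rules[(antecedent, consequent)] = support(pair) / support({antecedent})

for (antecedent, consequent), confidence in rules.items():
    print(f"{antecedent} -> {consequent}: confidence={confidence:.2f}")
```

A rule such as "butter -> bread" with high confidence says that baskets containing butter very often contain bread as well, which is the kind of dependency this technique is meant to surface.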

Reinforcement Learning:-

It is a very different kind of machine learning that works on the principle of reward and punishment. If the machine does a good job it is rewarded, and it is penalized if it doesn’t.
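The reward-and-punishment principle can be sketched with tabular Q-learning, one common reinforcement learning algorithm (the article itself does not name a specific one). The environment is a made-up line of five cells where cell 4 is a goal (reward +1) and cell 0 is a pit (penalty -1). The learning rate, discount factor, episode count, and random seed are all assumed values.

```python
import random

N_STATES = 5              # positions 0..4 on a line; 0 is a pit, 4 is the goal
ACTIONS = (-1, +1)        # step left or step right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (assumed values)

def step(state, action):
    """Move one cell; return (next_state, reward, done)."""
    next_state = state + action
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # reward for reaching the goal
    if next_state == 0:
        return next_state, -1.0, True   # penalty for falling into the pit
    return next_state, 0.0, False

rng = random.Random(0)  # fixed seed so the run is repeatable
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(2000):
    state, done = 2, False                # every episode starts in the middle
    while not done:
        action = rng.choice(ACTIONS)      # explore at random; Q-learning is off-policy
        next_state, reward, done = step(state, action)
        best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# The learned policy picks, in each non-terminal cell, the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in (1, 2, 3)}
print(policy)
```

Nobody tells the agent that moving right is correct. It only receives rewards and penalties, and the value table ends up favoring the moves that lead toward the goal and away from the pit.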

Analytics Vidhya
