Layman’s answers to beginners’ questions about “Machine Learning”

Kayk Waalk
9 min read · Dec 7, 2018

I would like to answer some of the basic questions on ‘Machine Learning’ that are frequently asked by beginners. Of course, when I was a beginner I too had similar questions, and they motivated me a lot to explore the subject and its applications.

Please be patient and read through all the questions on this page. I believe that, by the end, you will have a good gist of the concept of “Machine Learning”.

1. What is “Machine Learning”?

Yes, machines (i.e. computers) can learn. This statement naturally raises two questions: what do they learn, and how do they learn? I will address both in this explanation.

To start with, assume you are a software developer who has been asked to develop an application that takes a person’s weight and height (let’s call these attributes “features”) as input and tells whether that person has to visit the doctor or not. (Note that you are predicting a value for “visit” here; the value can be either “yes” or “no”.)

Yes, as an application developer this is not a challenging task for you: you will form “some rules” such that, for example, if Weight ≤ SomeValue and Height ≤ SomeValue, then the application recommends the person to visit the doctor. If the application outputs “yes”, it means visit the doctor; “no” means there is no need to visit the doctor. (A tiny sketch of this rule-based approach follows.)
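To make the idea concrete, here is a minimal rule-based sketch in Python. The function name `should_visit_doctor` and the threshold values `SOME_WEIGHT` and `SOME_HEIGHT` are invented purely for illustration; they are not from any real application or medical guideline.

```python
# A hand-written, rule-based "recommender" — no learning involved.
# SOME_WEIGHT and SOME_HEIGHT are invented thresholds, purely for illustration.
SOME_WEIGHT = 45   # kg
SOME_HEIGHT = 150  # cm

def should_visit_doctor(weight, height):
    # The "rule" a developer might hand-code from the requirement.
    if weight <= SOME_WEIGHT and height <= SOME_HEIGHT:
        return "yes"
    return "no"

print(should_visit_doctor(40, 145))  # -> "yes"
print(should_visit_doctor(70, 175))  # -> "no"
```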

That’s good, but to make the application better, your business lead asks you to consider some more features as input, such as gender, age, eyesight, body temperature, and number of minutes of exercise per day. As per the requirement, you modify your application by forming “more rules” based on these features. This time your application makes better recommendations.

That’s better, but to improve the application further, your business lead asks you to consider even more features: How many cigarettes does the person smoke per day/week/month? How much alcohol does the person consume per day/week/month? How many litres of water does the person drink per day? And many more such questions get added to the list. :)

Now, as a software developer, you can imagine how difficult it is to build an application for such a huge list of features, right? It is too difficult to form meaningful “rules” for so many features. And if, in the future, more features are added or removed, you have to tweak the application all over again. Obviously this is the kind of problem that regular application development practices can’t deal with.

This is where “Machine Learning” comes to the rescue: certain “algorithms” can take this huge list of features, learn (i.e. understand / figure out) the patterns in the data, and make appropriate, realistic predictions.

For example, if the feature list changes again, no worries: just supply the changed feature list/dataset to the chosen machine learning algorithm so that it learns the new patterns in the data and makes better predictions.

No matter how many times the feature list changes, your algorithm does not change. Recollect that this was not the case with your regular software application: every time the feature list changed, you had to form new rules and tweak the code manually.

To summarise, certain problems like the one above (prediction problems) can’t be solved by regular application development practices. Instead, they need very specific algorithms called “Machine Learning Algorithms”: you choose a specific algorithm (there are many to choose from) and supply the feature list (historical data) to it as input, so that it can learn the patterns in that data and make realistic predictions. (A rough sketch follows.)
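Here is a rough sketch of that idea, assuming Python and scikit-learn (both mentioned later in this post) and a tiny, made-up dataset of weight, height and past doctor visits; it is an illustration, not the only way to do it.

```python
# A rough sketch: instead of hand-writing rules, let an algorithm learn them
# from historical data. The records below are made up purely for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each row is [weight_kg, height_cm]; each label says whether that person
# ended up needing to visit the doctor.
X = [[40, 145], [70, 175], [45, 150], [90, 180], [50, 155], [80, 170]]
y = ["yes",     "no",      "yes",     "no",      "yes",     "no"]

model = DecisionTreeClassifier().fit(X, y)    # the algorithm "learns" the patterns
print(model.predict([[42, 148], [85, 178]]))  # predictions for new people
```

If the feature list changes (say, age and exercise minutes are added), you only change the columns of X and retrain; the algorithm itself stays the same.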

What do machines learn? How do machines learn?

Actually, to be specific, machines (i.e. a computer as hardware) won’t learn anything. But a “Machine Learning Algorithm”, using that computational power, can learn (understand) patterns in a relevant dataset to make predictions. It would be very difficult for a human being to go through such a huge set of data and find patterns in it manually within a short time.

2. How many Machine Learning algorithms are there? Is there a way to categorise them?

There are many machine learning algorithms in the literature; out of them, some ten-plus algorithms are widely used to solve a variety of problems.

Yes, there are certain ways to categorise machine learning algorithms. I will use a popular one, which categorises them based on the data and the type of problem they can solve. There are 3 major categories:

  1. Supervised Learning algorithms
  • These algorithms assume that the historical input data (feature set) has a labelled column, e.g. features: height, weight, visit to doctor.
  • Here, height and weight are called input variables/features (independent variables), and the output feature ‘visit to doctor’ is called the output variable or label (dependent variable). The label can have values like “yes” or “no”, or “1” to represent yes/presence/true and “0” to represent no/absence/false.
  • In the above example, since our aim is to predict a discrete value or a sensible non-numerical value like “yes” or “no” as output, we call it a Classification problem. Examples of Classification algorithms are Logistic Regression (don’t be confused by the suffix in its name, it is actually a classification algorithm), SVM, Decision Trees, Random Forests, KNN, Naive Bayes, etc.
  • If our aim is to predict a numerical, continuous value as output, then it is called a Regression problem. Examples of Regression algorithms: Linear Regression, Lasso Regression, Polynomial Regression, Ridge Regression, ElasticNet, Random Forest Regressor, Support Vector Regressor, etc.

Note that both “Regression” and “Classification” algorithms are always treated as Supervised Learning algorithms. Also, algorithms like SVM and Random Forests can solve both Regression and Classification problems, as the short sketch below shows.
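A minimal sketch of that note, assuming scikit-learn; the feature columns and target numbers below are made up only to show that the same Random Forest family offers both a classifier and a regressor.

```python
# Random Forests come in both flavours: a classifier for discrete labels
# and a regressor for continuous values. The data below is invented.
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

X = [[25, 60], [40, 90], [35, 72], [55, 95], [30, 65], [60, 100]]  # e.g. [age, weight_kg]

# Classification: predict a discrete label ("yes"/"no").
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, ["no", "yes", "no", "yes", "no", "yes"])
print(clf.predict([[45, 85]]))   # -> a class label

# Regression: predict a continuous value (e.g. a made-up blood pressure reading).
reg = RandomForestRegressor(n_estimators=50, random_state=0)
reg.fit(X, [115.0, 135.0, 120.0, 142.0, 118.0, 150.0])
print(reg.predict([[45, 85]]))   # -> a number
```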

2. Unsupervised Learning algorithms

If the dataset does not have labels, and our aim is not to make predictions but to find some useful patterns, then we use unsupervised learning algorithms.

  • Based on some notion of similarity, you may want to form a few groups in the dataset. For example, you want to segment all the animals on the planet into groups like reptiles, mammals, amphibians, etc. This kind of problem is called a “Clustering” problem.
  • Examples of clustering algorithms are K-Means, Agglomerative Hierarchical Clustering, DBSCAN, etc.
  • Dimensionality Reduction techniques like PCA are another kind of unsupervised learning algorithm.
  • PCA helps us reduce the number of features in a dataset. Especially when you have thousands of features, if you want to reduce them to tens of features, you can use the PCA technique. (A short sketch of both ideas follows this list.)
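Here is a minimal sketch of both ideas, assuming scikit-learn; the data is random noise generated only for illustration, so the clusters themselves are not meaningful.

```python
# Unsupervised learning: no labels are supplied, only the features.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.rand(200, 50)                      # 200 samples, 50 features, no labels

# Clustering: group similar samples into 3 clusters.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(clusters[:10])                       # cluster id (0, 1 or 2) for the first 10 samples

# Dimensionality reduction: compress 50 features down to 10.
X_reduced = PCA(n_components=10).fit_transform(X)
print(X_reduced.shape)                     # (200, 10)
```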

3. Reinforcement Learning algorithms

These are very special algorithms. We will discuss them in detail in another post. Meanwhile, you can refer to this: https://en.wikipedia.org/wiki/Reinforcement_learning

3. What kind of problems can be solved by “Machine Learning Algorithms”?

As you can see from my explanation in the previous answer, provided a relevant dataset, “Machine Learning Algorithms” are suitable for “Prediction” (Regression, Classification) and Segmentation/Grouping (Clustering) problems, where a pattern has to be found in a huge set of data to make predictions. Here are some use cases:

Given the purchase history of a customer, predict how much discount can be offered to her on future purchases to make her a loyal customer. (Regression Problem)

Given a patient’s health screening data, predict whether the patient is prone to cancer or not. (Classification Problem)

Figure out whether a buyer is making a fraudulent transaction (Classification Problem). Figure out whether to approve a loan for a buyer (Classification Problem) and, if yes, how much (Regression Problem).

Perform object recognition in a video or an image (Deep Learning problem, involves classification).

Group your customers based on their purchase behaviour, monthly spend etc. (Clustering problem)

4. What does a “Model” mean in the context of Machine Learning?

A model results when you train an algorithm on appropriate data. In other words, a parameterised algorithm is called a “model”. Every time you train an algorithm with a different set of data, it produces a different model. (There is more to explain, but I feel this simple answer is good enough.) Here is an example:

ML Algorithm A + Training DataSet A = Model A

ML Algorithm A + Training DataSet B = Model B

ML Algorithm A + Training DataSet C = Model C
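As a small sketch of this (assuming scikit-learn’s LinearRegression and toy numbers invented for illustration), training the same algorithm on two different datasets yields two different models with different learned parameters:

```python
# Same algorithm, different training data -> different models (different parameters).
from sklearn.linear_model import LinearRegression

dataset_a = ([[1], [2], [3], [4]], [2, 4, 6, 8])     # roughly y = 2x
dataset_b = ([[1], [2], [3], [4]], [5, 8, 11, 14])   # roughly y = 3x + 2

model_a = LinearRegression().fit(*dataset_a)
model_b = LinearRegression().fit(*dataset_b)

print(model_a.coef_, model_a.intercept_)   # parameters learned from dataset A
print(model_b.coef_, model_b.intercept_)   # different parameters from dataset B
```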

5. How do Machine Learning Algorithms “learn” from data?

Every machine learning algorithm “learns” in its own way. That means it has its own mathematical/probabilistic/statistical way of learning the patterns in the data. “Learning” happens when you train the algorithm on a dataset. (“Fit” is another term that can be considered a synonym for “training”.)

For example, Linear Regression uses a linear equation like y = θ0*x0 + θ1*x1 + θ2*x2 + θ3*x3 + … + θn*xn (where x0 is a constant 1, the bias term), the θ’s are coefficients / parameters / weights, y is the label you want to predict a value for, and the x’s are features. When you train a Linear Regression algorithm on a dataset with features x1 … xn, it tries to find values for the θ’s. Finding the optimal values for the θ’s is called “learning”.

Once learning/training is completed, you get a parameterised equation like the one below; using it, you can predict values for y.

y = 2.59*x1 + 1.72*x2 + 6.8*x3 + … + 2.7*xn. This equation is called the “Linear Regression model” for the given set of features / dataset. Recollect the earlier statement that a model is an outcome of training.

You may ask: how does the algorithm find the optimal values for the θ’s? That is a different discussion altogether, but the sketch below gives a peek at one classical way.
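This is a minimal NumPy sketch of one classical approach, the “normal equation” theta = (X^T X)^-1 X^T y; the numbers are a toy dataset invented for illustration, and real libraries often use other methods (such as Gradient Descent, mentioned later in this post).

```python
# One classical way to find the optimal θ's for Linear Regression:
# the normal equation  theta = (X^T X)^-1 X^T y.  Toy data, for illustration only.
import numpy as np

X = np.array([[1.0, 2.0, 3.0],
              [1.0, 1.0, 5.0],
              [1.0, 4.0, 2.0],
              [1.0, 3.0, 4.0]])          # first column of 1s is the bias term x0
y = np.array([14.0, 15.0, 12.0, 17.0])

theta = np.linalg.pinv(X.T @ X) @ X.T @ y   # pseudo-inverse for numerical safety
print(theta)                                 # learned θ0, θ1, θ2
print(X @ theta)                             # predictions ŷ for the training rows
```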

For example, if you choose the “Naive Bayes Algorithm” to solve a classification problem, it uses a probabilistic way to “learn” the patterns in the data during the training phase.

Note that in Machine Learning, learning is never 100% perfect. Every ML algorithm ends up with some error when it learns; we have to accept that error and try to minimise it. If an algorithm learns the training data 100%, we should not trust it, because it is memorising (“by-hearting”) the data, and it will make bad predictions on real, unseen data. (A small sketch of this follows.)
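Here is a minimal sketch of that warning, assuming scikit-learn and synthetic data generated only for illustration: a decision tree with no depth limit typically scores 100% on the data it memorised yet does noticeably worse on data it has never seen.

```python
# Overfitting sketch: a model that "learns 100%" of its training data
# usually does worse on data it has never seen. Synthetic data for illustration.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # no depth limit
print("training accuracy:", tree.score(X_train, y_train))  # typically 1.0 (memorised)
print("test accuracy:    ", tree.score(X_test, y_test))    # noticeably lower
```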

As a Machine Learning engineer, you have to study the internal mechanisms of all the popular machine learning algorithms; only then can you use an appropriate algorithm efficiently to solve the particular problem at hand.

6. How do you evaluate if a Machine Learning algorithm has “learnt” well?

All “Regression” algorithms are meant to learn the patterns, and predict a continuous value.

All “Classification” algorithms are meant to learn the patterns, and predict mostly binary values, like 1 or 0. Here 1 represents yes/positive/true, 0 represents no/negative/false.

Since Regression and Classification algorithms have a different objective to learn from data, they should be evaluated differently.

It’s like this: if you ask a person to learn “singing”, then you have to evaluate her on “singing” metrics; if you ask a person to learn “playing chess”, then you have to evaluate her on “chess-playing” metrics. (No deep discussion on this 😀.)

In the same way, we have specific performance metrics to evaluate “Regression” problems; popular metrics are MSE (Mean Squared Error), RMSE (Root Mean Squared Error), etc.

And we have different performance metrics to evaluate “Classification” problems; popular metrics are precision, recall, F1 score, ROC/AUC curves, etc. (A small sketch of both kinds follows.)
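Here is a minimal sketch of both kinds of metrics, assuming scikit-learn’s metrics module; the “true” and “predicted” values are tiny made-up arrays, used only to show how the metrics are computed.

```python
# Regression metrics vs classification metrics, on tiny made-up predictions.
import numpy as np
from sklearn.metrics import mean_squared_error, precision_score, recall_score, f1_score

# Regression: compare predicted continuous values with the true ones.
y_true_reg = [3.0, 5.5, 7.2, 10.0]
y_pred_reg = [2.8, 6.0, 7.0, 9.5]
mse = mean_squared_error(y_true_reg, y_pred_reg)
print("MSE :", mse)
print("RMSE:", np.sqrt(mse))

# Classification: compare predicted labels (1/0) with the true labels.
y_true_cls = [1, 0, 1, 1, 0, 1]
y_pred_cls = [1, 0, 0, 1, 0, 1]
print("precision:", precision_score(y_true_cls, y_pred_cls))
print("recall   :", recall_score(y_true_cls, y_pred_cls))
print("F1       :", f1_score(y_true_cls, y_pred_cls))
```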

More on these evaluation metrics in other dedicated posts.

7. What should be learnt to master building Machine Learning Models?

To start with, it is essential to understand the underlying concepts of the items listed below. Once you are good with these, you can deepen your skills in specific topics as per your needs.

Data Preparation / Statistics / Probability: usual data cleaning techniques, Descriptive Statistics, Correlation, Variance, Normal Distribution, Binomial Distribution, Conditional Probability, Handling Missing Values (Imputation), Dummy Variable creation, Chi-Square method, Feature Engineering (feature extraction, feature selection).

Programming languages: either or both of the R programming language (focus on machine learning related packages) and the Python programming language (Pandas, NumPy, Scikit-learn), plus SQL.

Splitting data into Train & Test sets: Random Sampling, Stratified Sampling

Regression Algorithms: Linear Regression, Ridge Regression, Lasso Regression, ElasticNet, Polynomial Regression, Support Vector Regressor, RandomForestRegressor.

Classification Algorithms: Logistic Regression, Naive Bayes, KNN, Decision Trees, Random Forest, SVM, Neural Networks

Clustering / Dimensionality Reduction Algorithms: KMeans, Hierarchical Clustering, PCA.

Model building: K-Fold Cross-Validation, Model Overfitting, Underfitting, Bias, Variance, Model Evaluation, Performance Metrics, Regularisation, Optimisation, Gradient Descent / Stochastic Gradient Descent, Ensemble methods (Bagging, Gradient Boosting, AdaBoost).

8. What are the related fields of Machine Learning?

Some popular fields where Machine Learning skills are applicable are Deep Learning (image processing, video analytics, biological/astronomical research) and Natural Language Processing (NLP, i.e. text processing), among others.

<TO BE EXTENDED VERY SOON>
