Statistical Learning vs Machine Learning

Subtle differences

Phani Srikanth
Jun 3, 2014 · 2 min read

There is a subtle difference between statistical learning models and machine learning models.
Statistical learning involves forming a hypothesis before we build a model. The hypothesis may include certain assumptions, which we validate after building the model.

For example, let us consider Linear Regression (LR), which is an example of a statistical model. While building an LR model, we make a set of three assumptions.

  • The residuals follow a normal distribution with mean zero.
  • The attributes in the dataset are independent of one another.
  • The data is homoscedastic, i.e. the residuals have constant variance.

The model is assumed to take the form Y = b1 + b2X, so we end up with an equation of precisely this form, with b1 and b2 being the unknown coefficients.

With the assumptions and the form of the equation fixed, we define a cost function and minimize it using methods like gradient descent, which gives us the LR model. We then diagnose the model to check whether the data actually follows the assumptions we made. If the assumptions are not fulfilled, we reject the initial hypothesis and start over.
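To make this workflow concrete, here is a minimal sketch in plain Python. It assumes mean squared error as the cost function and uses synthetic data, neither of which comes from the discussion above. The helper names (fit_line_gd, residuals, spread_ratio) are made up for illustration, and the spread check is only a crude stand-in for a proper homoscedasticity test.

```python
import random
import statistics


def fit_line_gd(xs, ys, lr=0.01, steps=5000):
    """Fit Y = b1 + b2*X by gradient descent on mean squared error."""
    b1, b2 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every point under the current coefficients.
        errors = [(b1 + b2 * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. b1 and b2.
        grad_b1 = 2.0 / n * sum(errors)
        grad_b2 = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        b1 -= lr * grad_b1
        b2 -= lr * grad_b2
    return b1, b2


def residuals(xs, ys, b1, b2):
    """Observed value minus fitted value for each point."""
    return [y - (b1 + b2 * x) for x, y in zip(xs, ys)]


def spread_ratio(xs, res):
    """Crude homoscedasticity check (an illustrative assumption, not a formal test):
    residual std dev for the upper half of X divided by that of the lower half.
    A value near 1 suggests roughly constant variance."""
    paired = sorted(zip(xs, res))
    half = len(paired) // 2
    low = [r for _, r in paired[:half]]
    high = [r for _, r in paired[half:]]
    return statistics.pstdev(high) / statistics.pstdev(low)


# Synthetic data that really does follow Y = 1 + 2X plus constant-variance noise.
random.seed(0)
xs = [i / 10 for i in range(100)]
ys = [1.0 + 2.0 * x + random.gauss(0, 0.5) for x in xs]

b1, b2 = fit_line_gd(xs, ys)
res = residuals(xs, ys, b1, b2)

print("b1 =", round(b1, 3), "b2 =", round(b2, 3))
print("mean residual =", statistics.mean(res))
print("spread ratio =", round(spread_ratio(xs, res), 3))
```

If the mean residual were far from zero, or the spread ratio far from 1, that would be a hint that the assumptions do not hold and the hypothesis should be revisited.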

So, our initial hypothesis certainly plays an important role in the case of statistical learning models.

But in the case of machine learning (ML) models, we run the ML algorithms directly on the data, allowing the data to speak for itself instead of steering it in a certain direction with our initial hypothesis or assumptions.

For example, while building a decision tree or random forest, we set up no hypotheses and run the algorithm directly. The ML algorithm returns the crucial features and their importance. We are not imposing any hypotheses that might affect the final model. The model learns entirely from the data, without any user-imposed conditions.
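As a rough illustration of an algorithm reporting feature importance with no equation supplied by the user, here is a sketch of a one-level decision tree (a decision stump) in plain Python. It is a deliberately simplified stand-in for a full decision tree or random forest: it scores each feature by the variance reduction of its best single split. The synthetic data and the helper names (best_split, stump_importances) are assumptions made for this example.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(rows, targets, feature):
    """Return (variance reduction, threshold) for the best split on one feature."""
    parent = variance(targets)
    n = len(rows)
    best_gain, best_thr = 0.0, None
    for thr in sorted({row[feature] for row in rows}):
        left = [t for row, t in zip(rows, targets) if row[feature] <= thr]
        right = [t for row, t in zip(rows, targets) if row[feature] > thr]
        if not left or not right:
            continue
        # Size-weighted variance of the two child nodes.
        child = (len(left) * variance(left) + len(right) * variance(right)) / n
        gain = parent - child
        if gain > best_gain:
            best_gain, best_thr = gain, thr
    return best_gain, best_thr


def stump_importances(rows, targets):
    """Normalize each feature's best variance reduction so the scores sum to 1."""
    gains = [best_split(rows, targets, f)[0] for f in range(len(rows[0]))]
    total = sum(gains)
    return [g / total for g in gains] if total > 0 else gains


# Synthetic data: only feature 1 drives the target; features 0 and 2 are noise.
random.seed(0)
rows = [[random.random() for _ in range(3)] for _ in range(200)]
targets = [(10.0 if row[1] > 0.5 else 0.0) + random.gauss(0, 1.0) for row in rows]

importances = stump_importances(rows, targets)
top_feature = max(range(3), key=lambda f: importances[f])

print("importances:", [round(v, 3) for v in importances])
print("most important feature index:", top_feature)
```

Nothing here says which feature should matter or what shape the relationship takes; the split search works that out from the data alone.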

Thus, machine learning models are said to be flexible in nature: the user doesn't tell the model how to build an equation or classifier, so the model is free to learn the data better!

Data Science | Analytics

Hoopla surrounding Big Data Analytics

www.phani.io | I enjoy working on Machine Learning, attending live concerts and following Test Cricket.
