Ensemble Learning and Its Relation With Bias and Variance

Ashish Patel
ML Research Lab
May 16, 2019

Ensemble learning series…!!!

Hey folks, after a long time I am back with one of the most important and in-demand topics in machine learning, one that will show you how to design a proper model for production. So I thought I would write a whole series on ensemble learning. After reading this series you will know:

  • What ensemble learning is and how it relates to bias and variance.
  • The types of ensemble learning, with practical examples.
  • How ensemble learning integrates with deep learning, which we will also cover in this series. So stay tuned.

In this article, we discuss how bias and variance relate to ensemble learning. So let’s begin…!!!

Ensemble learning Series — (Happy Learning…!!!)

  1. Ensemble Learning and Its Relation With Bias and Variance
  2. Ensemble Learning — The Heart of Machine Learning
  3. Bagging — Ensemble Meta-Algorithm for Reducing Variance
  4. Boosting — Ensemble Meta-Algorithm for Reducing Bias
  5. Stacking — Ensemble Meta-Algorithm for Improving Predictions
  6. Ensemble Learning’s Impact on Deep Learning

1. Why Do We Use Ensemble Learning?

There are many reasons to use ensemble learning (also called meta machine learning).

  • Among the available algorithms, some models are simple and need little computation, while others are complex and computationally expensive.
  • When we put a model into production, accuracy and computation time are the most important concerns.
  • Imagine you have trained a model with high accuracy, but it is too slow to be of any use in a real-time application. In another scenario, a simple model (simple algorithm) may be fast but have low accuracy because it does not fit the data properly. In both cases we end up compromising between accuracy and computation time.
  • To solve this problem, we can train models with different algorithms (weak learners) and average their results (a confidence index). This helps you build real-time applications with very good accuracy.
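The averaging idea above can be sketched in a few lines of Python. The three "weak learners" below are made-up threshold functions, not real trained models; the 0.5 decision cutoff is also just an assumption for illustration:

```python
# Hypothetical sketch: combine three weak learners by averaging their
# predicted scores to get a single "confidence index".

def weak_learner_a(x):
    return 0.70 if x > 0.5 else 0.30   # crude threshold model

def weak_learner_b(x):
    return 0.60 if x > 0.4 else 0.40

def weak_learner_c(x):
    return 0.80 if x > 0.6 else 0.20

def ensemble_predict(x):
    """Average the weak learners' scores; classify as 1 if the mean > 0.5."""
    scores = [f(x) for f in (weak_learner_a, weak_learner_b, weak_learner_c)]
    mean_score = sum(scores) / len(scores)
    return (1 if mean_score > 0.5 else 0), mean_score

label, confidence = ensemble_predict(0.7)
print(label, round(confidence, 2))  # -> 1 0.7
```

Each individual learner here is cheap to evaluate, so even this tiny ensemble stays fast enough for real-time use while the averaged score is more stable than any single learner's output.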

2. Reasons to Use an Ensemble

  1. The dataset is too large or too small — If the dataset is too large or too small, we can draw samples, train a model on each sample, and average the results.
  2. Complex (non-linear) data — Real-world datasets are mostly non-linear. A single model often cannot define the class boundary clearly and becomes under-fit. In that case we can take different sub-samples and average the predictions of several models.
  3. High confidence — When we train multiple models and most of them predict the same class, that agreement itself gives high confidence in the prediction.
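The "high confidence through agreement" point can be made concrete with a majority vote. This is my own minimal sketch, not code from the article; the class labels and the five-model setup are invented for illustration:

```python
from collections import Counter

# Hypothetical sketch: majority vote over several models' class predictions.
# The vote share of the winning class serves as a simple confidence measure.

def majority_vote(predictions):
    """Return (winning class, fraction of models that voted for it)."""
    counts = Counter(predictions)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(predictions)

# Five models predicting one of two classes for the same input
votes = ["cat", "cat", "dog", "cat", "cat"]
label, confidence = majority_vote(votes)
print(label, confidence)  # -> cat 0.8
```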

3. How to Create an Ensemble System

  • Each model should see a different slice of the population. We have to split the dataset into subsets in such a way that the subsets have low correlation with each other. The resulting classification models will then give you nearly independent results.
  • Each model should produce a different hypothesis. The results are expected to vary from model to model; this variety helps the ensemble generalize.
  • Based on the model category, we can view the data from different perspectives: linear or non-linear, supervised or unsupervised.
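A common way to get low-correlation subsets like those described above is bootstrap sampling (sampling with replacement), which bagging uses. This is a minimal sketch of the sampling step only, with a made-up dataset and seed:

```python
import random

# Hypothetical sketch: draw bootstrap subsets (sampling with replacement)
# so each model trains on a somewhat different view of the data, reducing
# the correlation between the resulting models.

def bootstrap_subsets(dataset, n_models, seed=42):
    """Return n_models subsets, each the size of the original dataset."""
    rng = random.Random(seed)
    return [
        [rng.choice(dataset) for _ in range(len(dataset))]
        for _ in range(n_models)
    ]

data = list(range(10))
for subset in bootstrap_subsets(data, n_models=3):
    print(subset)  # each subset repeats some points and omits others
```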

4. Quantification of Performance

  • Calculate the performance of a model by measuring the difference between the predicted output and the actual output.
Courtesy — Chris Albon
  • The error breaks down into three basic components: BIAS, VARIANCE, and IRREDUCIBLE ERROR.
  • We cannot reduce the irreducible error, but we can manage bias and variance.
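This decomposition (expected error = bias² + variance + irreducible error) can be checked numerically. The simulation below is my own sketch, not from the article: the quadratic target, the 0.8 shrinkage factor, and the noise level are all invented to produce a deliberately biased estimator:

```python
import random
import statistics

# Hypothetical simulation: estimate the bias and variance of a predictor
# at a single point x0 by "retraining" on many noisy observations.

random.seed(0)

def true_f(x):
    return x * x  # the unknown target function

def train_and_predict(x0, noise_sd=0.5):
    """Observe y = f(x0) + noise, then predict with shrinkage toward zero.
    The shrinkage makes the estimator deliberately biased."""
    y_obs = true_f(x0) + random.gauss(0.0, noise_sd)
    return 0.8 * y_obs

x0 = 2.0
preds = [train_and_predict(x0) for _ in range(10_000)]
bias = statistics.mean(preds) - true_f(x0)   # expect about 0.8*4 - 4 = -0.8
variance = statistics.variance(preds)        # expect about (0.8*0.5)**2 = 0.16
print(round(bias, 2), round(variance, 2))
```

Note that the irreducible part (the observation noise itself) stays no matter how the predictor is changed; only the bias and variance terms are under our control.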

5. Bias-Variance Trade-off

Bias Error: (High Bias, Under-fitting)

  • Bias is the average difference between the predicted and actual results. High bias means we are getting low performance.
Bias Error
  • When we train a linear model on complex data, we can see the errors E1, E2, E3 as the distances between the black dots and the red line. In plain terms, our model is not fitting the data properly and is under-performing; in this case we should use a more complex (polynomial) model.
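The under-fitting situation above can be reproduced with a toy least-squares fit. The data points below are invented (a clean quadratic), but the effect is general: a straight line fit to curved data leaves large, patterned residuals, which is the signature of high bias:

```python
# Toy example: fit y = intercept + slope*x by least squares to quadratic data.
# The structured residuals (positive at the ends, negative in the middle)
# show the model family is too simple for the data.

xs = [0, 1, 2, 3, 4]
ys = [x * x for x in xs]  # data generated by a quadratic, no noise

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print([round(r, 1) for r in residuals])  # -> [2.0, -1.0, -2.0, -1.0, 2.0]
```

No amount of extra data fixes this pattern; only a richer model (e.g. adding an x² term) removes the bias.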

Variance Error: (High Variance, Over-fitting)

  • Variance quantifies how much the predicted value changes for the same observation across different trainings; when it is high, the model is over-fitting.
  • A model with high variance shows near 100% accuracy on the training data, but when we test it on unseen data it fails to predict correctly.
  • This happens in two conditions: 1) too little training data, 2) a complex model trained on simple data.
Variance Error
  • In the figure above, we can see that our model performs well on the black training examples. But when we predict the value of the red dot, we see the error difference; that is variance error.
  • The problem of over-fitting can be solved by increasing the number of training instances or by choosing the right classifier for the task.
Courtesy — “Understanding the Bias-Variance Trade-off,” Towards Data Science.
  • In the figure above, imagine that the center of the circle (O) is the actual value and each cross (X) is a predicted value.
  • When we have high bias, we increase the model complexity to lower it; what we ultimately want is both low bias and low variance.
  • Managing bias and variance in a balanced way is where ensemble learning comes into the picture.
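The variance side of this balance is exactly what averaging attacks: the mean of n independent, unbiased predictors keeps the same bias but has roughly 1/n of the variance. The simulation below is my own sketch with invented numbers (true value 10, noise sd 3, 9 models):

```python
import random
import statistics

# Hypothetical simulation: averaging independent high-variance predictors
# lowers variance by roughly a factor of n while leaving bias unchanged.

random.seed(1)
TRUE_VALUE = 10.0

def noisy_predictor():
    # an unbiased but high-variance predictor (sd = 3)
    return TRUE_VALUE + random.gauss(0.0, 3.0)

single = [noisy_predictor() for _ in range(5_000)]
averaged = [statistics.mean(noisy_predictor() for _ in range(9))
            for _ in range(5_000)]

print(round(statistics.variance(single), 1))    # close to 9 (= 3**2)
print(round(statistics.variance(averaged), 1))  # close to 1 (= 9 / 9 models)
```

This is why bagging-style ensembles are framed as variance reducers, while boosting (covered later in this series) targets bias.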

I hope you are enjoying this article. Thanks for reading…!!!

