The Three Bs: Bootstrapping, Bagging, & Boosting!

Rowan Curry
2 min read · Dec 16, 2021


Many important ensemble learning algorithms are built on the Three Bs. Understanding these fundamental machine learning concepts is essential to using ensemble learning techniques successfully.

If you’ve never encountered ensemble learning before, I suggest quickly checking out this resource.

The following article will take you through a short and sweet (and comprehensive!) explanation of these concepts.

BOOTSTRAPPING!

Bootstrapping is a simple concept used as a building block for more advanced ensemble methods, such as bagging and Random Forest.

Bootstrapping refers to bootstrap sampling. This is a resampling method that uses random sampling with replacement. “With replacement” means that it’s possible for an observation that was already chosen to be chosen again.
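To make “with replacement” concrete, here is a minimal sketch in plain Python. The function name `bootstrap_sample` and the toy data are made up for illustration, and the sample is drawn at the same size as the original data, which is the usual convention rather than a requirement.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) observations from data, with replacement."""
    # rng.choice never removes the item it picks, so the same
    # observation can be selected more than once.
    return [rng.choice(data) for _ in data]

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [2, 4, 6, 8, 10]  # hypothetical observations
sample = bootstrap_sample(data, rng)
print(sample)  # same length as data; repeated values are likely
```

Because every draw is made from the full data set, some observations typically show up several times while others are left out entirely.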

Bootstrapping has more applications than building algorithms. Under the assumption that the sample is representative of the population, bootstrap sampling can be conducted to provide an estimate of the sampling distribution of a sample statistic.

This application of bootstrap sampling can be super helpful when you’re dealing with a sample that’s not large enough to assume that the sampling distribution of your statistic is approximately normal, but you still need to estimate the parameters of a population.
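As a rough illustration of this idea, the sketch below approximates the sampling distribution of the mean by resampling a small sample many times, then reads off a percentile interval. The data values, the number of resamples (2,000), and the helper name `bootstrap_means` are all assumptions made for the example, and the percentile interval is only one of several bootstrap interval methods.

```python
import random
import statistics

def bootstrap_means(data, n_resamples, rng):
    """Return the mean of each of n_resamples bootstrap samples of data."""
    n = len(data)
    # rng.choices samples with replacement
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]

rng = random.Random(42)
data = [12.1, 9.8, 11.4, 10.3, 13.7, 9.1, 10.9, 12.6]  # made-up small sample
means = sorted(bootstrap_means(data, 2000, rng))

# Percentile interval: the middle 95% of the bootstrap means
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means))]
print(f"sample mean: {statistics.fmean(data):.2f}")
print(f"approximate 95% interval: ({lower:.2f}, {upper:.2f})")
```

The sorted list `means` is the bootstrap estimate of the sampling distribution. The interval is only as trustworthy as the assumption that the sample is representative of the population.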

BAGGING!

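Bagging, which is short for bootstrap aggregating, builds on bootstrapping. Bootstrap aggregating describes the process by which multiple models of the same learning algorithm are each trained on a different bootstrapped sample of the original data set. Their predictions are then aggregated, typically by majority vote for classification or by averaging for regression. Because the models are independent of one another, they can be trained in parallel.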
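Here is a small sketch of bagging for a binary classifier, using only the standard library. The base learner is a deliberately tiny one-feature threshold rule (a “decision stump”), and the names `fit_stump`, `bagging_fit`, and `bagging_predict`, as well as the toy data, are hypothetical choices for this example. Real projects would more likely reach for a library implementation.

```python
import random
from collections import Counter

def fit_stump(points):
    """Base learner: choose the threshold t with the fewest training
    errors for the rule 'predict 1 if x > t else 0'."""
    best_t, best_errors = None, None
    for t, _ in points:
        errors = sum((x > t) != bool(y) for x, y in points)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

def bagging_fit(points, n_models, rng):
    """Train one stump per bootstrap sample of the training data."""
    models = []
    for _ in range(n_models):
        sample = rng.choices(points, k=len(points))  # with replacement
        models.append(fit_stump(sample))
    return models

def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(int(x > t) for t in models)
    return votes.most_common(1)[0][0]

rng = random.Random(1)
# Toy (feature, label) pairs; the point (1.8, 1) is intentional noise
points = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0), (1.8, 1),
          (2.2, 1), (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
models = bagging_fit(points, n_models=25, rng=rng)  # odd count avoids ties
print(bagging_predict(models, 0.0), bagging_predict(models, 10.0))
```

Each stump sees a slightly different version of the data, so the learned thresholds differ, and the vote smooths out the quirks of any single model.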

BOOSTING!

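Boosting is sometimes thought of as Bagging 2.0, but the key difference is that the models are no longer independent. In this process, the individual models are built sequentially, with each new model trained to correct the errors of the models before it. In AdaBoost, for example, data points that were misclassified by the previous model are given more weight when the following model is trained. The final prediction combines all of the models, usually as a weighted vote or sum. When the individual models are weak learners, this typically makes the combined model considerably more accurate than any one of them, although boosting too long on noisy data can lead to overfitting.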
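The reweighting idea can be sketched with a bare-bones AdaBoost-style loop in plain Python. This is a simplified illustration rather than a production implementation: the weak learner is a one-feature threshold rule, labels are assumed to be +1 or -1, and the function names and the ten-point toy data set are made up for the example.

```python
import math

def stump_predict(stump, x):
    threshold, polarity = stump
    return polarity if x < threshold else -polarity

def fit_stump(xs, ys, weights):
    """Return (weighted_error, stump) for the lowest weighted error."""
    points = sorted(set(xs))
    thresholds = [(a + b) / 2 for a, b in zip(points, points[1:])]
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            stump = (threshold, polarity)
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(stump, x) != y)
            if best is None or error < best[0]:
                best = (error, stump)
    return best

def adaboost_fit(xs, ys, n_rounds):
    weights = [1 / len(xs)] * len(xs)  # start with equal weights
    ensemble = []
    for _ in range(n_rounds):
        error, stump = fit_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)  # this model's vote
        # Misclassified points (y * prediction == -1) gain weight
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
        ensemble.append((alpha, stump))
    return ensemble

def adaboost_predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]  # no single threshold separates these
ensemble = adaboost_fit(xs, ys, n_rounds=3)
predictions = [adaboost_predict(ensemble, x) for x in xs]
accuracy = sum(p == y for p, y in zip(predictions, ys)) / len(ys)
print(f"training accuracy after 3 rounds: {accuracy:.0%}")
```

On this toy data, no single threshold rule gets more than seven of the ten points right, while the weighted vote of three rules classifies all ten training points correctly.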

And there you have it! A quick and clear explanation of the three Bs.


Rowan Curry

Data Scientist. Very excited about all things data. All views are my own.