Random forest: Bagging Example

Ensemble learning

Ibtissam Makdoun
3 min read · Jun 16, 2022

In the previous articles we defined bagging, highlighted how bagging reduces the overall error, and discussed when to consider using it. Now it is time to start exploring examples of the bagging method. One of the main algorithms based on bagging is Random Forest. In this article we will define random forest as an ensemble learning method, then explore how it trains on data and makes predictions.

Sounds interesting? Then let’s dive in.

Defining Random Forest

Random forest is an ensemble learning method that builds many independent deep decision trees using the bagging algorithm, then combines them (voting for classification, averaging for regression) to produce more accurate and stable predictions.
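To make this concrete, here is a minimal sketch using scikit-learn's RandomForestClassifier; the toy dataset and parameter values below are illustrative, not from this series.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative toy data (not from the article series).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Build 100 independent deep trees; the forest combines their
# predictions by majority vote at predict time.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
```

For regression, RandomForestRegressor works the same way but combines the trees by averaging their predictions instead of voting.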

Two parts of this definition are worth emphasizing.

  1. Independent: We need to make these decision trees independent for two reasons. First, when the trees are trained independently, we can parallelize the training process, making it much faster than boosting, where each tree depends on the previous one (see the sketch after this list). Second, the goal of each tree is to capture trends in the data, so we want the trees to be uncorrelated; if they are correlated, they will capture similar trends in the data (which makes them redundant).
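Here is a rough from-scratch sketch of that independence: each tree is fit on its own bootstrap sample, so no tree depends on another. The numbers here are illustrative, and a real random forest also samples a random subset of features at each split, which this sketch omits.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Illustrative toy data.
X, y = make_classification(n_samples=500, random_state=0)
rng = np.random.default_rng(0)

trees = []
for _ in range(25):
    # Each tree gets its own bootstrap sample (drawn with replacement),
    # so no tree depends on another and all could be trained in parallel.
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# Combine by majority vote across the independent trees (binary labels 0/1).
votes = np.stack([t.predict(X) for t in trees])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("training accuracy:", (ensemble_pred == y).mean())
```

Because the loop iterations share nothing, they can run on separate cores; in boosting, by contrast, tree k cannot start until tree k-1 has finished.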

