Tree Ensembles: Bagging, Boosting and Gradient Boosting
Theory and practice explained in detail
A tree ensemble is a supervised machine learning technique that combines a set of individually trained decision trees, known as weak or base learners, each of which may not perform well on its own. Aggregating the weak learners produces a new, strong model that is often more accurate than any of its individual members. There are three main types of ensemble learning methods: bagging, boosting, and gradient boosting. Each method can also be applied to other kinds of weak learners, but in this post only trees are considered.
The rest of the article is divided into two sections:
- Intuition & History. Explains the origin of each ensemble learning method and gives a short description of it.
- Practical Demonstration. Works through each ensemble learning method step by step, using a small synthetic dataset introduced for this purpose.
All images unless otherwise noted are by the author.
Intuition & History
Bagging
The term was first defined by Breiman (1996) [1], and it is an acronym for Bootstrap Aggregation. For this ensemble, each decision tree uses as input data…