Published in

SyncedReview

# Intro

The effectiveness of ensemble methods is successfully proven by data competitions: 60% of winning solutions in Kaggle challenges used XGBoost (a popular boosting algorithm). An extremely strong statement from a winner of those competitions, Qingchen Wang just about sums it up: “I only use XGBoost”.

To combine the best properties of bagging and boosting, the author proposed a hybrid algorithm — InfiniteBoosting, which constructs an arbitrary large ensemble tree and trains every weak learner in a gradient boosting way. Due to its hybrid features, this algorithm builds a classifier reducing bias at each training step, while controlling the over-fitting at the same time.

# Bagging and Boosting

Following this idea, Random Forest algorithm is introduced naturally, since decision tree is a perfect candidate for weak learner. A single decision tree holds properties such as low bias and high variance. After aggregating a pack of trees, bias remains but variance goes down. We could achieve a constant bias that is as low as possible by constructing a large enough random forest.

Now let’s turn to boosting. The basic idea behind boosting is whether a weak learner could be modified to become better by focusing on improving its weakness. This is done by using the weak learning method multiple times to get a succession of hypotheses, each one refocusing on the examples that the previous ones found difficult and have misclassified. By doing so, boosting helps a weak learner grow into a strong one. The equation below shows that the final output of boosting is the weighted average (voting) of weak learners, where alpha_t is the weight calculated by considering the last iteration’s error.

The image below shows the typical learning procedure of boosting algorithms. Each learner learns from the failures of the last learner, and outputs correct result by accumulating all learners’ output.

# Analysis

The effectiveness of this algorithm is shown by the authors: it converges almost surely to the solution, as long as each tree’s output on the training data is bounded and continuous. Meanwhile, we should notice that the error of any machine learning algorithm is the sum of bias and variance. Bagging methods reduces the variance by making uncorrelations among learners to minimize the variance, but this causes bias to go up slightly. Boosting methods reduce bias by aggregating different models’ outputs, but causes variance to blow up. Thus, InfiniteBoosting, a hybrid of two methods, can never break through the limits of any of the two methods, and it may infinitely approach the results of bagging or boosting.

The obvious advantage is that InfiniteBoosting allows bagging or boosting to work on the tasks that they may not fit in the first place. In other words, this method is robust on different datasets or tasks. Given enough data, it could always achieve one of the best performance according to bagging and boosting.

However, to converge to a stationary point, it requires large enough datasets and longer training time. I guess this disadvantage will not impede too many applications, since data is everywhere nowadays, and computation resources are growing with time.

# Experiment Results

From the result, we could conclude that InfiniteBossting shows similar, but slightly better results compared to Gradient Boost or Random Forest methods. The results also agree to the analysis performed in the last section.

# Conclusion

Author: Qintong Wu | Localized by Synced Global Team: Xiang Chen

--

--

## More from SyncedReview

We produce professional, authoritative, and thought-provoking content relating to artificial intelligence, machine intelligence, emerging technologies and industrial insights.

## Get the Medium app

AI Technology & Industry Review — syncedreview.com | Newsletter: http://bit.ly/2IYL6Y2 | Share My Research http://bit.ly/2TrUPMI | Twitter: @Synced_Global