• We first fit the data with a simple model and analyze its errors (residuals).
• These errors flag the data points that a simple model finds hard to fit.
• Each subsequent model then focuses on those hard-to-fit points to get them right.
• In the end, we combine all the predictors into an ensemble by giving a weight to each predictor.
The predictors can be drawn from a range of models such as decision trees, linear regressors, or other classifiers. Because each new predictor learns from the mistakes committed by the previous predictors, the ensemble needs fewer iterations to get close to the actual predictions. However, the stopping criterion must be chosen carefully, or the model can overfit the training data. Gradient Boosting is an example of a boosting algorithm.
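The steps above can be sketched as a minimal least-squares gradient-boosting loop. This is an illustrative from-scratch sketch, not a production implementation: the function names, hyperparameters, and toy dataset are assumptions for demonstration, and each new tree is fit to the residuals (the "mistakes") of the ensemble so far.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_fit(X, y, n_estimators=50, learning_rate=0.1, max_depth=2):
    """Fit a simple least-squares gradient-boosting ensemble."""
    f0 = y.mean()                                # start with a simple model: a constant
    pred = np.full_like(y, f0, dtype=float)
    trees = []
    for _ in range(n_estimators):
        residuals = y - pred                     # errors = the hard-to-fit part of the data
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                   # next predictor focuses on those errors
        pred += learning_rate * tree.predict(X)  # weighted combination of predictors
        trees.append(tree)
    return f0, trees

def gradient_boost_predict(X, f0, trees, learning_rate=0.1):
    pred = np.full(X.shape[0], f0)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred

# Toy data (illustrative): y = x^2 plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.1, size=200)

f0, trees = gradient_boost_fit(X, y)
pred = gradient_boost_predict(X, f0, trees)
mse = np.mean((y - pred) ** 2)
```

Here `n_estimators` plays the role of the stopping criterion: too many rounds and the trees start memorizing noise in the training data, which is the overfitting risk noted above. In practice a validation set (or early stopping) is used to pick it.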