AdaBoost, like the random forest classifier, is an ensemble classifier. (An ensemble classifier is made up of multiple classifier algorithms, and its output is the combined result of the outputs of those algorithms.)
In this chapter, we discuss the details of the AdaBoost classifier and the mathematics and logic behind it.
What does the AdaBoost classifier do?
AdaBoost combines weak classifiers to form a strong classifier. A single algorithm may classify the objects poorly. But if we combine multiple classifiers, selecting the training set at every iteration and assigning the right amount of weight to each classifier in the final vote, the overall classifier can reach a good accuracy score.
In short, AdaBoost:
- retrains the algorithm iteratively, choosing each training set based on the accuracy of the previous round of training;
- weights each trained classifier in the final vote according to the accuracy it achieved.
Good! This leaves us with questions:
- How do we select the training set?
- How to assign weight to each classifier?
Let's explore these questions and the mathematical equations and parameters behind them.
How do we select the training set?
Each weak classifier is trained on a random subset of the overall training set.
But wait, there’s a catch here… the random subset is not actually 100% random!
After training a classifier at any level, AdaBoost assigns a weight to each training item. A misclassified item is assigned a higher weight, so that it appears in the training subset of the next classifier with higher probability.
After each classifier is trained, the classifier itself is also assigned a weight, based on its accuracy. A more accurate classifier is assigned a higher weight, so that it has more impact on the final outcome.
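The "not 100% random" selection can be sketched in a few lines. This is only an illustration of weighted sampling with replacement, not the exact procedure of any particular library; the helper name `sample_training_subset`, the four toy examples and their weights are all made up for the example.

```python
import random

def sample_training_subset(examples, weights, k, seed=None):
    """Draw k examples with replacement, with probability proportional to weight."""
    rng = random.Random(seed)
    return rng.choices(examples, weights=weights, k=k)

examples = ["a", "b", "c", "d"]
# Hypothetical weights after one round: "c" was misclassified, so it weighs more.
weights = [0.1, 0.1, 0.7, 0.1]

subset = sample_training_subset(examples, weights, k=1000, seed=0)
counts = {e: subset.count(e) for e in examples}
print(counts)  # "c" should show up far more often than the others
```

Because "c" carries most of the weight, the next classifier sees it many times and is pushed to get it right.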
How to assign weight to each classifier?
A classifier with 50% accuracy is given a weight of zero, and a classifier with less than 50% accuracy is given a negative weight, so its vote counts in the opposite direction.
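Let's look at the mathematical formula and its parameters. The final classifier is the sign of the weighted sum of the weak classifiers' outputs:
H(x) = sign( sum over t = 1..T of alpha_t * h_t(x) )
h_t(x) is the output of weak classifier t for input x, and T is the number of weak classifiers.
alpha_t is the weight assigned to classifier t.
alpha_t is calculated as follows:
alpha_t = 0.5 * ln( (1 - E) / E )
The weight of a classifier is straightforward: it is based only on its error rate E.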
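To make the role of alpha_t in the final decision concrete, here is a minimal sketch of a weighted vote. It assumes each weak classifier outputs +1 or -1 and breaks ties toward +1, which is an arbitrary choice. The three threshold classifiers and their weights are invented for illustration.

```python
def strong_classify(x, weak_classifiers, alphas):
    """Return the sign of the alpha-weighted sum of weak outputs (+1 or -1)."""
    total = sum(a * h(x) for h, a in zip(weak_classifiers, alphas))
    return 1 if total >= 0 else -1

# Three hypothetical weak classifiers on a single number x.
weak_classifiers = [
    lambda x: 1 if x > 5 else -1,
    lambda x: 1 if x > 7 else -1,
    lambda x: 1 if x > 8 else -1,
]
alphas = [1.1, 0.4, 0.3]  # made-up weights; the first classifier is trusted most

print(strong_classify(6, weak_classifiers, alphas))
print(strong_classify(3, weak_classifiers, alphas))
```

For x = 6, two of the three classifiers vote -1, yet the result is +1, because the single classifier voting +1 carries more weight (1.1) than the other two combined (0.7). For x = 3 all three vote -1, so the result is -1.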
Initially, all training examples have equal weight.
[Figure: plot of alpha_t versus error rate E]
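The shape of that curve is easy to reproduce numerically. The sketch below evaluates the alpha_t formula at a few error rates; the function name `classifier_weight` is just a label chosen for this example, and the guard against E = 0 or E = 1 is an added assumption, since the logarithm is undefined there.

```python
import math

def classifier_weight(error_rate):
    """alpha = 0.5 * ln((1 - E) / E), defined for 0 < E < 1."""
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")
    return 0.5 * math.log((1.0 - error_rate) / error_rate)

for e in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"E = {e:.1f}  alpha = {classifier_weight(e):+.3f}")
```

The weight is positive for error rates below 0.5, zero at exactly 0.5, and negative above it, and it grows without bound as the error rate approaches 0 or 1.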
Updating weight of training examples
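After a weak classifier is trained, we update the weight of each training example with the following formula:
D_{t+1}(i) = D_t(i) * exp( -alpha_t * y_i * h_t(x_i) ) / Z_t
D_t(i) is the weight of training example i at the previous level.
We normalize the weights by dividing each of them by the sum of all the weights, Z_t. For example, if all of the calculated weights added up to 15.7, then we would divide each of the weights by 15.7 so that they sum to 1.0 instead.
y_i is the y part (the label) of training example (x_i, y_i), taken to be +1 or -1 for simplicity. The product y_i * h_t(x_i) is then +1 for a correctly classified example and -1 for a misclassified one, so correct examples lose weight and misclassified examples gain weight.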
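A small worked round may help here. The sketch below applies the standard AdaBoost update, multiplying each weight by exp(-alpha_t * y_i * h_t(x_i)) and then dividing by the sum Z_t, to five made-up examples of which one is misclassified. Labels and predictions are assumed to be +1 or -1, and `update_weights` is a name invented for this example.

```python
import math

def update_weights(weights, labels, predictions, alpha):
    """One reweighting step; labels and predictions are +1 or -1."""
    unnormalized = [
        w * math.exp(-alpha * y * h)
        for w, y, h in zip(weights, labels, predictions)
    ]
    z = sum(unnormalized)  # the normalizer Z_t
    return [w / z for w in unnormalized]

labels      = [1, 1, -1, -1, 1]
predictions = [1, 1, -1, -1, -1]   # only the last example is misclassified
weights     = [0.2] * 5            # equal weights to start

# Weighted error rate E of this classifier, then its weight alpha.
error = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
alpha = 0.5 * math.log((1 - error) / error)

new_weights = update_weights(weights, labels, predictions, alpha)
print(new_weights)
```

With an error rate of 0.2, the misclassified example ends up with a weight of about 0.5 and each correct example with about 0.125, so the next classifier pays as much attention to the one mistake as to the four correct examples together.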
AdaBoost, like the random forest classifier, tends to give more accurate results than a single weak classifier, because the final decision depends on many weak classifiers. One application of AdaBoost is in face recognition systems.
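To see many weak classifiers beating a single one, here is a compact end-to-end sketch on a toy one-dimensional data set. It is a simplification under stated assumptions, not a reference implementation: the weak classifiers are decision stumps (a single threshold), and instead of drawing a weighted random subset as described above, each round uses the example weights directly when measuring the error, which is a common variant. All function names and the data are invented for illustration.

```python
import math

def stump_predict(x, threshold, polarity):
    """Decision stump: predict `polarity` when x < threshold, else -polarity."""
    return polarity if x < threshold else -polarity

def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the lowest-error stump."""
    points = sorted(set(xs))
    thresholds = [(a + b) / 2 for a, b in zip(points, points[1:])]
    best = None
    for threshold in thresholds:
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost_train(xs, ys, rounds):
    """Train `rounds` stumps; returns a list of (alpha, threshold, polarity)."""
    m = len(xs)
    weights = [1.0 / m] * m  # equal weights to start
    model = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        model.append((alpha, threshold, polarity))
        # Reweight: misclassified examples grow, correct ones shrink.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        z = sum(weights)
        weights = [w / z for w in weights]
    return model

def adaboost_predict(model, x):
    """Weighted vote of all stumps; ties go to +1."""
    total = sum(a * stump_predict(x, t, p) for a, t, p in model)
    return 1 if total >= 0 else -1

def accuracy(model, xs, ys):
    hits = sum(adaboost_predict(model, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)

xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]  # no single threshold separates these

single = accuracy(adaboost_train(xs, ys, rounds=1), xs, ys)
boosted = accuracy(adaboost_train(xs, ys, rounds=3), xs, ys)
print(single, boosted)
```

On this toy set the best single stump gets 7 of the 10 training examples right, while three boosted stumps together classify all 10 correctly. This is training accuracy on a hand-picked example, so it only illustrates the mechanism and says nothing about performance on real data.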
I hope this article was successful in explaining the basics of the AdaBoost classifier.
You can write to me at firstname.lastname@example.org . Peace.