[ML Shot of the Day]: Discretization of Continuous Attributes
Handling Continuous features in Decision Trees
Choosing the optimal splitting point for continuous attributes in Decision Trees
Jun 5, 2021 · 4 min read
A Crash Course on Decision Trees and Splitting Measures:
- Decision Trees and their ensemble variants, such as Random Forests, XGBoost, and CatBoost, are widely used in the Machine Learning world (including competitions).
- Training a Decision Tree for a classification problem involves recursively splitting the data into smaller subsets until each node contains data belonging to a single class (or some other stopping criterion is met).
- Different measures (Information Gain, Gini Index, Gain ratio) are used for determining the best possible split at each node of the decision tree.
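To make these measures concrete, here is a minimal sketch (using only the Python standard library) of the three quantities named above. The function names and the toy label list are illustrative, not part of the original article:

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy reduction achieved by splitting `parent` into `children`."""
    n = len(parent)
    weighted_child_entropy = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted_child_entropy

# Toy example: a perfectly balanced binary node split into two pure children
labels = ["yes", "yes", "no", "no"]
print(entropy(labels))   # 1.0
print(gini(labels))      # 0.5
print(information_gain(labels, [["yes", "yes"], ["no", "no"]]))  # 1.0
```

A pure node scores 0 on both measures; the gain of 1.0 above reflects a split that removes all uncertainty.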
Splitting Measures for growing Decision Trees:
- Recursively growing a tree involves selecting an attribute and a test condition that divides the data at a given node into smaller, purer subsets.
- The measures used for determining the best split compute the degree of impurity of the resulting child nodes.
- Computing the impurity of child nodes with respect to that…
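For a continuous attribute, the standard discretization approach (as in C4.5/CART-style trees) is to sort the values, take the midpoints between consecutive distinct values as candidate thresholds, and keep the threshold whose two child nodes have the lowest weighted impurity. A minimal sketch, with hypothetical data and using Gini impurity as the splitting measure:

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Find the threshold on a continuous attribute that minimizes the
    weighted Gini impurity of the two child nodes it induces.
    Candidates are midpoints between consecutive distinct sorted values."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_t, best_impurity = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal attribute values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_t, best_impurity = t, impurity
    return best_t, best_impurity

# Hypothetical data: one continuous feature with binary class labels
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(values, labels))  # (6.5, 0.0)
```

Here the midpoint 6.5 separates the two classes perfectly, so the weighted impurity of the children is 0. In practice, a sorted single pass with running class counts brings the cost per attribute down from the quadratic scan above to O(n log n).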