A Gentle Introduction to Cost-Sensitive Decision Trees
In this article, I will show how a decision tree can be used on imbalanced data sets as a cost-sensitive supervised learning algorithm.
The decision tree is a commonly used algorithm for both classification and regression. A classification tree uses a tree structure to classify instances, and a key merit of decision trees is that they are highly readable.
As a tree structure, it consists of one root node, several internal nodes, and several leaf nodes. Leaf nodes hold the prediction results. Internal nodes test feature values, and the instances at a node are routed to its child nodes according to the outcome of the feature test.
The root node contains all the instances, so each path from the root node to a leaf node is a sequence of tests. The goal is to learn a decision tree with strong generalization ability, one that handles new instances it has never seen before.
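The structure described above can be sketched in a few lines of Python: internal nodes hold a feature test, leaf nodes hold a class label, and classifying an instance means following one root-to-leaf test sequence. The node layout, thresholds, and labels here are purely illustrative, not taken from any real dataset.

```python
def make_leaf(label):
    # Leaf nodes determine the result.
    return {"kind": "leaf", "label": label}

def make_internal(feature, threshold, left, right):
    # Internal nodes test a feature value; instances with
    # feature value <= threshold go left, otherwise right.
    return {"kind": "internal", "feature": feature,
            "threshold": threshold, "left": left, "right": right}

def classify(node, instance):
    """Follow the root-to-leaf test sequence for one instance."""
    while node["kind"] == "internal":
        if instance[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["label"]

# Root tests feature 0; its right child tests feature 1.
tree = make_internal(0, 2.5,
                     make_leaf("negative"),
                     make_internal(1, 1.0,
                                   make_leaf("negative"),
                                   make_leaf("positive")))

print(classify(tree, [3.0, 2.0]))  # root -> right -> right -> "positive"
```

Walking the instance `[3.0, 2.0]` through the tree takes the right branch at the root (3.0 > 2.5) and the right branch at the internal node (2.0 > 1.0), reaching the "positive" leaf.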
Loss Function in Decision Tree
The loss function of a decision tree is a regularized maximum likelihood function. Selecting the optimal tree from all possible trees is an NP-hard problem, so a heuristic strategy is typically used to obtain a sub-optimal solution. Building a decision tree follows a divide-and-conquer strategy, as follows: