Survey of the Decision Trees Algorithms (CART, C4.5, ID3)

Aydin Abedinia
3 min read · Feb 4, 2019


Decision trees are one of data scientists' favourite techniques. A decision tree works like the game of 20 Questions: starting with no knowledge, it asks a limited series of questions to narrow down the answer.

Decision trees come in several variants, such as ID3, C4.5, and CART. Each method tries to split the data so as to maximize information gain. What is information gain?

“In the decision tree method, the information gain approach is generally used to determine the suitable property for each node of a generated decision tree.”
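As a concrete illustration, entropy-based information gain can be computed in a few lines of plain Python. This is a minimal sketch, not how any particular library implements it:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(labels, left, right):
    """Reduction in entropy after splitting `labels` into `left` and `right`."""
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

labels = ['yes', 'yes', 'no', 'no']
print(information_gain(labels, ['yes', 'yes'], ['no', 'no']))  # perfect split: gain = 1.0
```

A split that separates the classes perfectly drives the children's entropy to zero, so the gain equals the parent's entropy; a useless split yields a gain near zero.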

Why are decision trees so important? State-of-the-art algorithms like gradient boosting, XGBoost, and LightGBM are boosting methods built on decision trees, and they are beneficial and powerful in practice; see their track record in Kaggle competitions and in the engineering blogs of companies like Uber.

On the Michelangelo platform, the UberEATS data scientists use gradient boosted decision tree regression models to predict end-to-end delivery time.

XGBoost is an efficient algorithm for classification and regression problems in machine learning and data science; it was introduced in 2016 by Tianqi Chen and Carlos Guestrin of the University of Washington.

Among the machine learning methods used in practice, gradient tree boosting is one technique that shines in many applications. [Chen, Tianqi, and Carlos Guestrin, “XGBoost: A Scalable Tree Boosting System”, 2016]

XGBoost uses CART to build the decision tree at each step of the boosting process.
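The boosting loop itself is simple to sketch. Below is a toy gradient-boosting regressor in plain Python that uses one-split "stumps" as the weak learners; real XGBoost fits full CART trees with second-order gradients, regularization, and many optimizations, so this only illustrates the additive structure, not the library itself:

```python
def fit_stump(x, residuals):
    """Find the threshold split of x that best fits the residuals (least squares)."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def boost(x, y, rounds=20, lr=0.5):
    """Each round fits a stump to the current residuals and adds it to the ensemble."""
    stumps = []
    pred = [0.0] * len(y)
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + lr * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: sum(lr * s(xi) for s in stumps)

x = [1, 2, 3, 4]
y = [1.0, 1.0, 3.0, 3.0]
model = boost(x, y)
```

Each round fits a new tree to what the ensemble still gets wrong (the residuals), which is the essence shared by gradient boosting, XGBoost, and LightGBM.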

In this video, Google Developers show how the CART algorithm splits data using a statistical criterion.

CART uses Gini impurity to split the data: it enumerates all candidate splits, partitions the data at each one, and computes the resulting Gini impurity to find the best choice. If you watch the video, you will see how decision trees (CART in particular) work.
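That candidate-enumeration procedure can be sketched in plain Python. This is illustrative only; a real CART implementation does this recursively per node and handles categorical features as well:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Try every candidate threshold and return the one with the lowest weighted Gini."""
    best = (None, float('inf'))
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue
        n = len(labels)
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (t, score)
    return best

values = [1.0, 2.0, 3.0, 4.0]
labels = ['a', 'a', 'b', 'b']
print(best_split(values, labels))  # threshold 2.0 separates the classes: (2.0, 0.0)
```

A weighted impurity of 0.0 means both children are pure, so CART would choose that threshold for the node.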

There are several statistical criteria for measuring the quality of a split, such as Gini impurity and entropy; both are conventional choices in decision trees. CART uses Gini impurity.

Gini impurity equation: Gini(p) = 1 − Σᵢ pᵢ² [https://en.wikipedia.org/wiki/Decision_tree_learning]
Entropy equation: H(p) = −Σᵢ pᵢ log₂ pᵢ [https://en.wikipedia.org/wiki/Decision_tree_learning]

Here is a comparison of the methods, summarized from “A survey on decision tree algorithms of classification in data mining”:

- ID3 uses entropy-based information gain, handles only categorical features, and does not prune.
- C4.5 extends ID3 with the gain ratio criterion, support for continuous features and missing values, and pruning.
- CART uses Gini impurity for classification (variance reduction for regression), always produces binary splits, and is the variant used as the base learner in boosting libraries.

Now you can see how the decision tree variants differ and which one serves as the base learner in boosting. I also recommend reading about oblique decision trees and why they are fast.


Aydin Abedinia

Engineering manager @Snapp | SE, ML, MLOps | MSc Software engineer