Visualizing Decision Trees with R

Valentina Alto · Published in Analytics Vidhya · Aug 23, 2019 · 6 min read


Decision trees are some of the most popular ML algorithms used in industry, as they are quite interpretable and intuitive. Indeed, they mimic the way people logically reason.

The basic recipe of any decision tree is very simple: we start by electing one feature as the root, split it into different branches that terminate in nodes, and then, if needed, proceed with further splits on other features. Finally, we go back and “prune the leaves” to reduce overfitting.
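To make this recipe concrete, here is a minimal sketch in R (my own illustration, using the well-known rpart and rpart.plot packages) that fits and visualizes a classification tree on the built-in iris dataset. Note that rpart implements the CART algorithm rather than ID3, but the plot shows the root, branches and leaf nodes all the same.

```r
# install.packages(c("rpart", "rpart.plot"))  # if not already installed
library(rpart)
library(rpart.plot)

# Fit a classification tree: Species is the target,
# all other columns of iris are candidate features
fit <- rpart(Species ~ ., data = iris, method = "class")

# Visualize the tree: each node shows the predicted class,
# the class probabilities and the share of observations it contains
rpart.plot(fit)
```

Here rpart picks the split points automatically; by default it uses the Gini index, a criterion that plays a role analogous to the Information Gain discussed below.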

In this article, I’m not going to dwell on the final procedure of ‘pruning leaves’, since I want to focus on the splitting criterion: how can your algorithm decide which feature should be elected as root? There are different answers to this question, depending on the decision tree algorithm you employ. Here, I will explain the typical criterion of the ID3 decision tree, that is, the Information Gain criterion.

Hence, we need to first introduce the concept of information.

Information is gained by observing the occurrence of an event. Namely, given a random variable X with possible outcomes x1, …, xn and respective probabilities of occurrence p1, …, pn, the information we want to gain is the value of that variable X.

The size or amount of information we are provided with is measured in bits, and there is an inverse relationship between the probability of an outcome and the information its occurrence conveys: observing an outcome xi with probability pi provides -log2(pi) bits of information, so rare events are more informative than common ones.
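To see these quantities in action, here is a small R sketch (my own illustration: the entropy and info_gain helpers and the toy data frame are assumptions, not code from this article) that computes information in bits and uses Information Gain to compare candidate root features, which is exactly the question ID3 answers.

```r
# Shannon entropy of a vector of class labels, in bits:
# H(Y) = -sum(p_i * log2(p_i)) over the observed class proportions
entropy <- function(y) {
  p <- table(y) / length(y)
  -sum(p * log2(p))
}

# Information Gain of splitting labels y on a categorical feature x:
# entropy before the split minus the weighted entropy of the subsets
info_gain <- function(y, x) {
  weights  <- table(x) / length(x)
  children <- tapply(y, x, entropy)
  entropy(y) - sum(weights * children)
}

# Self-information in bits: a certain event carries no information,
# a fair coin flip carries exactly one bit
-log2(1)    # 0 bits
-log2(0.5)  # 1 bit

# Toy dataset: which feature should be elected as root?
toy <- data.frame(
  outlook = c("sunny", "sunny", "rain", "rain", "overcast", "overcast"),
  windy   = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
  play    = c("no", "no", "yes", "yes", "yes", "yes")
)
info_gain(toy$play, toy$outlook)  # ~0.918 bits: outlook separates the classes
info_gain(toy$play, toy$windy)    # 0 bits: windy tells us nothing
```

On this toy data, ID3 would elect outlook as the root, since it yields the highest Information Gain.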

