Learn Decision Tree from Predicting whether a cricket match will be played or not and Code by yourself!

Dilkhush Mihirsen DHINESH KUMAR
3 min readAug 3, 2020

--

A cricket match is planned based on various factors such as Weather, Temperature, Humidity and Wind. A Decision is made by us based on these factors.

For example, if the weather is sunny, temperature is mild, humidity is normal and wind is normal it’s an ideal environment to play a match. This same process is done by the computers comparing various parameters of the factors and the result is given as whether to conduct a match or not.

What is a Decision tree?

The various features are selected as nodes(Rectangle boxes) and their parameters or values are taken as edges(lines connecting the nodes). Based on the value of the features decision is taken and checked with the next feature.

Let’s Code!

Now we are going to build a Decision Tree machine learning model using python and some libraries. Libraries are set of programs already written to make the calculations simpler. If you do not know the common machine learning terminologies like Model, Training, etc. please do visit my article on Basic Terminologies of Machine Learning using this link. Let’s start to code!

Here we have imported the necessary libraries and packages for us to perform the simple linear regression. The libraries and packages imported are:

  1. Numpy: This is a package that is used for scientific calculations and array calculations in python.
  2. Pandas: This is a powerful package that has some functions for Data Analysis and Manipulation.
  3. Sklearn: This is a free machine learning library that contains many functions and methods that are necesary to build a machine learning model. In Sklearn we have imported two functions LabelEncoder and Tree. LabelEncoder is used to convert the categorical variables into numerical variables and the Tree function contains the built-in package for Decision Tree.

Here we are importing the dataset named “PlayCricket.csv” and displaying their values before LabelEncoding. The link to the dataset is here.

The Decision Tree Algorithm takes the inputs only as numbers and that’s why we are using the LabelEncoder function to change the values to numerical. Here you can see that in the outlook column overcast is encoded to 0, rainy is encoded to 1 and sunny is encoded to 2. The same is done for al the values.

Here we are selecting the target variable(y) whether we can play or not and the features to predict the target variable in the variable(x).

Here we are calling the DecisionTreeClassifier from the Tree and fitting the model to our variables x and y.

At last we are predicting using the trained model and we are printing their values after checking with the original value of whether a game acn be played or not.

Here is the link to the full code for you to have a hands-on experience. Any queries please do contact me through LinkedIn. Happy Learning!!!

--

--

Dilkhush Mihirsen DHINESH KUMAR

A platform to learn technologies right from scratch using simple real life experiences.