Confusion Matrix

Pravinkumar Singh
2 min read · Sep 30, 2021


Evaluation is an important step in building a successful machine learning model. Classification and regression each have their own evaluation metrics, and the confusion matrix is one used for classification models. It is an easy way to evaluate a model because it compares the labels predicted by the trained model with the actual labels, i.e. the true labels present in the dataset. Depending on the type of problem, relying on only one evaluation metric can be a huge mistake. For example, a machine learning model for a healthcare system should be evaluated with every suitable metric to get a clear understanding of the predictions it makes.
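To make the point about relying on a single metric concrete, here is a minimal sketch in plain Python. It assumes a binary problem and a 2x2 matrix laid out as [[TN, FP], [FN, TP]], which is the layout scikit-learn uses for labels 0 and 1. The function name summarize and the counts are made up for illustration; they are not results from the heart disease dataset.

```python
def summarize(conf_mat):
    """Derive common metrics from a 2x2 matrix laid out as [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = conf_mat
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when a class is never predicted or never present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical counts: 90 healthy patients and 10 sick ones,
# where the model catches only 2 of the 10 sick patients.
example = [[88, 2], [8, 2]]
metrics = summarize(example)
print(metrics)
```

With these made-up counts the accuracy is 0.9, which looks fine, while the recall is only 0.2: most sick patients are missed. That is the kind of problem a single metric can hide and the confusion matrix makes visible.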

Let’s see some code examples.

Import the libraries

import pandas as pd 
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

Load the dataset

data = pd.read_csv("heart-disease.csv")

Split the dataset into training and testing sets

X = data.drop("target", axis=1)
y = data["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

Train the model

clf = RandomForestClassifier()
clf.fit(X_train, y_train)

Predict on the test data

y_preds = clf.predict(X_test)

The predicted labels are stored in the variable y_preds because the confusion_matrix() function takes two arguments: the actual labels first and the predicted labels second.

Create the confusion matrix with the pandas library

pd.crosstab(y_test, y_preds,
            rownames=["Actual Labels"],
            colnames=["Predicted Labels"])
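The cross-tabulation above is simply a count of every (actual, predicted) pair. The sketch below shows the same idea in plain Python for binary labels 0 and 1. The helper confusion_counts and the two label lists are hypothetical and only there to illustrate the counting; this is not how pandas or scikit-learn implement it.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count (actual, predicted) pairs for binary labels 0 and 1."""
    pairs = Counter(zip(y_true, y_pred))
    # Rows are actual labels, columns are predicted labels.
    return [
        [pairs[(0, 0)], pairs[(0, 1)]],  # actual 0: true negatives, false positives
        [pairs[(1, 0)], pairs[(1, 1)]],  # actual 1: false negatives, true positives
    ]

# Made-up labels, purely for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_counts(y_true, y_pred)
print(matrix)
```

The diagonal cells hold the correct predictions and the off-diagonal cells hold the two kinds of mistakes, which is also how to read the table produced by pd.crosstab.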

Create a visual representation of the confusion matrix using seaborn’s heatmap function

sns.set(font_scale=1.5)
# Rows of conf_mat are the actual labels, columns are the predicted labels
conf_mat = confusion_matrix(y_test, y_preds)
sns.heatmap(conf_mat, annot=True, cmap="winter")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()

That’s it. I hope you find it helpful. Feedback is welcome.

Happy Learning :)
