Confusion Matrix and Data Imbalances (3/3)

5 min read · Sep 12, 2023

Previous << Confusion Matrix and Data Imbalances (2/3)

In this part, we’ll look at several evaluation metrics and use them to interpret the results of the binary classification model we built in the previous exercise.
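As a rough preview of the kind of metrics involved, here is a minimal sketch that derives accuracy, precision, recall and F1 from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts passed to it are hypothetical, chosen only for illustration; they are not results from the model in the previous exercise.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative
    and true-negative counts. Ratios with a zero denominator fall back to 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not taken from the article's model)
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=130)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes, accuracy can look high even when recall on the rare class is poor, which is why it helps to read these numbers side by side rather than relying on accuracy alone.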

Data Visualization

We’ll use the dataset with different classes of objects found on the mountain one more time:

import pandas
import numpy

# Download the plotting helper and the dataset (notebook shell commands)
!wget https://raw.githubusercontent.com/MicrosoftDocs/mslearn-introduction-to-machine-learning/main/graphing.py
!wget https://raw.githubusercontent.com/MicrosoftDocs/mslearn-introduction-to-machine-learning/main/Data/snow_objects.csv

# Import the data from the .csv file (the file is tab-separated)
dataset = pandas.read_csv('snow_objects.csv', delimiter="\t")

# Let's have a look at the data
dataset

           size     roughness     color     motion     label
0       50.959361    1.318226     green    0.054290    tree
1       60.008521    0.554291     brown    0.000000    tree
2       20.530772    1.097752     white    1.380464    tree
3       28.092138    0.966482     grey     0.650528    tree
4       48.344211    0.799093     grey     0.000000    tree
 ...       ...          ...        ...        ...       ...
2195    1.918175     1.182234     white    0.000000    animal
2196    1.000694     1.332152     black    4.041097    animal
2197    2.331485     0.734561     brown    0.961486    animal
2198    1.786560     0.707935     black    0.000000    animal
2199    1.518813     1.447957     brown    0.000000…
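Since this series is about data imbalances, a quick way to see how the labels are distributed is to count them. The sketch below uses a tiny made-up stand-in frame (`sample`, with invented rows) so it runs on its own; on the real data, the same calls would presumably be applied to `dataset["label"]`.

```python
import pandas

# Tiny stand-in for the real dataset; these rows are invented for illustration
sample = pandas.DataFrame({"label": ["tree", "tree", "tree", "animal"]})

# Absolute number of rows per label
counts = sample["label"].value_counts()

# Share of each label, as a fraction of all rows
proportions = sample["label"].value_counts(normalize=True)

print(counts)
print(proportions)
```

If one label accounts for most of the rows, the dataset is imbalanced, and metrics such as accuracy should be read with that in mind.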

Written by The V Notebook