Confusion Matrix and Data Imbalances (3/3)

The V Notebook
5 min read · Sep 12, 2023

Previous << Confusion Matrix and Data Imbalances (2/3)

In this part, we’ll look at several evaluation metrics and use them to interpret the results of the binary classification model we built in the previous exercise.
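As a rough preview of what such metrics look like, here is a minimal sketch that derives accuracy, precision, recall, and F1 from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts passed to it are hypothetical, chosen only for illustration; they are not results from the model in this series.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common metrics from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    # Accuracy: share of all predictions that were correct
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything predicted positive, how much really was positive
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything truly positive, how much the model found
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for an imbalanced problem (only 90 of 1000 samples are positive)
metrics = binary_metrics(tp=80, fp=20, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced data like this, accuracy can look high simply because the majority class dominates, which is why precision and recall are usually read alongside it.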

Data Visualization

Once again, we’ll use the dataset of different classes of objects found on the mountain:

import pandas
import numpy

# Download the helper plotting module and the dataset
# (the leading "!" runs a shell command from inside a notebook cell)
!wget https://raw.githubusercontent.com/MicrosoftDocs/mslearn-introduction-to-machine-learning/main/graphing.py
!wget https://raw.githubusercontent.com/MicrosoftDocs/mslearn-introduction-to-machine-learning/main/Data/snow_objects.csv

# Import the data from the .csv file (the file is tab-separated)
dataset = pandas.read_csv('snow_objects.csv', delimiter="\t")

# Let's have a look at the data
dataset
           size  roughness  color    motion   label
0     50.959361   1.318226  green  0.054290    tree
1     60.008521   0.554291  brown  0.000000    tree
2     20.530772   1.097752  white  1.380464    tree
3     28.092138   0.966482   grey  0.650528    tree
4     48.344211   0.799093   grey  0.000000    tree
...         ...        ...    ...       ...     ...
2195   1.918175   1.182234  white  0.000000  animal
2196   1.000694   1.332152  black  4.041097  animal
2197   2.331485   0.734561  brown  0.961486  animal
2198   1.786560   0.707935  black  0.000000  animal
2199   1.518813   1.447957  brown  0.000000…
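Since this series is about data imbalances, a natural next check is how many rows each label has. The sketch below is one possible way to do that with pandas. It rebuilds a handful of the rows shown above by hand so that it runs without the downloaded file; the helper name `label_distribution` is hypothetical, and the six-row sample says nothing about the balance of the full dataset.

```python
import pandas

# A few of the rows printed above, retyped so this sketch is self-contained.
# This tiny sample is for illustration only.
sample = pandas.DataFrame({
    "size": [50.959361, 60.008521, 20.530772, 1.918175, 1.000694, 2.331485],
    "label": ["tree", "tree", "tree", "animal", "animal", "animal"],
})


def label_distribution(frame, column="label"):
    """Return the number of rows and the share of rows for each class."""
    counts = frame[column].value_counts()
    shares = frame[column].value_counts(normalize=True)
    # Both Series are indexed by class name, so they align row by row
    return pandas.DataFrame({"count": counts, "share": shares})


distribution = label_distribution(sample)
print(distribution)
```

On the real data, the same helper could be called as `label_distribution(dataset)`, assuming the `label` column shown in the output above.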
