Performance measurement (Part-I)

Priyanka Goel
3 min read · Sep 13, 2020


Using the confusion matrix to measure the performance of machine learning models.

Performance measures

Classification, one of the core tasks in machine learning, is all about assigning data points to one class or another.

Whenever we run a particular algorithm on a test data set, it gives us some result.

But how do we evaluate the performance of the algorithm? How well did it work for us?

A simple answer could be to compute its accuracy.

However, accuracy is not always a good performance measure: the results can be misleading, and we might see accuracy as high as 99% for a model trained on a highly imbalanced data set.
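To see how accuracy can mislead, here is a small sketch with a made-up imbalanced data set (990 negatives, 10 positives) and a "model" that simply predicts the majority class for everyone:

```python
# Hypothetical imbalanced data set: 990 negatives, only 10 positives.
y_true = [0] * 990 + [1] * 10

# A "dumb" model that always predicts the majority class 0.
y_pred = [0] * 1000

correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)
print(accuracy)  # 0.99 -- yet the model never detects a single positive
```

The model scores 99% accuracy without ever finding a positive example, which is exactly why we need a more detailed measure.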

So if not accuracy then what should we use for performance measure?

A better way to evaluate the performance of a classifier is to look at the confusion matrix.

Confusion Matrix

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier.

Refer to the diagram below to understand it more clearly (rows are actual classes, columns are predicted classes):

Confusion Matrix

                Predicted: No    Predicted: Yes
Actual: No          TN                FP
Actual: Yes         FN                TP

Though the overall idea behind the confusion matrix is relatively simple, the related terminology can still be confusing.

Let us understand each term with a simple example.

Suppose a model looks at a person and predicts whether that person is a girl, so there are two possible predicted classes: “yes” and “no”.

“yes”: the model predicts the person is a girl.

“no”: the model predicts the person is not a girl.

TP (True Positives): Cases where our model predicted someone to be a girl and the person actually is a girl. In short, correctly predicted event values.


TN (True Negatives): Cases where our model predicted someone is not a girl and the person actually is not a girl. In short, correctly predicted no-event values.


FP (False Positives): Cases where our model predicted someone to be a girl but the person is actually not a girl. In short, incorrectly predicted event values.


FN (False Negatives): Cases where our model predicted someone is not a girl but the person is actually a girl. In short, incorrectly predicted no-event values.

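The four counts can be tallied with a few lines of Python. This is a quick sketch with made-up labels, where 1 stands for “yes” (girl) and 0 for “no” (not a girl):

```python
# Toy example with made-up labels: 1 = "yes" (girl), 0 = "no" (not a girl).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # actual classes
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # model's predictions

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # girl predicted as girl
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # non-girl predicted as non-girl
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # non-girl predicted as girl
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # girl predicted as non-girl

print(tp, tn, fp, fn)  # 3 3 1 1
```

The four counts always add up to the total number of points, and scikit-learn’s `sklearn.metrics.confusion_matrix` computes the same numbers in one call.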

Now that we are clear on the basic terminology, let us understand a few more terms.

True Positive Rate (TPR): True Positives (TP) divided by the total number of actual positive points, P = TP + FN. That is, TPR = TP / P.

True Negative Rate (TNR): True Negatives (TN) divided by the total number of actual negative points, N = TN + FP. That is, TNR = TN / N.

False Positive Rate (FPR): False Positives (FP) divided by the total number of actual negative points. That is, FPR = FP / N.

False Negative Rate (FNR): False Negatives (FN) divided by the total number of actual positive points. That is, FNR = FN / P.
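The four rates can be sketched directly from the counts. The numbers below are hypothetical, chosen only to make the arithmetic easy to follow:

```python
# Hypothetical counts from some classifier's confusion matrix.
tp, tn, fp, fn = 40, 45, 5, 10

p = tp + fn   # total actual positives
n = tn + fp   # total actual negatives

tpr = tp / p  # True Positive Rate:  40 / 50 = 0.8
tnr = tn / n  # True Negative Rate:  45 / 50 = 0.9
fpr = fp / n  # False Positive Rate:  5 / 50 = 0.1
fnr = fn / p  # False Negative Rate: 10 / 50 = 0.2

print(tpr, tnr, fpr, fnr)
```

Note that the pairs are complementary: TPR + FNR = 1 and TNR + FPR = 1, since every actual positive is either a TP or an FN, and every actual negative is either a TN or an FP.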

In many real-world problems, FNs are the most dangerous kind of error (think of a disease screening test that misses a sick patient), so we should try hard to minimize them.

For more information on Precision, Recall, F1 score, etc., please visit my blog Performance Measurement (Part-II).

