<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Kshamashetty on Medium]]></title>
        <description><![CDATA[Stories by Kshamashetty on Medium]]></description>
        <link>https://medium.com/@kshamashetty263?source=rss-af00638e099c------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/0*v8xmxajbCzGEoAFd</url>
            <title>Stories by Kshamashetty on Medium</title>
            <link>https://medium.com/@kshamashetty263?source=rss-af00638e099c------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sun, 24 May 2026 02:24:38 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@kshamashetty263/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[CONFUSION MATRIX]]></title>
            <link>https://medium.com/@kshamashetty263/confusion-matrix-34f75c42b866?source=rss-af00638e099c------2</link>
            <guid isPermaLink="false">https://medium.com/p/34f75c42b866</guid>
            <dc:creator><![CDATA[Kshamashetty]]></dc:creator>
            <pubDate>Sat, 06 Jun 2020 10:45:28 GMT</pubDate>
            <atom:updated>2020-06-06T10:45:28.563Z</atom:updated>
            <content:encoded><![CDATA[<p>By Shetty Kshama Umesh</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*7WEB1NxhgKrKrdxk.jpeg" /></figure><h4>What is a Confusion Matrix?</h4><p>A confusion matrix is an N x N matrix used for evaluating the performance of a classification model, where N is the number of target classes.</p><p>The matrix compares the actual target values with those predicted by the machine learning model. This gives us a holistic view of how well our classification model is performing and what kinds of errors it is making.</p><p>For a binary classification problem, we would have a 2 x 2 matrix as shown below with 4 values:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/250/0*l2YJpsvtAHIRXLmM.png" /></figure><p>Let’s understand the matrix:</p><ul><li>The target variable has two values: <strong>Positive </strong>or <strong>Negative</strong></li><li>The <strong>columns </strong>represent the <strong>actual values</strong> of the target variable</li><li>The <strong>rows </strong>represent the <strong>predicted values </strong>of the target variable</li><li>But what are TP, FP, FN and TN here? That’s the crucial part of a confusion matrix. 
Let’s look at each term below.</li></ul><h3>Understanding True Positive, True Negative, False Positive and False Negative in a Confusion Matrix</h3><p><strong>True Positive (TP)</strong></p><ul><li>The predicted value matches the actual value.</li><li>The actual value was positive and the model predicted a positive value.</li></ul><p><strong>True Negative (TN)</strong></p><ul><li>The predicted value matches the actual value.</li><li>The actual value was negative and the model predicted a negative value.</li></ul><p><strong>False Positive (FP) - Type 1 error</strong></p><ul><li>The predicted value does not match the actual value.</li><li>The actual value was negative but the model predicted a positive value.</li><li>Also known as the <strong>Type 1 error.</strong></li></ul><p><strong>False Negative (FN) - Type 2 error</strong></p><ul><li>The predicted value does not match the actual value.</li><li>The actual value was positive but the model predicted a negative value.</li><li>Also known as the <strong>Type 2 error.</strong></li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/440/0*fHcwkABPKaPgscpO.png" /></figure><h4><strong>Classification Rate/Accuracy:</strong><br>Classification Rate or Accuracy is given by the relation:</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/342/0*tHJDmorWQI94AedR.png" /></figure><p>Let’s see how our model performed:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/320/0*FFSnYBxx1XfuOW-X.png" /></figure><p>The total outcome values are:</p><p>TP = 30, TN = 930, FP = 30, FN = 10</p><p>So, the accuracy for our model turns out to be:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/313/0*XexuAYigULFiWIZG.png" /></figure><p>96%! Not bad!</p><p>But this number gives the wrong idea about the result. Think about it: only 40 of the 1,000 examples are actually positive (TP + FN), so a model that predicted negative every time would also score 96% while catching no positives at all.</p><p>This is the problem with accuracy: it assumes equal costs for both kinds of errors. 
A 99% accuracy can be excellent, good, mediocre, poor or terrible depending on the problem.</p><p><strong>Recall:</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/246/0*4gdBx7qecYtBaCMe.png" /></figure><p>Recall is the number of correctly classified positive examples divided by the total number of actual positive examples. High Recall indicates the class is correctly recognized (a small number of FN).</p><p><strong>Precision:</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/249/0*5hM-FPTfPt--PF3h.png" /></figure><p>To get the value of precision, we divide the number of correctly classified positive examples by the total number of predicted positive examples. High Precision indicates an example labelled as positive is indeed positive (a small number of FP).</p><p><strong>High recall, low precision: </strong>This means that most of the positive examples are correctly recognized (low FN) but there are a lot of false positives.</p><p><strong>Low recall, high precision: </strong>This shows that we miss a lot of positive examples (high FN) but those we predict as positive are indeed positive (low FP).</p><p><strong>F-measure:</strong><br>Since we have two measures (Precision and Recall), it helps to have a single measurement that represents both of them. 
We calculate an F-measure, which uses the harmonic mean in place of the arithmetic mean because it punishes extreme values more.</p><p>The F-measure will always be nearer to the smaller of Precision and Recall.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/358/0*cPxMj2fmSrKyblMd.png" /></figure><p><strong>Code: Python example illustrating the metrics above</strong></p><p># Python script for confusion matrix creation.</p><p><strong>from sklearn.metrics import confusion_matrix</strong></p><p><strong>from sklearn.metrics import accuracy_score</strong></p><p><strong>from sklearn.metrics import classification_report</strong></p><p><strong>actual = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]</strong></p><p><strong>predicted = [1, 0, 0, 1, 0, 0, 1, 1, 1, 0]</strong></p><p># Note: scikit-learn puts actual classes in rows and predicted classes in columns.</p><p><strong>results = confusion_matrix(actual, predicted)</strong></p><p><strong>print(&#39;Confusion Matrix :&#39;)</strong></p><p><strong>print(results)</strong></p><p><strong>print(&#39;Accuracy Score :&#39;, accuracy_score(actual, predicted))</strong></p><p><strong>print(&#39;Report : &#39;)</strong></p><p><strong>print(classification_report(actual, predicted))</strong></p><p><strong>Output</strong> (the report layout varies by scikit-learn version)<strong>:</strong></p><pre>Confusion Matrix :<br>[[4 2]<br> [1 3]]<br>Accuracy Score : 0.7<br>Report : <br>              precision    recall  f1-score   support<br>          0       0.80      0.67      0.73         6<br>          1       0.60      0.75      0.67         4<br>avg / total       0.72      0.70      0.70        10</pre><h3>Confusion Matrix for Multi-Class Classification</h3><p>How would a confusion matrix work for a multi-class classification problem? Well, don’t scratch your head! We will have a look at that here.</p><p>Let’s draw a confusion matrix for a multi-class problem where we have to predict whether a person loves Facebook, Instagram or Snapchat. 
The confusion matrix would be a 3 x 3 matrix like this:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/305/0*pNSflXRMIK2jOz0z.png" /></figure><p>The true positive, true negative, false positive and false negative counts for each class are calculated by adding the cell values as follows:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/688/0*_e0xMrP9fg1Ge61u.png" /></figure><p>That’s it. You are ready to decipher any N x N confusion matrix!</p><p>We can also get a normalized confusion matrix, in which each row (actual class) sums to 1, by dividing every row by its total. Here df_confusion is assumed to be a pandas DataFrame of counts with actual classes as rows and predicted classes as columns:</p><pre>df_conf_norm = df_confusion.div(df_confusion.sum(axis=1), axis=0)</pre><pre>Predicted         0         1         2<br>Actual<br>0          1.000000  0.000000  0.000000<br>1          0.000000  0.333333  0.666667<br>2          0.333333  0.166667  0.500000</pre>]]></content:encoded>
        </item>
    </channel>
</rss>