Machine Learning: Logistic Regression
In this blog, we’ll learn about a Machine Learning algorithm called Logistic Regression. Going by the name, it seems similar to Linear Regression, but there’s an important difference: Linear Regression is used to predict a continuous value based on certain features, i.e. it’s a regression algorithm, whereas Logistic Regression is a classification algorithm.
Logistic regression is a supervised machine learning algorithm which is used for classification problems.
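Before digging into how it works, here is a minimal sketch of what training a Logistic Regression classifier looks like with scikit-learn. The built-in breast-cancer dataset and the split and solver settings below are my own assumptions for illustration, not part of this blog:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a built-in binary classification dataset (chosen here only for illustration)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the classifier and predict class labels (0 or 1) for unseen data
model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(predictions[:10])
```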
Consider a dataset with a binary target, i.e. every label is either 0 or 1. If we try to predict or classify these values using Linear Regression, our regression line will show up like this.
From the above image, we can see that the values predicted through Linear Regression are not adequate: for some inputs the output falls below 0 (or rises above 1), which can’t be interpreted as a class or a probability, i.e. it makes no sense at all.
So, we convert the Linear Regression output into a Logistic Regression output, which in turn is a good fit for our classification dataset because its predicted probability never falls below 0 or rises above 1.
There are several ways to do this conversion, but here I’ll discuss the Sigmoid Function. Consider the below image for better understanding.
From the sigmoid function, we can see that whatever value we put in for z, the output always lies between 0 and 1, and this is how Logistic Regression works. The function is given by:
sigmoid(z) = 1 / (1 + e^(-z))
We put the Linear Regression prediction in place of z, and the sigmoid function returns an output between 0 and 1. Values higher than 0.5 are treated as 1, and those less than 0.5 are treated as 0.
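Here is a minimal sketch of that idea in Python with NumPy. The feature values, weight, bias, and the 0.5 threshold below are illustrative assumptions, not values from this blog:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical linear-model output: z = w * x + b
x = np.array([-3.0, -1.0, 0.5, 2.0, 4.0])  # example feature values (assumed)
w, b = 1.2, -0.5                           # example weight and bias (assumed)
z = w * x + b

probabilities = sigmoid(z)                         # values between 0 and 1
predictions = (probabilities >= 0.5).astype(int)   # threshold at 0.5

print(probabilities)  # roughly [0.016 0.155 0.525 0.870 0.987]
print(predictions)    # [0 0 1 1 1]
```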
To check the accuracy of Logistic Regression, we use a Confusion Matrix.
A confusion matrix is a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known.
It looks something like this:
Four cases are possible:
- True Positives (TP): the prediction is positive and the actual value is also positive.
- True Negatives (TN): the prediction is negative and the actual value is also negative.
- False Positives (FP): the prediction is positive but the actual value is negative.
- False Negatives (FN): the prediction is negative but the actual value is positive.
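As a rough sketch, these four counts can be read off with scikit-learn’s confusion_matrix; the y_true and y_pred arrays below are made-up labels used only for illustration:

```python
from sklearn.metrics import confusion_matrix

# Made-up true labels and model predictions for illustration
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# With labels=[0, 1], ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(tp, tn, fp, fn)  # 3 3 1 1
```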
Accuracy
Accuracy is the term we hear most often when talking about a model’s performance. It is given by:
Accuracy = (TP+TN) / (TP+TN+FP+FN)
The higher the accuracy, the better the performance of our model.
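Continuing the made-up example from the confusion-matrix sketch above, accuracy follows directly from those four counts; scikit-learn’s accuracy_score gives the same number straight from the labels:

```python
from sklearn.metrics import accuracy_score

# Counts from the hypothetical confusion-matrix example above
tp, tn, fp, fn = 3, 3, 1, 1
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.75

# Same result computed directly from the example labels
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy_score(y_true, y_pred))  # 0.75
```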