Why Linear Regression is not suitable for classification?
Have you ever wondered why there are different algorithms for every problem? Let us consider a simple example of Linear Regression and Logistic Regression.
There are two things that explain why Linear Regression is not suitable for classification. The first one is that Linear Regression deals with continuous values whereas classification problems mandate discrete values.
The second problem is regarding the shift in threshold value when new data points are added. Let us take a simple Linear Regression example and fit a line to it. The below graph shows the best fit line. To make the explanation a bit more simple let us take an example of a classification problem in healthcare. Basically, our aim here is to classify whether a person is sick or not.
Technically the hypothesis function for linear regression can be used for logistic regression also but there is a problem.
The above graph shows the best fit line for the given points. This is a simple example and the real-world data is never this simple. So coming back, when we add another point to this dataset, our best fit line shifts to fit that point. Hence the line becomes like this
So according to our example, if we have to classify whether a person is sick or not, this method is not reliable at all since we use a sigmoid function in logistic regression. If there is a shift in the line then the minimum threshold value for classifying the examples changes. So in the beginning, the threshold might be 0.5 but after adding few more data points the value might be 0.8. So the accuracy of the model takes a hit here. Also deploying a model with such instability in a field such as healthcare is not so good and can have dire consequences.
Hence to avoid this we use an algorithm called the Logistic Regression which is a binary classification algorithm to stepover these practical problems that hold back Linear Regression for classification.
Logistic Regression deals with discrete values unlike Linear Regression and also maintains the value of the threshold even when new data points are added.