Logistic Regression

ranga_vamsi · Published in The Startup · 4 min read · Jul 14, 2020

What if I told you that, despite its name, it is a classification algorithm?

Logistic Regression

It’s a classification algorithm used to predict the probability of a categorical dependent variable from one or more independent variables. Unlike linear regression, where the dependent variable is continuous, here the dependent variable is categorical; and instead of fitting a straight line or hyperplane, logistic regression uses the logistic function to squeeze the output of a linear equation between 0 and 1.


Binomial Logistic Regression deals with situations where the dependent variable has two levels (categories). Problems where the dependent variable has three or more unordered values are modeled with Multinomial Logistic Regression, while Ordinal Logistic Regression deals with multiple ordered categories.

Why Logistic?

Logistic Function

The Logistic Function, or Logistic Curve, always outputs a value between 0 and 1 (0 < p < 1). It is an S-shaped curve with asymptotes at 0 and 1 that crosses the y-axis at 0.5. It is also known as the Sigmoid function.

Sigmoid Function: σ(z) = 1 / (1 + e^(−z))
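As a quick sketch in Python (using NumPy, which is my assumption; the article doesn’t list dependencies), the sigmoid and its key properties look like this:

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # 0.5 — the curve crosses the y-axis here
print(sigmoid(10.0))   # close to 1 (upper asymptote)
print(sigmoid(-10.0))  # close to 0 (lower asymptote)
```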

The logistic regression hypothesis generalizes from the linear regression hypothesis with output transformed by the logistic function.

Suppose we predicted the probability using the linear regression hypothesis, which is:

Linear Regression Hypothesis: h(x) = θᵀx

The linear regression model can output any number from negative to positive infinity, whereas a probability can only lie between 0 and 1. So, the logistic regression hypothesis passes the output of the linear equation through the logistic function, forcing it to take only values between 0 and 1.

Logistic Regression Hypothesis: h(x) = σ(θᵀx) = 1 / (1 + e^(−θᵀx))

Cost Function

Linear regression uses mean squared error as its cost function.

The Cost Function of Linear Regression: J(θ) = (1/2m) Σᵢ (h(xᵢ) − yᵢ)²

If this were used in logistic regression, it would be of no use: plugging the sigmoid into the squared error yields a non-convex function with many local minima, and Gradient Descent is only guaranteed to converge to the global minimum when the function is convex.

So, we use a logarithmic (log-loss) cost instead, which is convex and free of local minima.

The Cost Function of Logistic Regression:

Cost(h(x), y) = −log(h(x))      if y = 1
Cost(h(x), y) = −log(1 − h(x))  if y = 0

Combining both cases gives the Simplified Cost Function:

J(θ) = −(1/m) Σᵢ [yᵢ log(h(xᵢ)) + (1 − yᵢ) log(1 − h(xᵢ))]
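The simplified cost can be computed in a few lines of NumPy (the function names here are my own, for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compute_cost(theta, X, y):
    """J(θ) = -(1/m) Σ [y·log(h) + (1-y)·log(1-h)] with h = sigmoid(Xθ)."""
    m = len(y)
    h = sigmoid(X @ theta)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# With θ = 0, every prediction is h = 0.5, so J = log(2) ≈ 0.6931 — the
# familiar starting cost when training logistic regression from zeros.
X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])
y = np.array([1.0, 0.0, 1.0])
print(compute_cost(np.zeros(2), X, y))  # ≈ 0.6931471805599453
```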

Gradient Descent

The cost function is minimized, i.e., we find min J(θ), using the gradient descent approach.

To minimize the cost function, we repeatedly apply the gradient descent update to each parameter θⱼ, i.e.,

Gradient Descent update (repeated until convergence, simultaneously for all j):

θⱼ := θⱼ − (α/m) Σᵢ (h(xᵢ) − yᵢ) xᵢⱼ

The complete derivation of the above equation can be found in this link.
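A minimal sketch of one such update in NumPy (function names and the toy data are mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compute_cost(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient_step(theta, X, y, alpha=0.1):
    """One update: θⱼ := θⱼ − (α/m) Σᵢ (h(xᵢ) − yᵢ)·xᵢⱼ, vectorized over j."""
    m = len(y)
    grad = (X.T @ (sigmoid(X @ theta) - y)) / m
    return theta - alpha * grad

# Toy data: first column acts as the bias term, second is a single feature.
X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5], [1.0, -2.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])
theta = np.zeros(2)
initial_cost = compute_cost(theta, X, y)
for _ in range(100):
    theta = gradient_step(theta, X, y)
final_cost = compute_cost(theta, X, y)
print(initial_cost, final_cost)  # the cost drops from log(2) ≈ 0.693
```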

Decision Boundary

A threshold can be set to predict the class label of the data: based on it, each estimated probability is mapped to a class. The decision boundary can be linear or non-linear depending on the dataset.

Say, for a binary classification problem, we set the threshold to 0.5; then all predicted probabilities ≥ 0.5 belong to one class, and those below the threshold belong to the other.
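As a tiny illustration (the probabilities below are made up, not model output):

```python
import numpy as np

# Hypothetical predicted probabilities from a logistic regression model.
probs = np.array([0.91, 0.35, 0.50, 0.07])
# Probabilities >= 0.5 map to class 1, the rest to class 0.
labels = (probs >= 0.5).astype(int)
print(labels)  # [1 0 1 0]
```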

Implementation in Python

Let’s implement the Binomial Logistic Regression algorithm from scratch in Python.

Binomial Logistic Regression Algorithm
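The original code embed isn’t reproduced here, so the following is a minimal from-scratch sketch of what such an implementation could look like. The class name, hyperparameters, and the small synthetic dataset are my own; the results below come from a cancer dataset, which would be loaded in its place.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class BinomialLogisticRegression:
    """Logistic regression trained with batch gradient descent."""

    def __init__(self, learning_rate=0.1, n_iterations=2000):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations

    def fit(self, X, y):
        m, n = X.shape
        self.W = np.zeros(n)   # coefficients
        self.b = 0.0           # intercept
        for i in range(self.n_iterations):
            h = sigmoid(X @ self.W + self.b)
            # Simplified cross-entropy cost (small epsilon avoids log(0)).
            cost = -np.mean(y * np.log(h + 1e-15)
                            + (1 - y) * np.log(1 - h + 1e-15))
            # Gradient descent update for W and b.
            self.W -= self.learning_rate * (X.T @ (h - y)) / m
            self.b -= self.learning_rate * np.sum(h - y) / m
            if i % 500 == 0:
                print(f"Cost after {i} iterations is: {cost}")
        return self

    def predict(self, X, threshold=0.5):
        return (sigmoid(X @ self.W + self.b) >= threshold).astype(int)

# Demo on a small synthetic two-class dataset (not the article's cancer data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.5, 1.0, size=(50, 2)),
               rng.normal(+1.5, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)
model = BinomialLogisticRegression().fit(X, y)
accuracy = (model.predict(X) == y).mean()
print(f"Train Accuracy: {accuracy:.2%}")
```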

Results:

Cancer Dataset:
Cost after 0 iterations is: 0.6931471805599447
Cost after 500 iterations is: 0.24196914662367217
Cost after 1000 iterations is: 0.21544802658515755
Cost after 1500 iterations is: 0.20258675514240815
Cost after 2000 iterations is: 0.19486411727087208
Cost after 2500 iterations is: 0.1896653529907702
Cost after 3000 iterations is: 0.1859065131539027
Cost after 3500 iterations is: 0.18305247729053967
Cost after 4000 iterations is: 0.18080480586334863
Cost after 4500 iterations is: 0.17898226127301103
Train Accuracy: 93.91%
Test Accuracy: 90.14%
Logistic Regression Model Coefficients (W): [-6.92534083e-03 -7.70592854e-03 -7.22792163e-03 -4.30796399e-02
-1.75008495e-02 -6.57910624e-05 8.15994214e-05 1.98609157e-04
8.43615554e-05 -1.18864322e-04 -5.56702438e-05 -4.64340265e-05
-6.98007851e-04 2.02622866e-04 1.38441865e-02 -4.69825267e-06
1.46825974e-05 1.64812745e-05 4.10301549e-06 -1.22344027e-05
-1.16889719e-06 -8.04041279e-03 -8.02590088e-03 -4.07644051e-02
2.58361086e-02 -7.29562348e-05 3.31741483e-04 4.95375685e-04
1.33476545e-04 -1.30881116e-04 -3.75699212e-05]
Logistic Regression Model Intercept (b): -0.001074620144465456

You can also get the complete source code from my GitHub repo.

To implement a Multinomial Logistic Regression model, the Softmax function is used in place of the sigmoid.
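A small sketch of the softmax (the stabilizing max-subtraction trick is standard, though not mentioned in the article):

```python
import numpy as np

def softmax(z):
    """Generalizes the sigmoid to K classes: outputs are positive and sum to 1."""
    z = z - np.max(z, axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs, probs.sum())  # three class probabilities summing to 1
```

For K = 2, softmax over the scores [z, 0] reduces to [sigmoid(z), 1 − sigmoid(z)], recovering the binomial case.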

Conclusion

Logistic Regression is a simple and efficient classification algorithm that also provides a probability score for each observation. However, it doesn’t handle a large number of categorical features well.

Thanks for reading :)

If you have any questions or suggestions regarding this article, please let me know. And, 💛 this if this was a good read.

Cheers !!

Happy Learning 😃
