Implementing Logistic Regression with SGD From Scratch

Harshwardhan Jadhav
Published in Analytics Vidhya
5 min read · Feb 1, 2021

A custom implementation of Logistic Regression in Python.

Image source: https://dimensionless.in/logistic-regression-concept-application/

Hello everyone,

If the title Logistic Regression caught your curiosity, this article is here to satisfy it. By the end, you will feel confident about logistic regression, you will be able to implement it on your own, and of course you will be able to explain it to anyone. So without wasting any of your time, let's get started.

You must have heard about Logistic Regression already; it is one of the most famous Machine Learning algorithms. Logistic Regression is a Machine Learning algorithm used for solving classification tasks. Yes, even though its name contains the term "regression", it is a classification algorithm and not a regression algorithm. This means we use it when we have to classify an input, e.g. an image, into one of two classes, such as Cat or Dog.

Okay, we now have some idea of what Logistic Regression is. Another notable thing about LR is that it is mostly used for binary classification problems, i.e. problems with two classes, though it can be extended to multi-class classification problems too. Logistic Regression assumes that the data points we are going to use for training are almost or perfectly linearly separable. Picture a figure like the one the snippet below produces: we have to find a green line separating the blue stars from the orange circles. Keep in mind that the separator can be a line in 2-D space or a plane in 3-D space.
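To make the picture concrete, here is a minimal sketch (the cluster centers, spread, and markers are my own illustrative choices, not from the original figure) that generates two linearly separable clusters and plots them:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
# two Gaussian blobs with well-separated centers -> (almost) linearly separable
stars = rng.normal([2, 2], 0.5, (50, 2))      # class 1
circles = rng.normal([-2, -2], 0.5, (50, 2))  # class 0

plt.scatter(stars[:, 0], stars[:, 1], marker='*', label='class 1 (stars)')
plt.scatter(circles[:, 0], circles[:, 1], marker='o', label='class 0 (circles)')
plt.legend()
plt.show()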

Just like in linear regression, in logistic regression we try to find the slope and the intercept term. Hence, the equation of the line/plane is similar here:

y = mx + c

But the way we use this equation in logistic regression is different: instead of fitting the line to the points, we try to find the line/plane which separates the two types of data points correctly. If we consider the blue stars in the graph above as 1 and the orange circles as 0, we have to predict whether a data point belongs to class 0 or class 1. For this purpose there is something called the Sigmoid Function, such a fancy name. We apply the sigmoid function to our equation y = mx + c, i.e. sigmoid(mx + c); this is what Logistic Regression at its core is. But what is this sigmoid function doing inside? Let's see:

sigmoid(z) = 1 / (1 + e^(-z))

here, z = mx + c
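To get a feel for it, the sigmoid squashes any real number into the range (0, 1), with sigmoid(0) = 0.5, so its output can be read as the probability of class 1. A minimal sketch:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

print(sigmoid(-10), sigmoid(0), sigmoid(10))
# ~0.0000454, 0.5, ~0.9999546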

We use this function to predict whether a value belongs to class 0 or class 1. But before that, we need well-generalized values of 'm' and 'c' in order to perform predictions on new data points. For this purpose we use an optimization algorithm to find the optimum values of 'm' and 'c'. We are going to use the Stochastic Gradient Descent (SGD) algorithm, which updates the parameters after looking at one training point at a time. If you don't have much exposure to Gradient Descent, it is worth reading up on it first.
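Concretely, SGD repeats a simple update rule once per training point, stepping each parameter a little against its gradient (LR is the learning rate we choose):

m = m - LR * ∂Loss/∂m
c = c - LR * ∂Loss/∂c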

First, let's define our loss function. For logistic regression we use the log loss (binary cross-entropy):

Loss = -[Ytrue * log(Ypred) + (1 - Ytrue) * log(1 - Ypred)]

here, 'Ytrue' is the true label (0 or 1) and 'Ypred' is the predicted probability,

Ypred = sigmoid(mx + c)
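For intuition: with Ytrue = 1, a confident correct prediction Ypred = 0.9 costs only -log(0.9) ≈ 0.105, while a confident wrong prediction Ypred = 0.1 costs -log(0.1) ≈ 2.303, so the loss punishes confident mistakes heavily.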

Now, we differentiate this loss function with respect to the parameters we want to optimize. So here we have to differentiate w.r.t. ‘m’ and ‘c’.
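Working the derivatives out by hand, using the chain rule and the fact that the derivative of sigmoid(z) is sigmoid(z) * (1 - sigmoid(z)), gives the standard results:

∂Loss/∂m = -(Ytrue - Ypred) * x
∂Loss/∂c = -(Ytrue - Ypred)

These are exactly the quantities the code below computes for each data point.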

Done, the most important requirements are now fulfilled. Let's start coding to implement the above equations in Python:

# first let's import the required libraries
import numpy as np  # for mathematical operations

# toy training data (the same kind of two-cluster example as in the plot
# sketch above); replace these with your own dataset
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2, 2], 0.5, (50, 2)),      # features, class 1
               rng.normal([-2, -2], 0.5, (50, 2))])   # features, class 0
y = np.array([1] * 50 + [0] * 50)  # labels of all data points

# Initialize the weights and bias, i.e. 'm' and 'c'
m = np.zeros_like(X[0])  # array with shape equal to no. of features
c = 0
LR = 0.0001  # the learning rate
epochs = 50  # no. of passes over the data for optimization

# Define the sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Performing Stochastic Gradient Descent optimization
# for every epoch
for epoch in range(1, epochs + 1):
    # for every data point (X[i], y[i])
    for i in range(len(X)):
        y_pred = sigmoid(np.dot(m.T, X[i]) + c)

        # gradient of the log loss w.r.t. 'm': -(Ytrue - Ypred) * x
        gr_wrt_m = -X[i] * (y[i] - y_pred)
        # gradient of the log loss w.r.t. 'c': -(Ytrue - Ypred)
        gr_wrt_c = -(y[i] - y_pred)

        # update m and c by stepping against the gradients
        m = m - LR * gr_wrt_m
        c = c - LR * gr_wrt_c

# At the end of all epochs we have the optimum values of 'm' and 'c',
# so by using those values we can perform predictions
predictions = []
for i in range(len(X)):
    z = np.dot(m, X[i]) + c
    y_pred = sigmoid(z)
    if y_pred >= 0.5:
        predictions.append(1)
    else:
        predictions.append(0)
# 'predictions' contains all the predicted class labels using the optimum 'm' and 'c'
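As a quick sanity check, we can compare the predicted labels with the true labels. On an easily separable toy set like the one above, a sketch like this should report an accuracy close to 1.0:

accuracy = np.mean(np.array(predictions) == y)
print(f"Training accuracy: {accuracy:.3f}")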

This is how we implement Logistic Regression from scratch using Python. I hope you enjoyed this article and are excited to try it yourself, so without wasting time, go and implement your very own Logistic Regression classifier and use it to perform prediction tasks.

Thank you so much for making it to the end. See you in the next article; till then, have a good time and keep learning.
