Logistic Regression in Python from Scratch

Dhiraj K
Sep 5 · 3 min read

Let’s understand the basics of Logistic Regression

Logistic Regression in Python From Scratch

Introduction:

When we implement Logistic Regression using sklearn, we simply call sklearn’s methods; we do not implement the algorithm from scratch.

In this article, I will be implementing a Logistic Regression model without relying on Python’s easy-to-use sklearn library. This post aims to discuss the fundamental mathematics and statistics behind a Logistic Regression model. I hope this will help us fully understand how Logistic Regression works in the background.


General Terms:

Let us first discuss a few statistical concepts used in this post.

Sigmoid: A sigmoid function is an activation function defined as sigmoid(x) = 1 / (1 + e^(-x)). Its output always lies between 0 and 1, which lets us interpret it as a probability.

Optimization: Optimization is the process of adjusting the parameters of a machine learning model to minimize (or maximize) a selected loss function. Here we use gradient descent, which repeatedly moves each parameter in the direction that reduces the loss.

Import Libraries:

We are going to import the NumPy and pandas libraries.

import numpy as np
import pandas as pd

Load Data:

We will use pandas to load the CSV data into a data frame.

df = pd.read_csv('Classification-Data.csv')
df.head()
Logistic Regression from Scratch Data
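If you do not have this exact CSV at hand, a small synthetic data frame with the same columns lets you run the rest of the code. This stand-in is my own addition, not the original data:

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    'Glucose': rng.normal(120, 30, n),        # synthetic glucose readings
    'BloodPressure': rng.normal(70, 10, n),   # synthetic blood pressure readings
})
# Synthetic binary label loosely driven by glucose level
df['Diabetes'] = (df['Glucose'] + rng.normal(0, 20, n) > 130).astype(int)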

Implementation:

Let us first separate the features and labels.

x = df[['Glucose','BloodPressure']]  # feature columns
y = df['Diabetes']                   # binary label
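Before going further, a quick sanity check on the shapes and the class balance is useful. This snippet is my addition and assumes the data frame loaded above:

print(x.shape, y.shape)   # (n_samples, 2) and (n_samples,)
print(y.value_counts())   # distribution of the binary Diabetes label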

Then we need to define the sigmoid function.

def sigmoid(input):
    output = 1 / (1 + np.exp(-input))
    return output
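As a quick check (my addition, not in the original), the sigmoid maps 0 to 0.5 and squashes large magnitudes toward 0 or 1:

print(sigmoid(0))                           # 0.5
print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # approximately [0.0067 0.5 0.9933]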

After that, let us define the optimization function.

def optimize(x, y, learning_rate, iterations, parameters):
    size = x.shape[0]
    weight = parameters["weight"]
    bias = parameters["bias"]
    for i in range(iterations):
        # Forward pass: predicted probabilities
        sigma = sigmoid(np.dot(x, weight) + bias)
        # Binary cross-entropy loss
        loss = -1/size * np.sum(y * np.log(sigma) + (1 - y) * np.log(1 - sigma))
        # Gradients of the loss with respect to weight and bias
        dW = 1/size * np.dot(x.T, (sigma - y))
        db = 1/size * np.sum(sigma - y)
        # Gradient descent update
        weight -= learning_rate * dW
        bias -= learning_rate * db

    parameters["weight"] = weight
    parameters["bias"] = bias
    return parameters
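Note that the loss is computed on every iteration but never reported. If you want to watch convergence, one option (my addition) is to print it periodically; the fragment below would go inside the for loop, right after the loss line:

if i % 100 == 0:
    print(f"iteration {i}: loss = {loss:.4f}")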

Then we need to initialize the parameters.

init_parameters = {}
init_parameters["weight"] = np.zeros(x.shape[1])  # one weight per feature
init_parameters["bias"] = 0

It is time to define the training function now

def train(x, y, learning_rate, iterations):
    parameters_out = optimize(x, y, learning_rate, iterations, init_parameters)
    return parameters_out

Then we are ready for training the model.

parameters_out = train(x, y, learning_rate = 0.02, iterations = 500)
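Once training finishes, you can inspect the learned coefficients (my addition):

print("weight:", parameters_out["weight"])
print("bias:", parameters_out["bias"])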

Finally, we will predict using the model.

Logistic Regression From Scratch — Model Training and Prediction
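The prediction step appeared only as a screenshot in the original post; a minimal sketch of what it plausibly looked like is to run features through the learned weights and threshold the sigmoid output at 0.5. Using the first ten rows is my own choice for illustration:

output_values = np.dot(x[:10], parameters_out["weight"]) + parameters_out["bias"]
predictions = sigmoid(output_values) >= 0.5  # probability above 0.5 means class 1
print(predictions)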

Endnotes:

In this article, I built a Logistic Regression model from scratch without using the sklearn library. However, if you compare it with sklearn’s implementation, you will find that it gives nearly the same result.
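To check that claim yourself, you could fit sklearn’s LogisticRegression on the same data and compare predictions. This snippet is my addition; note that sklearn applies L2 regularization by default, so the coefficients will not match exactly:

from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(x, y)
print(model.coef_, model.intercept_)  # compare against weight and bias above
print(model.predict(x[:10]))          # compare against our predictions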

The code is uploaded to GitHub here.

Happy Coding!!
