Logistic Regression: All You Wanna Know

Complete implementation from scratch

Logistic Regression models the probability that an example belongs to a certain class, so that each example can be assigned a label of 0 or 1. It is a statistical, supervised learning model and a go-to method for binary classification.

So it is essentially a classification technique. Here we will look at binary classification.

We use the Sigmoid Function in Logistic Regression because we want our output to lie between 0 and 1. It is an S-shaped curve.
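Written out (this is exactly what the sigmoid(x) function in the implementation below computes):

\sigma(x) = \frac{1}{1 + e^{-x}}

For large positive x the output approaches 1, for large negative x it approaches 0, and \sigma(0) = 0.5.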

Hypothesis Function

Combining the linear model with the sigmoid gives us →
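This is what hypothesis(X, theta) computes below: the linear model \theta^T x fed into the sigmoid.

h_\theta(x) = \sigma(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}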

This function always gives us a value between 0 and 1, which is the probability that the point belongs to the positive class. As a point gets closer to the decision boundary, the output approaches 0.5 and the prediction becomes less confident. We then apply a threshold to produce a decision: if the value is greater than or equal to 0.5 we predict class 1, otherwise class 0.
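As a formula, this decision rule (implemented by the predict function below) is:

\hat{y} = \begin{cases} 1 & \text{if } h_\theta(x) \ge 0.5 \\ 0 & \text{otherwise} \end{cases}

Since \sigma(z) \ge 0.5 exactly when z \ge 0, this threshold corresponds to the linear boundary \theta^T x = 0.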

Log Loss (Binary Cross Entropy)

In this loss, the true label is multiplied by the log of the predicted probability. The first term is active when the label is 1: it multiplies the label by the log-confidence of the point being positive. Similarly, the second term is active when the label is 0: it multiplies (1 − y) by the log-confidence of the point being negative.
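The full loss, averaged over all m training examples (this is what the error function below computes):

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]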

Linear Regression Blog Link

Gradient Descent

We have not taken the negative sign while calculating the derivative here. Apart from the hypothesis being the sigmoid instead of a plain linear function, the gradient descent update has the same form as in linear regression.
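For reference, the textbook gradient and update rule are:

\nabla_\theta J = \frac{1}{m} X^T \left( h_\theta(X) - y \right), \qquad \theta := \theta - \alpha \, \nabla_\theta J

The code below computes X^T (y − h) instead, absorbing the minus sign, which is why the update there is theta = theta + lr * grad.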

Now we are all set to implement this →

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hypothesis(X, theta):
    # X - entire data array (m, n+1)
    # theta - np.array (n+1, 1)
    return sigmoid(np.dot(X, theta))

def error(X, y, theta):
    """
    parameters:
    X - (m, n+1)
    y - (m, 1)
    theta - (n+1, 1)
    returns scalar value of the loss
    """
    hi = hypothesis(X, theta)
    # binary cross entropy, averaged over all m examples
    e = -1 * np.mean(y * np.log(hi) + (1 - y) * np.log(1 - hi))
    return e

def gradient(X, y, theta):
    """
    parameters:
    X - (m, n+1)
    y - (m, 1)
    theta - (n+1, 1)
    returns gradient vector (n+1, 1)
    """
    hi = hypothesis(X, theta)
    # (y - hi) instead of (hi - y): the minus sign is absorbed here,
    # so gradient_descent adds lr*grad instead of subtracting it
    grad = np.dot(X.T, (y - hi))
    return grad / X.shape[0]

def gradient_descent(X, y, lr=0.1, max_itr=500):
    n = X.shape[1]
    theta = np.zeros((n, 1))

    error_list = []
    for i in range(max_itr):
        err = error(X, y, theta)
        error_list.append(err)

        grad = gradient(X, y, theta)
        # plus sign because gradient() already flipped the sign
        theta = theta + lr * grad
    return theta, error_list

def predict(X, theta):
    h = hypothesis(X, theta)
    output = np.zeros(h.shape)
    output[h >= 0.5] = 1  # threshold the probability at 0.5
    output = output.astype('int')
    return output

Now just apply logistic regression to any binary classification problem, like diabetes classification, or import the breast_cancer dataset from sklearn and then apply this!!
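Here is a minimal sketch of how that could look with sklearn's breast_cancer dataset, reusing the functions defined above. The feature standardization and the prepended column of ones (for the bias term) are my additions, not part of the original post; they make the (m, n+1) shapes in the docstrings line up and keep the sigmoid from saturating.

import numpy as np
from sklearn.datasets import load_breast_cancer

# load the data: X is (m, n), y is reshaped to (m, 1)
data = load_breast_cancer()
X, y = data.data, data.target.reshape(-1, 1)

# standardize features (my addition, so np.exp does not overflow)
X = (X - X.mean(axis=0)) / X.std(axis=0)

# prepend a column of ones for the bias term -> shape (m, n+1)
X = np.hstack([np.ones((X.shape[0], 1)), X])

theta, error_list = gradient_descent(X, y, lr=0.1, max_itr=500)
preds = predict(X, theta)
print("training accuracy:", (preds == y).mean())

You can also plot error_list to confirm that the loss decreases with every iteration.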
