# Logistic Regression: All You Wanna Know

Complete implementation from scratch

Logistic Regression is used to model the probability of a certain class so that each observation can be assigned a value of 0 or 1. It is a statistical, supervised learning model and a go-to method for binary classification.

For instance, you may want to predict whether a person has diabetes or not, predict survival (1) or not (0) from a dataset, or classify an email as spam (1) or not spam (0).

So basically it is a classification technique; here we will focus on binary classification.

We use the **Sigmoid Function** in Logistic Regression because we want our output to lie between 0 and 1; its graph is an S-shaped curve.
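The original figure is missing here; the standard sigmoid formula, which matches the `sigmoid` function implemented below, is:

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}
```

As z goes to negative infinity the output approaches 0, and as z goes to positive infinity it approaches 1, with σ(0) = 0.5.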

# Hypothesis Function

Combining the linear model with the sigmoid gives us our hypothesis function.
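The formula image is missing; written out, the hypothesis (matching the code's `hypothesis` function below, with x including a bias term) is:

```latex
h_\theta(x) = \sigma(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}
```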

This function always gives us a value between 0 and 1, which we interpret as the probability that the point belongs to class 1. Points near the decision boundary get probabilities close to 0.5, so predictions there are less confident (the model is "confused" about them). We apply a decision rule: if the value is greater than or equal to 0.5 we predict class 1, otherwise class 0.

# Log Loss (Binary Cross Entropy)


In the loss, the true label multiplies the log of the predicted probability. The first part of the formula is active when the label is 1, weighting the model's confidence that the point is positive; similarly, the second part is active when the label is 0, weighting its confidence that the point is negative.
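The formula image is missing; the binary cross entropy over m training examples, as implemented in the `error` function below, is:

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
\Big[\, y^{(i)} \log h_\theta(x^{(i)})
      + (1 - y^{(i)}) \log\big(1 - h_\theta(x^{(i)})\big) \Big]
```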

We have to minimize this loss function, and for that we will use gradient descent, just as we did in Linear Regression.


# Gradient Descent

When computing the derivative we kept the minus sign inside the gradient expression, so in code we *add* the gradient to theta instead of subtracting it; apart from the sigmoid inside the hypothesis, the update is the same as in Linear Regression's gradient descent.
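The derivation image is missing; written out, the gradient with the minus sign kept inside, and the resulting update rule (consistent with the `gradient` and `gradient_descent` functions below, with learning rate α), is:

```latex
\frac{\partial J}{\partial \theta} = -\frac{1}{m} X^T \big(y - h_\theta(X)\big),
\qquad
\theta := \theta - \alpha \frac{\partial J}{\partial \theta}
        = \theta + \frac{\alpha}{m} X^T \big(y - h_\theta(X)\big)
```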

Now we are all set to implement this:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hypothesis(X, theta):
    # X - entire array (m, n+1)
    # theta - np.array (n+1, 1)
    return sigmoid(np.dot(X, theta))

def error(X, y, theta):
    """
    parameters:
        X - (m, n+1)
        y - (m, 1)
        theta - (n+1, 1)
    returns scalar value of loss
    """
    hi = hypothesis(X, theta)
    e = -1 * np.mean(y * np.log(hi) + (1 - y) * np.log(1 - hi))
    return e

def gradient(X, y, theta):
    """
    parameters:
        X - (m, n+1)
        y - (m, 1)
        theta - (n+1, 1)
    returns gradient vector (n+1, 1)
    """
    hi = hypothesis(X, theta)
    grad = np.dot(X.T, (y - hi))  # minus sign kept inside, as noted above
    return grad / X.shape[0]

def gradient_descent(X, y, lr=0.1, max_itr=500):
    n = X.shape[1]
    theta = np.zeros((n, 1))
    error_list = []
    for i in range(max_itr):
        err = error(X, y, theta)
        error_list.append(err)
        grad = gradient(X, y, theta)
        theta = theta + lr * grad  # add, since grad already carries the sign
    return theta, error_list

def predict(X, theta):
    h = hypothesis(X, theta)
    output = np.zeros(h.shape)
    output[h >= 0.5] = 1
    output = output.astype('int')
    return output
```

Now just apply logistic regression to any binary classification problem, such as diabetes prediction, or import the breast_cancer dataset from sklearn and try it out!
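Here is one way that could look on the breast_cancer dataset, as a quick sketch: the feature standardization and bias column are my additions (not from the post above), and the functions are repeated in condensed form so the snippet runs on its own.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

# --- functions from the post (condensed) ---
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hypothesis(X, theta):
    return sigmoid(np.dot(X, theta))

def gradient(X, y, theta):
    return np.dot(X.T, (y - hypothesis(X, theta))) / X.shape[0]

def gradient_descent(X, y, lr=0.1, max_itr=500):
    theta = np.zeros((X.shape[1], 1))
    for _ in range(max_itr):
        theta = theta + lr * gradient(X, y, theta)
    return theta

def predict(X, theta):
    return (hypothesis(X, theta) >= 0.5).astype(int)

# --- apply to the breast_cancer dataset ---
data = load_breast_cancer()
X, y = data.data, data.target.reshape(-1, 1)

# standardize features so gradient descent converges smoothly (my addition)
X = (X - X.mean(axis=0)) / X.std(axis=0)

# prepend a column of ones as the bias term, giving shape (m, n+1)
X = np.hstack([np.ones((X.shape[0], 1)), X])

theta = gradient_descent(X, y, lr=0.1, max_itr=500)
preds = predict(X, theta)
acc = np.mean(preds == y)
print(f"training accuracy: {acc:.3f}")
```

With standardized features, a few hundred iterations of plain gradient descent should already reach high training accuracy on this dataset.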