# Logistic Regression: All You Wanna Know

Complete implementation from scratch

Logistic Regression is used to model the probability of a certain class so that each observation can be assigned a label of 0 or 1. It is a statistical, supervised learning model and a go-to method for binary classification.

So it is fundamentally a classification technique. Here we will look at binary classification.

We use the sigmoid function in logistic regression because we want our output to lie between 0 and 1.
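The original post shows the sigmoid's S-shaped curve as an image; written out, the function is:

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \sigma(z) \in (0, 1)
```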

# Hypothesis Function

Combining the linear model with the sigmoid gives us the hypothesis function.
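The combined hypothesis, shown as an image in the original, is the sigmoid applied to the linear model:

```latex
h_\theta(x) = \sigma(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}
```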

This function always gives a value between 0 and 1, which we interpret as the probability that a point belongs to the positive class. Points near the decision boundary get probabilities close to 0.5, so predictions there are less confident. The decision rule is simple: if the value is 0.5 or greater we predict class 1, otherwise class 0.

# Log Loss (Binary Cross Entropy)
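The loss formula itself appears as an image in the original; for m training examples it is:

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta\!\left(x^{(i)}\right) + \left(1 - y^{(i)}\right) \log\!\left(1 - h_\theta\!\left(x^{(i)}\right)\right) \right]
```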

Here the true label is multiplied by the log of the predicted probability. The first term of the formula contributes only when the label is 1: the label multiplied by the log-confidence of the point being positive. Similarly, the second term applies when the label is 0, multiplied by its log-confidence of being negative.

We have not kept the negative sign while calculating the derivative, so the update adds the gradient instead of subtracting it. This sign convention is the only difference from linear regression's gradient descent formula.
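Written out, the gradient of the log loss is (1/m)Xᵀ(h − y); dropping the leading minus sign, as the article does, flips this to (1/m)Xᵀ(y − h), which is why the code's update step is `theta + lr*grad`:

```latex
\frac{\partial J}{\partial \theta} = \frac{1}{m} X^T \left(h_\theta(X) - y\right)
\quad\Longrightarrow\quad
\text{grad} = \frac{1}{m} X^T \left(y - h_\theta(X)\right),\qquad
\theta \leftarrow \theta + \alpha \cdot \text{grad}
```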

Now we are all set to implement this:

```python
import numpy as np

def sigmoid(x):
    return 1.0/(1.0 + np.exp(-x))

def hypothesis(X, theta):
    # X - entire array (m,n+1)
    # theta - np.array (n+1,1)
    return sigmoid(np.dot(X, theta))

def error(X, y, theta):
    """
    parameters:
    X - (m,n+1)
    y - (m,1)
    theta - (n+1,1)
    returns scalar value of loss
    """
    hi = hypothesis(X, theta)
    e = -1*np.mean(y*np.log(hi) + (1-y)*np.log(1-hi))
    return e

def gradient(X, y, theta):
    """
    parameters:
    X - (m,n+1)
    y - (m,1)
    theta - (n+1,1)
    returns gradient vector (n+1,1)
    """
    hi = hypothesis(X, theta)
    grad = np.dot(X.T, (y - hi))
    return grad/X.shape[0]      # divide by m (number of examples), not the shape tuple

def gradient_descent(X, y, lr=0.1, max_itr=500):
    n = X.shape[1]              # number of columns (features plus bias)
    theta = np.zeros((n, 1))

    error_list = []
    for i in range(max_itr):
        err = error(X, y, theta)
        error_list.append(err)

        grad = gradient(X, y, theta)
        theta = theta + lr*grad  # "+" because the minus sign was folded into the gradient

    return theta, error_list

def predict(X, theta):
    h = hypothesis(X, theta)
    output = np.zeros(h.shape)
    output[h >= 0.5] = 1
    output = output.astype('int')
    return output
```

Now just apply this logistic regression to any binary classification problem, such as diabetes classification, or import the breast_cancer dataset from sklearn and try it there!
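Here is a minimal end-to-end sketch on sklearn's breast_cancer dataset. The feature standardization, the bias column, and the probability clipping inside `error` are my own additions for numerical stability; they are not in the original article.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hypothesis(X, theta):
    return sigmoid(np.dot(X, theta))

def error(X, y, theta):
    hi = hypothesis(X, theta)
    hi = np.clip(hi, 1e-12, 1 - 1e-12)  # my addition: avoid log(0)
    return -np.mean(y * np.log(hi) + (1 - y) * np.log(1 - hi))

def gradient(X, y, theta):
    hi = hypothesis(X, theta)
    return np.dot(X.T, (y - hi)) / X.shape[0]

def gradient_descent(X, y, lr=0.1, max_itr=500):
    theta = np.zeros((X.shape[1], 1))
    error_list = []
    for _ in range(max_itr):
        error_list.append(error(X, y, theta))
        theta = theta + lr * gradient(X, y, theta)
    return theta, error_list

def predict(X, theta):
    return (hypothesis(X, theta) >= 0.5).astype(int)

data = load_breast_cancer()
X, y = data.data, data.target.reshape(-1, 1)

# Standardize features so gradient descent converges, then add a bias column.
X = (X - X.mean(axis=0)) / X.std(axis=0)
X = np.hstack([np.ones((X.shape[0], 1)), X])

theta, errors = gradient_descent(X, y)
acc = np.mean(predict(X, theta) == y)
print(f"training accuracy: {acc:.3f}")
```

The loss in `errors` should fall steadily, and training accuracy lands well above 90% on this dataset.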


## More from Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com