Fundamentals of Logistic Regression
In this article we’ll learn binary classification using logistic regression.
This is the first part of a series on Logistic Regression:
- Fundamentals of Logistic Regression
- Coding Logistic Regression in Python From Scratch
Logistic regression is a supervised learning algorithm borrowed from statistics. The name “logistic” comes from the logistic function, also called the sigmoid function.
This function maps any real input into the range (0, 1), which itself gives an intuition for its use in binary classification.
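As a quick sketch (using NumPy, which the later coding part of this series assumes), the sigmoid function looks like this:

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Large negative inputs approach 0, large positive inputs approach 1,
# and sigmoid(0) is exactly 0.5.
```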
Hypothesis
In machine learning, a hypothesis is an equation used to predict the output for a given input. Here, the hypothesis gives the probability that Y = 1. It is not expected to be accurate at first; we apply a learning algorithm to improve it.
H = sigmoid(Z) = 1 / (1 + e^(−Z)), where Z = WX + b

Where,
W → weight matrix
b → bias (intercept) term
X → input matrix
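A minimal sketch of the hypothesis, assuming W is a vector of shape (n_features,), X a matrix of shape (m_samples, n_features), and b a scalar (these shapes are illustrative choices, not prescribed by the text):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(W, b, X):
    """H = sigmoid(WX + b): predicted probability that Y = 1 for each sample."""
    Z = X @ W + b        # linear combination of inputs
    return sigmoid(Z)    # squash into (0, 1)
```

With all-zero weights, every sample gets probability 0.5, reflecting that an untrained hypothesis is not yet accurate.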
Decision Boundary
Z = WX + b defines the decision boundary of the hypothesis.
Predict y = 1, for Z ≥ 0
And y = 0, for Z < 0
Cost Function
Cost Function is a function that measures the performance of a Machine Learning model for given data. Cost Function quantifies the error between predicted values and expected values and presents it in the form of a single real number.
J = −(1/m) Σ [Y log(H) + (1 − Y) log(1 − H)]

Where,
Y → Output
H → Hypothesis Output
m → Number of Samples
The cost score works like a performance report of the model, which we then improve through optimization.
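The cost function above (binary cross-entropy) can be sketched directly from its definition:

```python
import numpy as np

def cost(H, Y):
    """Binary cross-entropy cost.

    H: predicted probabilities from the hypothesis, Y: true labels (0 or 1).
    Returns a single real number summarizing the error over m samples.
    """
    m = Y.shape[0]
    return -(1.0 / m) * np.sum(Y * np.log(H) + (1 - Y) * np.log(1 - H))
```

For an untrained model that outputs 0.5 everywhere, the cost is ln 2 ≈ 0.693, a common sanity check.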
Gradient Descent Optimization Algorithm
The graph above is a cost-versus-weight curve. As we move toward the ideal weight the cost decreases, and as we pass the ideal weight the cost increases again. We want our hypothesis to reach the minimum of this curve.
The idea behind the gradient descent algorithm is to change the weights in proportion to the gradient: where the curve is steep, the change is large; where the curve is gradual, the change is small.
The gradient descent update equations are:

W := W − α (∂J/∂W)
b := b − α (∂J/∂b)

The gradients can be found by taking derivatives of the cost function:

∂J/∂W = (1/m) Xᵀ(H − Y)
∂J/∂b = (1/m) Σ (H − Y)

Where α (alpha) → learning rate.
This procedure is repeated for a number of iterations, as the descent takes place in small steps.
The learning rate also matters for balancing training time and accuracy. Set it too low and training slows down; set it too high and the descent may overshoot and never reach the minimum.
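Putting the update equations into a loop gives a minimal batch gradient descent sketch (the defaults for `alpha` and `iterations` are illustrative, not tuned values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, Y, alpha=0.1, iterations=1000):
    """Train logistic regression weights by batch gradient descent.

    X: (m, n) input matrix, Y: (m,) labels in {0, 1}.
    Returns the learned weight vector W and bias b.
    """
    m, n = X.shape
    W = np.zeros(n)
    b = 0.0
    for _ in range(iterations):
        H = sigmoid(X @ W + b)       # current hypothesis output
        dW = (X.T @ (H - Y)) / m     # ∂J/∂W
        db = np.sum(H - Y) / m       # ∂J/∂b
        W -= alpha * dW              # step opposite the gradient
        b -= alpha * db
    return W, b
```

Each iteration moves W and b a small step against the gradient of the cost, so the cost shrinks toward the minimum of the curve described above.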
Prediction
Once the model is built and the hypothesis is optimized, we can predict the outputs for the test samples.
We know that the hypothesis gives the probability that Y = 1. Hence it is safe to say that:
H≥0.5 → P=1
H<0.5 → P=0
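The thresholding rule above is a one-liner in code (the 0.5 threshold matches the text; making it a parameter is an added convenience):

```python
import numpy as np

def predict(H, threshold=0.5):
    """Convert hypothesis probabilities H into class labels 0 or 1."""
    return (H >= threshold).astype(int)

predict(np.array([0.2, 0.7, 0.5]))  # → array([0, 1, 1])
```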
Hope you enjoyed and learned the concepts of Logistic Regression. I’d like to express my gratitude to Andrew Ng, from whom I first learned these concepts.
If you have stayed till the end — Do Clap. It’ll keep me motivated to write more. Thank You.