Machine Learning 101

Part 5: Logistic Regression

Bzubeda
3 min read · Dec 15, 2023

In the previous part, Part 4: Linear Regression, we learned how Linear Regression works and which assumptions the algorithm makes, using an example.

Let’s take an example to understand what Logistic Regression is, how it works, and the different types of Logistic Regression.

Logistic Regression is a Supervised Learning algorithm that solves classification problems using the Sigmoid function (for binary classification) or the Softmax function (for multi-class classification). It outputs the probability of an event occurring (Yes or No) as a value between 0 and 1: values closer to 0 indicate a low probability, and values closer to 1 indicate a high probability of the event occurring.

Sigmoid Mathematical equation:

Image Source — Sigmoid function
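For readers who cannot see the image, the sigmoid function in its standard form is written out here. The symbols are assumptions made for illustration and are not taken from the original figure: z is the weighted sum of the features x_1 ... x_n with weights w_1 ... w_n and bias b.

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad z = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b
```

Whatever the value of z, the output always lies strictly between 0 and 1, which is why it can be read as a probability.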

Softmax Mathematical equation:

Image Source — Softmax function
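For readers who cannot see the image, the softmax function in its standard form is written out here. The notation is an assumption made for illustration: z_k is the raw score for class k, and K is the number of classes.

```latex
\operatorname{softmax}(z)_k = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}, \qquad k = 1, \dots, K
```

Each output lies between 0 and 1, and the K outputs sum to 1, so they can be read as one probability per class.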

Suppose we need to predict whether a patient is diagnosed with Diabetes or not. Let’s assume we have features such as the number of pregnancies, insulin, glucose, blood pressure, skin thickness, and BMI that are used to predict the target variable, Diabetes.

In this case, where the target variable is categorical with only two possibilities (Yes and No), we can use Logistic Regression as our Machine Learning algorithm. Here, the Machine Learning model learns from past diabetic and non-diabetic cases and predicts the outcome for a new patient.

If we get the target output as a probability of 0.80, we can say that there is an 80% probability of the patient being diagnosed with Diabetes. If we get a probability of 0.20, we can say that there is only a 20% probability of the patient being diagnosed with Diabetes.

The probabilities of the two outcomes (e.g. Yes = 0.80 and No = 0.20) always sum to 1.

Image Source — Sigmoid Curve

Here, y is the target output (Yes or No).
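The way a binary model turns patient features into a probability can be sketched in a few lines of Python. This is only a minimal illustration: the weights, the bias, and the patient values below are hypothetical numbers chosen by hand, not values learned from real data, and only two of the features (glucose and BMI) are used to keep it short.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real-valued score z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights and bias, picked by hand for illustration only.
WEIGHTS = {"glucose": 0.03, "bmi": 0.08}
BIAS = -6.0

def predict_proba(patient: dict) -> float:
    """Return the modelled probability of the 'Yes' (diabetic) class."""
    z = BIAS + sum(WEIGHTS[name] * patient[name] for name in WEIGHTS)
    return sigmoid(z)

patient = {"glucose": 150.0, "bmi": 32.0}  # made-up feature values
p_yes = predict_proba(patient)
p_no = 1.0 - p_yes  # the two outcomes are complementary
print(f"P(Yes) = {p_yes:.2f}, P(No) = {p_no:.2f}, sum = {p_yes + p_no:.2f}")
```

In practice the weights are learned from labelled data rather than set by hand, but the final step is the same: a weighted sum of the features is passed through the sigmoid to produce a probability.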

Note: Linear Regression and Logistic Regression both have Regression in their names, but they are used for two completely different purposes. Linear Regression predicts a continuous numerical value that cannot be separated into a finite set of categories, whereas Logistic Regression predicts a categorical target.

There are 3 types of Logistic Regression:

1) Binomial Logistic Regression: used for binary classification problems with only two possible targets (Yes or No, Pass or Fail), as in the Diabetes example above.

2) Multinomial Logistic Regression: used for classification problems with three or more unordered possible targets. For example, predicting whether an image contains a cat, a dog, or a horse.

3) Ordinal Logistic Regression: used for classification problems with three or more ordered possible targets. For example, predicting whether a patient's sugar level is low, medium, or high.
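For the multinomial case, the softmax step can be sketched as follows. The class names come from the cat, dog, or horse example above; the raw scores are hypothetical numbers standing in for what a trained model would produce.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the largest score avoids overflow in exp() without changing the result.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["cat", "dog", "horse"]
scores = [2.0, 1.0, 0.1]  # hypothetical raw scores, one per class
probs = softmax(scores)

# The predicted class is the one with the highest probability.
predicted = classes[probs.index(max(probs))]

for name, p in zip(classes, probs):
    print(f"{name}: {p:.3f}")
print("predicted:", predicted)
```

The class with the largest raw score always receives the largest probability, so softmax changes the scale of the scores but not their ranking.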

Now, another important question is: when is it best to use Logistic Regression?

Logistic Regression is usually preferred for binary classification problems where the dataset is small to moderate in size. It also assumes linearity, that is, a linear relationship between the features and the log-odds of the target.
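The linearity assumption mentioned above is usually stated in terms of the log-odds rather than the probability itself. As a sketch, with p as the probability of the Yes class and with x_1 ... x_n, w_1 ... w_n, and b as assumed names for the features, weights, and bias:

```latex
\log\frac{p}{1 - p} = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b
```

Solving this equation for p gives back the sigmoid function, so the model is linear in the log-odds even though the probability curve is S-shaped.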

Stay tuned: in the next part, we will look at what a Decision Tree is and how it works. Please share your views, thoughts, and comments below. Feel free to ask any questions.
