Essential guide to handle Outliers for your Logistic Regression Model

Improve the model performance by removing the outliers

Published in Geek Culture

5 min read, Aug 25, 2021

Real-world datasets often contain many missing values and outlying data points. Outliers can result from data corruption, measurement or experimental errors, or human error. Their presence can affect a model to a great extent, so handling outliers is an essential component of the feature engineering pipeline.

There are various techniques for handling outliers in a dataset, including the interquartile range (IQR) method, box plots, z-scores, and many more.
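As a rough illustration of two of the techniques named above, here is a minimal sketch of IQR-based and z-score-based outlier flagging with NumPy. The function names, the 1.5 IQR multiplier, the z-score thresholds, and the sample values are assumptions chosen for illustration, not values taken from this article.

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is a common convention)."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)

def zscore_outlier_mask(values, threshold=3.0):
    """Flag points whose absolute z-score exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        # All values identical: nothing can be an outlier.
        return np.zeros(values.shape, dtype=bool)
    z = (values - values.mean()) / std
    return np.abs(z) > threshold

# Hypothetical feature column with one extreme value at the end.
sample = np.array([10, 12, 11, 13, 12, 11, 10, 12, 95])

iqr_mask = iqr_outlier_mask(sample)
z_mask = zscore_outlier_mask(sample, threshold=2.5)

print("IQR outliers:    ", sample[iqr_mask])
print("z-score outliers:", sample[z_mask])
```

The z-score threshold is lowered to 2.5 here because the sample is tiny: with only a handful of points, a single extreme value inflates the standard deviation enough that its own z-score may never reach 3.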

To learn how to detect outliers in an unlabeled dataset using Silhouette Analysis, read this article:

Handling Outliers in Clusters using Silhouette Analysis

Identify and remove outliers in each cluster from K-Means clustering

towardsdatascience.com

Logistic Regression models are relatively robust to outliers because the sigmoid function tapers their effect. Extreme outliers, however, can still degrade the model's performance. In this article, we will discuss how to improve the performance of…
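The tapering effect of the sigmoid mentioned above can be sketched numerically. The snippet below is only an illustration with hypothetical input scores: it shows that the output saturates for large inputs, not how fitted coefficients respond to outliers.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real-valued score into the range 0 to 1."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical linear scores; +/-50 stand in for extreme outliers.
scores = np.array([-50.0, -5.0, 0.0, 5.0, 50.0])
probs = sigmoid(scores)

for s, p in zip(scores, probs):
    print(f"score={s:6.1f} -> probability={p:.6f}")
```

A score ten times larger than 5 moves the predicted probability by less than 0.01, which is the sense in which the sigmoid dampens extreme values.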

Written by Satyam Kumar