Outlier Detection (Part 2)
Adjusted Boxplot for Skewed Distributions
Introduction
In the previous article, I discussed outlier detection procedures for data that is approximately normally distributed. The procedures include:
- IQR (Inter-Quartile Range)
- Standard deviation
- Z-score
- Modified Z-score
We went through the boxplots produced after applying each of these procedures and reported the number of outliers flagged in each case. In real datasets, however, the distribution is not always normal. It is often skewed and contains unwanted extreme values. In this article, I will go through an outlier detection procedure for skewed distributions and adjust the boxplots accordingly.
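As a quick refresher, the four rules listed above can be sketched in a few lines of Python. This is a minimal sketch and not the code from the previous article: the function names, the 1.5 × IQR multiplier, the cutoff of 3 for the standard deviation and Z-score rules, the cutoff of 3.5 with the 0.6745 constant for the modified Z-score, and the toy data are all assumptions chosen for illustration, based on commonly quoted defaults.

```python
import statistics


def iqr_outliers(data, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed default)."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return [x for x in data if x < q1 - k * iqr or x > q3 + k * iqr]


def std_outliers(data, k=3.0):
    """Flag points outside mean +/- k standard deviations (k=3 is an assumed default)."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return [x for x in data if x < mean - k * sd or x > mean + k * sd]


def zscore_outliers(data, threshold=3.0):
    """Flag points whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return [x for x in data if abs((x - mean) / sd) > threshold]


def modified_zscore_outliers(data, threshold=3.5):
    """Flag points using the median and the median absolute deviation (MAD)."""
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    if mad == 0:
        raise ValueError("MAD is zero; the modified Z-score is undefined here")
    return [x for x in data if 0.6745 * abs(x - med) / mad > threshold]


# Hypothetical toy sample: eleven similar values and one extreme value.
sample = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 50]

for rule in (iqr_outliers, std_outliers, zscore_outliers, modified_zscore_outliers):
    print(rule.__name__, rule(sample))
```

With the same cutoff, the standard deviation rule and the Z-score rule are two ways of writing the same test, so they flag the same points. All four rules measure distance symmetrically around a center, which is why they are best suited to roughly symmetric data.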
Skewed Distribution
The IQR method, as well as the standard deviation, Z-score and modified Z-score methods, works very well for normal or nearly normal distributions. However, the majority of real-world data is not normal and is often skewed. That means the data can have a long tail on either the low side or the high side of the distribution. These types of skewed distributions are shown below. The corresponding boxplot also becomes asymmetric for skewed data.
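To put a number on this asymmetry, here is a small sketch under stated assumptions. It draws a synthetic right-skewed sample from a lognormal distribution, mirrors it to get a left-skewed one, and counts how many points the standard 1.5 × IQR fences flag on each side. The helper names (`skewness`, `iqr_flags`), the sample size, the seed and the lognormal parameters are all hypothetical choices for illustration and are not taken from a real dataset.

```python
import random
import statistics


def skewness(data):
    """Moment-based skewness: the mean of the cubed standardized deviations.

    Positive values indicate a longer tail on the high side,
    negative values a longer tail on the low side.
    """
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return statistics.fmean([((x - mean) / sd) ** 3 for x in data])


def iqr_flags(data, k=1.5):
    """Return the points below the lower fence and above the upper fence."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low = [x for x in data if x < q1 - k * iqr]
    high = [x for x in data if x > q3 + k * iqr]
    return low, high


# Synthetic data (an assumption for illustration only).
rng = random.Random(0)
right_skewed = [rng.lognormvariate(0.0, 1.0) for _ in range(1000)]
left_skewed = [-x for x in right_skewed]  # mirror image: tail on the low side

for name, data in (("right-skewed", right_skewed), ("left-skewed", left_skewed)):
    low, high = iqr_flags(data)
    print(f"{name}: skewness={skewness(data):.2f}, "
          f"flagged low={len(low)}, flagged high={len(high)}")
```

For the right-skewed sample, every flagged point sits above the upper fence, and for the mirrored sample every flagged point sits below the lower fence. The symmetric fences treat much of the long tail as outliers, even though those points are a natural part of a skewed distribution.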