Signal Processing for Scientific data analysis with Python: Part 3

Median filter to remove outliers

Nita Ghosh
Analytics Vidhya


Image by Altuna Akalin

In this third part of signal processing with Python, I’ll discuss the use of a median filter to remove large spikes from a signal. Here are links for the first and second parts. First, let’s clarify once again the difference between the mean and the median of a series.

To compute the median of a set of numbers, we need to


  1. Sort the numbers in ascending order
  2. a. If there is an odd number of values in the list, the median is just the middle number.
     b. If there is an even number of values in the list, the median is the average of the two middle values.
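The two steps above can be sketched in a few lines of plain Python. This is only a minimal illustration; the function name `median_of` is hypothetical and not part of any library.

```python
def median_of(values):
    """Return the median of a non-empty sequence of numbers."""
    ordered = sorted(values)  # step 1: sort in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    mid = n // 2
    if n % 2 == 1:
        # step 2a: odd count, take the middle value
        return ordered[mid]
    # step 2b: even count, average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([7, 1, 5]))     # 5
print(median_of([7, 1, 5, 3]))  # 4.0
```

In practice, `statistics.median` from the standard library or `numpy.median` does the same job.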

Now, consider a list of numbers in which most of the values are below 50, whereas a very few are really large. The mean of these numbers is

79.54545454545455

whereas the median value is

27.0

So the mean of the set is strongly pulled up by a few large values in the list, which may be outliers (non-representative data points). In contrast, the median is largely unaffected by the presence of outliers.
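To see this effect in code, here is a small sketch using the standard-library `statistics` module. The list `values` is hypothetical: it is not the original data, only an 11-number example chosen so that its mean and median match the values quoted above.

```python
import statistics

# Hypothetical data (an assumption, not the article's original list):
# nine values below 50 plus two very large ones.
values = [12, 15, 18, 22, 25, 27, 30, 33, 41, 300, 352]

mean_value = statistics.mean(values)
median_value = statistics.median(values)
print(f"mean = {mean_value:.2f}, median = {median_value}")
# mean = 79.55, median = 27

# Drop the two large values and recompute both statistics.
trimmed = [v for v in values if v < 50]
trimmed_mean = statistics.mean(trimmed)
trimmed_median = statistics.median(trimmed)
print(f"mean = {trimmed_mean:.2f}, median = {trimmed_median}")
# mean = 24.78, median = 25
```

Removing the two large values moves the mean from about 79.5 to about 24.8, while the median only shifts from 27 to 25.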
