📈Python for finance series

Identifying Outliers

How to find and visualize outliers in your dataset with Pandas

Ke Gui
Published in The Startup
4 min read · Jul 25, 2020


Photo by Dave Gandy

Warning: There is no magic formula or Holy Grail here, though it may open the door to a new world for you.

📈Python For Finance Series

  1. Identifying Outliers
  2. Identifying Outliers — Part Two
  3. Identifying Outliers — Part Three
  4. Stylized Facts
  5. Feature Engineering & Feature Selection
  6. Data Transformation
  7. Fractionally Differentiated Features
  8. Data Labelling
  9. Meta-labeling and Stacking

In Part One and Part Two, I used the mean and standard deviation (std) to set the outlier boundary. Here we are going to use the Exponential Moving Average (EMA) as the boundary. The method is the same as before; the difference lies in how the EMA mean and std are calculated.
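To make the idea concrete, here is a minimal sketch of an EMA-based boundary built with the pandas `ewm` method. It is not necessarily the exact code used in this series: the function name `ema_outliers`, the defaults `span=21` and `n_sigmas=3`, the one-step shift of the band, and the synthetic returns are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def ema_outliers(series: pd.Series, span: int = 21, n_sigmas: float = 3.0) -> pd.DataFrame:
    """Flag values outside an EMA mean +/- n_sigmas * EMA std band."""
    ewm = series.ewm(span=span)
    # Shift by one step so each value is compared with a band built
    # only from earlier observations (an assumption of this sketch).
    ema_mean = ewm.mean().shift(1)
    ema_std = ewm.std().shift(1)

    result = pd.DataFrame({"value": series, "ema_mean": ema_mean, "ema_std": ema_std})
    result["upper"] = ema_mean + n_sigmas * ema_std
    result["lower"] = ema_mean - n_sigmas * ema_std
    # Comparisons against NaN (the warm-up rows) evaluate to False.
    result["outlier"] = (series > result["upper"]) | (series < result["lower"])
    return result


# Synthetic daily returns (made up for illustration) with one injected spike.
rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.0, 0.01, 250))
returns.iloc[200] = 0.20

flagged = ema_outliers(returns)
print(flagged.loc[flagged["outlier"], ["value", "lower", "upper"]])
```

The one-step shift is a design choice of this sketch: without it, a large spike also inflates the EMA mean and std it is tested against, which makes it harder to flag. Drop the two `.shift(1)` calls if you prefer a band that includes the current observation.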

1. Data preparation

Again, we reuse the same dataset as before. The following code snippet will get you started. If you have been following this series, you can skip this part and jump straight to the next section.


Ke Gui
The Startup

An ordinary guy who wants to be the reason someone believes in the goodness of people. He lives in Brisbane, Australia, with a lovely backyard.