How to Filter a Pandas DataFrame With a Multi-Level Column Index: Stocks

Stephen McNeal
Geek Culture
Published in
2 min readJan 17, 2022

Filtering a DataFrame from yahoo finance with a multi-level column index.

Photo by Patrick Weissenberger on Unsplash

What is a Multi-Level Column Index?

A Multi-Level Column Index is a hierarchical way of defining dataframe column labels. A column is able to exist within multiple column labels.

Getting the Data

We’ll practice filtering a dataframe from Yahoo Finance. To do this we first need to retrieve the data.

We’ll retrieve data from Tesla and Apple for the first day of 2015 to the first day of 2022.

pip install yfinanceimport yfinance as yfdf = yf.download("TSLA AAPL", start="2015-01-01", end="2022-01-01",group_by="ticker")print(df)
Image by author.

This query using yfinance has given us a dataframe with two sets of column labels. The top column label defines the name of the stock ticker: AAPL and TSLA. The second column labels define the Open, High, Low, Close, Adjusted Close, and Volume for their respective top label.

Now for the filtering.

Lets say that we only want to see rows where Apple’s close price is above $100.

df2 = df[df.AAPL[‘Close’] > 100]
print(df2)
Image by author.

This line would read:

The dataframe where data under the “AAPL” column label in the “Close” column is greater than 100.

Notice that we used a “.” to navigate the first column label.

Now let’s try to filter ‘TSLA’ at the same time as we filter ‘AAPL’.

df3 = df[(df.AAPL['Close'] > 100) & (df.TSLA['Volume'] > 40000000)]print(df3)

This line would read:

The dataframe where:
Data under the “AAPL” column label in the “Close” column is greater than 100.
AND
Data under the “TSLA” column label in the “Volume” column is greater than 30000000.
Image by author.

Hopefully this tutorial saved you some time.

Thanks for reading and be sure to follow for more data science tips!

--

--