Pandas Query: the easiest way to filter data

Gustavo Santos
Published in gustavorsantos
4 min read · Jul 16, 2021

Learn 9 code snippets that will enhance your productivity.

Photo by Nathan Dumlao on Unsplash

Pandas Query has been around for quite some time, but it was only earlier this year that I started hearing more about it.

Put simply, df.query() takes a boolean expression (one that evaluates to True or False), checks it against each row of your dataset, and returns only the rows where it is True.

Pandas Query evaluates an expression and returns the dataframe filtered to the rows where it is True.
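
To make that row-by-row evaluation concrete, here is a minimal sketch comparing df.query() with the equivalent boolean mask. The small `scores` dataframe and its column names are made up for illustration and are not part of this article's dataset.

```python
import pandas as pd

# Hypothetical data, used only for this illustration
scores = pd.DataFrame({"player": ["Ann", "Bob", "Cid", "Dee"],
                       "points": [3, 8, 5, 9]})

# The expression is evaluated for every row, giving True/False per row
mask = scores["points"] > 4

# query() keeps only the rows where the same expression is True
filtered = scores.query("points > 4")

# Check whether both approaches select the same rows
print(filtered.equals(scores[mask]))
```

The mask version repeats the dataframe name inside the brackets, while the query version reads closer to a plain sentence, which is where the readability gain comes from.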

The Syntax

The syntax is simple, which helps a lot with code readability and lowers the barrier for beginners. Remember to always write the expression to be evaluated as a string, inside quotes: 'expression'.

df.query('expression')

Before jumping to the code snippets, let’s create a dataframe.

import numpy as np
import pandas as pd

# Dataframe with random integers from 1 to 9
df = pd.DataFrame(np.random.randint(1, 10, size=(5, 5)), columns=list('ABCDE'))
# Insert a column "Name" at the first position
df.insert(0, 'Name', ['Jack', 'John', 'Paul', 'Jack', 'Paul'])
df
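
With a dataframe like this in place, a first query might look like the sketch below. It rebuilds the same structure so it runs on its own. The fixed seed and the threshold of 5 are assumptions added here, chosen only to make the example repeatable.

```python
import numpy as np
import pandas as pd

# Fixed seed (an assumption for this sketch) so the random values repeat
np.random.seed(42)
df = pd.DataFrame(np.random.randint(1, 10, size=(5, 5)), columns=list("ABCDE"))
df.insert(0, "Name", ["Jack", "John", "Paul", "Jack", "Paul"])

# String values need their own quotes inside the quoted expression
jacks = df.query('Name == "Jack"')

# Numeric comparisons are written just as they read
high_a = df.query("A > 5")

print(jacks)
print(high_a)
```

Using single quotes for the expression and double quotes for the string value (or the other way around) keeps the two levels of quoting from clashing.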
