Selecting Subsets of Data in Pandas: Part 2

Published in

Dunder Data

20 min read · Dec 8, 2017

Part Two: Boolean Indexing

This is part two of a four-part series on how to select subsets of data from a pandas DataFrame or Series. Pandas offers a wide variety of options for subset selection, which is why the material is spread across multiple articles. This series is broken down into the following four topics.

Learn More

Master Data Analysis with Python is an extremely comprehensive text with over 80 chapters, 500 exercises, and video lessons to help you become an expert.

Part 1 vs Part 2 subset selection

Part 1 of this series covered subset selection with [], .loc and .iloc. All three of these indexers use either the row/column labels or their integer location to make selections. The actual data of the Series/DataFrame is not used at all during the selection.
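As a quick recap of the Part 1 approach, here is a minimal sketch. The DataFrame, its labels, and its values are made up for illustration and do not come from the original article.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
df = pd.DataFrame(
    {"age": [30, 25, 41], "state": ["TX", "NY", "CA"]},
    index=["Ann", "Bob", "Cy"],
)

by_column = df["age"]            # [] selects a column by its label
by_label = df.loc["Bob", "age"]  # .loc selects by row/column labels
by_position = df.iloc[1, 0]      # .iloc selects by integer location

print(by_column)
print(by_label, by_position)
```

Both `.loc["Bob", "age"]` and `.iloc[1, 0]` point at the same cell here. Neither selection looks at the value stored in that cell, only at where it sits.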

In Part 2 of this series, on boolean indexing, we will select subsets of data based on the actual values of the data in the Series/DataFrame and NOT on their row/column labels or integer locations.
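To make the contrast concrete, here is a small sketch of value-based selection. The DataFrame and the threshold of 28 are hypothetical, chosen only to illustrate the idea.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
df = pd.DataFrame(
    {"age": [30, 25, 41], "state": ["TX", "NY", "CA"]},
    index=["Ann", "Bob", "Cy"],
)

# Comparing a column to a value yields a boolean Series,
# one True/False per row, aligned on the index
over_28 = df["age"] > 28

# Passing the boolean Series to [] keeps only the rows where it is True
older = df[over_28]

print(over_28)
print(older)
```

The rows that come back depend entirely on the data: change the ages and a different set of rows is selected, even though no label or position in the code changed.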

Documentation on boolean selection

I always recommend reading the official documentation in addition to this tutorial when learning about boolean…

Written by Ted Petrou