Minimally Sufficient Pandas

Published in Dunder Data

28 min read, Jan 30, 2019

In this article, I will offer an opinionated perspective on how to best use the Pandas library for data analysis. My objective is to argue that a small subset of the library is sufficient to complete nearly all of the data analysis tasks you will encounter. This minimally sufficient subset will benefit both beginners and professionals using Pandas. Not everyone will agree with the suggestions I put forward, but they are how I teach and how I use the library myself. If you disagree or have suggestions of your own, please leave them in the comments below.

By the end of this article you will:

Know why limiting Pandas to a small subset will keep your focus on the actual data analysis and not on the syntax
Have specific guidelines for taking a single approach to completing a variety of common data analysis tasks with Pandas

Learn More

Master Data Analysis with Python is an extremely comprehensive text with over 80 chapters, 500 exercises, and video lessons to help you become an expert.

Pandas is Powerful but Difficult to Use

Pandas is the most popular Python library for doing data analysis. While it does offer quite a lot of functionality, it is also regarded as a fairly difficult library to learn well. Some reasons for this include:

There are often multiple ways to complete common tasks
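
As a small illustration of that point, the sketch below completes two common tasks in two ways each and checks that the results match. The toy DataFrame and its column names are made up for this example and do not come from the article; `isnull` is an alias of `isna` in Pandas.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
df = pd.DataFrame({"name": ["Ana", "Ben", "Cy"], "score": [90.0, None, 75.0]})

# Task 1: select a single column as a Series, two ways
by_brackets = df["score"]
by_attribute = df.score
same_selection = by_brackets.equals(by_attribute)

# Task 2: flag missing values, two method names for the same operation
same_missing = df["score"].isna().equals(df["score"].isnull())

print(same_selection, same_missing)
```

Both comparisons report that the two spellings return identical results, so the choice between them is one of style rather than behavior.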

Written by Ted Petrou