How to add subtotals to multiple rows of a pivot table with pandas

Paul Vella
Ones & Zeroes
Published in
4 min readSep 13, 2020

--

I’ve been using Python’s pandas library to summarise large datasets. And while it can easily summarise the data, I am missing totals in my tables. The totals are useful when examining the data to understand what proportion each split of the data represents.

For example, consider this output:

Summary statistics for data split by 3 dimensions

What if I want to see the totals for each item in ‘food’, each item in ‘name’ and each item in ‘city’? The pivot table does not include these, so we have to add them ourselves.

In this post I’ll show you how I managed to add subtotals.

Here’s my very small dataset for training purposes:

I know, it’s inherently meaningless data, but once you know how to add subtotals, you can easily apply them to very large datasets.

Run the initial pivot table by all dimensions

The first step is to call pandas and read the csv data into Python:

Next, I run a pivot table that calculates as many descriptive statistics about the ‘rating’ variable as I desire: count, unique count, sum, min, max, average, standard deviation, standard error, median, MAD, variance, and skewness. We probably only really care about count, sum, and average but having all the…

--

--