Excel Grouping & Aggregation

Simplify Excel Calculation&Automation
Analytics Vidhya
Published in
4 min readNov 12, 2020

I Common full division

The following 2019 product sales table records sales details of a product for every salesperson. The task is to list the top three salespeople in terms of sales amount.

Below is the Excel data:

The expected result:

Excel data table is useful in handling this task but it is difficult to use. It is useful because the feature can manage to get the aggregates, sort them, find the top three salespeople, and copy the relevant records as the desired result. But it’s inconvenient to handle all operations manually when the original data size is large or the number of to-be-calculated ranks is many.

SPL, however, handles the task in a stepwise way by conforming to the intuitive line of thinking:

After the script is executed, you just need to paste the result to Excel to get the expected table.

A SPL script interacts with Excel through the clipboard. It copies & pastes the to-be-analyzed data to the clipboard, edits the script in esProc designer and executes it, and then pastes the result to Excel.

II Grouping with “other records” combined

Sometimes it’s a part of the original detailed data that we are concerned with, and we just summarize all the other data. Based on Excel data in the previous example, we want to calculate the total sales of each kind of fruits and summarize the sales quantities of all vegetables.

The expected result:

The grouping basis is too specific to be handled using the pivot table. You have to write a macro with VBA to use the program to group data according to specific values. It’s complicated and difficult.

SPL handles such type of grouping using enumerated values:

Then past the clipboard content to Excel to get the desired result.

III Alignment grouping by supplying missing values

Pivot tables will not create groups for values that are missing in the grouping basis. When we group the sales table where there are no sales records for some months by date using pivot tables, we get the following result:

Since certain months don’t have corresponding data, there are no groups for them. The result is not regular. But what we want is this:

Pivot tables are not able to achieve this. And it’s also hard to do this in VBA.

SPL handles this type of by defining a grouping basis:

IV Grouping by intervals

Below is the student score table. The grading scale is like this:≤60 is a C, >60 & < 90 is a B, and ≥90 is an A. The task is to count the students in each level.

The expected result:

After you define the score intervals, you can use pseg function to divide the student scores and then you perform count on each group.

SPL script:

V Grouping by overlapping conditions

The following table lists the GDP of some countries in 2019. The task is to calculate the average GDP for developed countries, developing countries, and the BRICS countries.

2019 GDP figures of some countries (Unit: $billion):

The expected result:

The developing countries and BRICS countries share common values. To handle grouping tasks with overlapping conditions, SPL offers enum function to define enumerated conditions in a flexible way.

SPL script:

--

--

Simplify Excel Calculation&Automation
Analytics Vidhya

Products and resources that simplify hard data processing tasks. If you have any questions, send me a message. https://www.linkedin.com/in/witness998