Pivot Tables
Why Pivots?
Imagine when we are asked to find the total price or cost of a product over multiple periods of time. We can do it two ways in Einstein Analytics:
- Using Compare Table: This will mean that we create multiple columns filtered down to each time period. And then if we are using absolute dates, then we will have to edit it every-time there is an increment in a month, quarter, year so on and so forth! Tough to maintain. Right? Also, be warned that if we have a calculated field we are using in this table with filters (say multiple units and each product has to be associated with a specific unit), the filters will no longer be respected in such cases. :(
- Using Pivot Table: This is easy! Start with a compare table with the Product and time period (no filtering of the time period for each column!), perform all the calculations we want and then turn it into a Pivot! Voila — we have the same result as with the Compare Table with no requirement to add new columns every now and then when a new month or quarter is added! And no problems with calculated fields and filters as mentioned in Compare Table any more! Win-Win!!
But what is a Pivot Table? Wikipedia explains the Pivot Table as:
A pivot table is a table of statistics that summarizes the data of a more extensive table (such as from a database, spreadsheet, or business intelligence program). This summary might include sums, averages, or other statistics, which the pivot table groups together in a meaningful way. Pivot tables are a technique in data processing. They enable a person to arrange and rearrange (or “pivot”) statistics in order to draw attention to useful information.
If we go further down on the same page we will find an example similar to what I was explaining above.
In this piece, we will show you how easy it is to Pivot Table in such cases. I personally love Compare Table in most cases and that should be the ideal start point for most lens and dashboards because of the flexibility it provides. But not when we can get what we want in fewer and easier steps.
What problem are we solving?
The Data:
We are using Taxi Data from Chicago (Hi from Chicago!). This city like many others in the US and Europe have Open Data policy. We will be using a very small part of the data from Jan 2017 to Mar 2017. For those interested to know the volume of this data, it is 1.8M+ and hence had to limit it to 3 months! I didn’t want to take a very small (‘000s) of rows and neither did I want to get into 10M. I think the data I have now should be good enough to explain and understand the problems we want to solve in Einstein Analytics for most cases.
Let’s try to find how many taxi rides where taken each week (12) for each neighborhood in Chicago. Simple. Right?
Solution
We will use a Pivot Table as mentioned earlier! I used the dataset creator in Einstein Analytics and uploaded the Taxi Data csv file. After uploading, the dataset details look like this:
Steps:
- Open a new lens with the Taxi_Trips dataset.
- Now from the initial bar chart view which shows the total count of rows, change it to a Compare Table.
- And that now gives us just Count of Rows and just 1 row and 1 column.
- Group the data by the dimensions and date fields we want. In this case, it is Pick Up Community Area and Trip Start Time (Year-Week). Why Year-Week and not just Week? In my case, it doesn’t matter but if we have multiple years then it will calculate all data for all years for that particular week. Not what we want. And we are leaving the count of rows as is because we want to find the number of rides per week.
- Now, all we have do is go back to the Table Mode on the top right of the lens and change it to Pivot Table and there we have it!
- Note that whatever is the last group by will be transposed to the columns. We have added Dropoff Community Area to the above we can see that Now the columns are Dropoff Community Area instead of Trip Start Time.
- Now we re-arranged the Group By so that Trip Start Time is the last group by. This change makes the Trip Start Time as the columns. This means that whatever is the last group by will be the column in the Pivot Table
Final Note
We used a very simple example here to show how Pivot Table works and we can see that getting the same result using Compare Table will be more work and difficult (very possible though). If we have calculated fields we would like to see in the Pivot Table, use Compare Table to perform all our calculations and then convert it into a Pivot.
Summer ‘20
Since Summer ’20 release, we can have more than one measure for pivot tables. These columns respect the individual column filters and we can now apply conditional formatting on these columns.