Automatically create a pivot table in Redshift

John Mastro
Feb 20, 2019 · 3 min read

The open-source repo for this tool can be found here.

Creating pivot tables is a relatively common need, as seen by its popularity in software like Microsoft Excel and Google Sheets. We built a small tool to create pivoted database tables. It can be used as a standalone tool or integrated into ETL pipelines.

Here’s an example of what a simple pivot table transformation might look like:

Image for post
Image for post


At Ro, our business and operations are growing quickly in terms of both scale and complexity. We launched at the end of October 2017 with a single brand (Roman) treating a single condition (erectile dysfunction), but in the last few weeks we’ve added a new brand (Zero, for smoking cessation) and several conditions under Roman (hair loss, cold sores, genital herpes, and premature ejaculation), with more on the way.

For the data team, this means our data warehouse needs to support business units and analysts who want to look at data through a variety of lenses. They may want to focus on the overall company, on a particular medical condition, a particular brand, or some cross-cutting category (all prescription medicines vs OTC products). Pivot tables are one important tool for making it but easy and efficient for them to ask and answer the questions they’re interested in.

For context, our data warehouse is in Amazon Redshift, and our solution is currently Redshift-specific (though it would be straightforward to adapt it to most other databases).

The usual way

At the database level, the most common way to write pivot queries is probably to hand-write chains of CASE statements. This is unsatisfactory because it’s labor-intensive, error-prone, and needs to be updated by hand every time a new column is added or a new case becomes possible.

Alternatively, pivoting may be done outside the database, using a spreadsheet program or a business intelligence tool like Looker. This doesn’t help if it’s important to have the pivoted data in the database, whether for performance reasons or because it’s used further in your data pipeline.

An alternative

We built a simple way to generate pivot tables (or just pivot queries), which we’ve posted on GitHub here. The README discusses how to use it, including some examples.

The implementation strategy is to take the list of pivot columns, find every existing combination of values in those columns, and then generate appropriate pivoted columns. An aggregation function and default value can be specified individually for each pivoted column, or the script will attempt to guess reasonable defaults based on the column’s name (i.e. SUM will be used for a column whose name starts with `total_`).

The result is that we can declaratively specify pivot tables and have them automatically regenerated. The changes necessary to support new columns are either non-existent or minimal and localized.

Other alternatives

Spreadsheets and business intelligence tools can create pivot tables, but that’s not always a replacement for having an in-database pivot table.

Some databases include related functionality (like PostgreSQL’s crosstab table function or Microsoft SQL Server’s PIVOT feature).

The popular tool dbt also includes a macro for creating pivot tables in the dbt-utils package.

Ro Data Team Blog

Ro Data Team Blog: data analytics, data engineering, data…

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store