Funnel Enlightenment

A few steps you can take to get more out of funnel visualizations

Jay Sobel
Nightingale
6 min read · Mar 31, 2020


Until recently, I was the data analyst at Wanderu, the leading bus and train ticket aggregator/ecommerce site in North America. Last week I was laid off along with about half the team.

There were a few challenges, but the show stoppers were the coronavirus combined with a dramatic drop in oil prices — our true competitor is the automobile.

I’ll save my railing against the auto industry for another article (spoiler: the US is no. 3 in freight rail but no. 34 in passenger rail by modal share).

In this post I’m following up on Elijah Meeks’s “We Live in a World of Funnels” with some of the business intelligence solutions I developed at Wanderu. They are not fancy back-tracing loop-grappling flow diagrams, just humble improvements on the standard ecommerce funnel.

When I arrived at Wanderu we had something like this (and yes, I ripped that background color right off of the “The Italians” comic).

Definitely a funnel. Would be great for channeling a liquid into a small hole.

This is actually a visualization option in Looker. Maybe they were looking out for some analyst who was hacking this together with a stacked bar chart and a layer of fake transparent bars to float those business-blues into center court.

In any case, the first basic improvement we can make is to use an actual bar chart. I’m switching into Mode Analytics now because I’m unemployed and it’s a freemium BI tool with CSS editing on dashboards (so I can ace the Italian background color every time).

Just like Grandma used to make

By left-aligning the bars we get a much clearer sense of the successive fall-off of users. This version also allows us to show multiple sets of bars for comparison. We hate to see a dimension go unused, so I’ve split the data into two groups in an imaginary ecommerce split test.

Our test looks good because the yellow bars are longer in the middle, right? But why is the (literal) bottom line unchanged at 13%?

The problem with this chart is that we are still only indirectly showing the information that our reader — a product manager or executive — wants to know.

We can directly surface more useful information if we have a better understanding of why we are visualizing our ecommerce site as a funnel in the first place.

A linear funnel has two key attributes:

With all other things held constant …

  1. An X% increase at the top of the funnel (e.g. in traffic) yields an X% increase in output (e.g. revenue).
  2. An X% increase in advancement in the middle of the funnel yields an X% increase in output.

Concretely, if we have 1,000 units of traffic flowing through a three-step funnel and an average sale of $10 at the end, our funnel can be expressed as:

1000 * [Step-1 % adv] * [Step-2 % adv] * [Step-3 % adv] * $10

Our overall conversion rate (output units / input units) is the three step advancement rates in the middle multiplied together.

Conversion Rate = [Step-1 % adv] * [Step-2 % adv] * [Step-3 % adv]
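As a minimal sketch of that equation, the Python below multiplies traffic, step rates, and average sale, then checks both key attributes. The function name `funnel_output` and the three step rates are assumptions made up for illustration; only the 1,000 units of traffic and the $10 average sale come from the example above.

```python
def funnel_output(traffic, step_rates, avg_sale):
    """Revenue from a linear funnel: traffic * product of step rates * average sale."""
    output = traffic * avg_sale
    for rate in step_rates:
        output *= rate
    return output


# Hypothetical advancement rates for the three steps (not from the article).
rates = [0.5, 0.4, 0.25]
base = funnel_output(1000, rates, 10.0)

# Attribute 1: +10% traffic at the top of the funnel.
more_traffic = funnel_output(1100, rates, 10.0)

# Attribute 2: +10% advancement at the middle step (0.40 -> 0.44).
better_step = funnel_output(1000, [0.5, 0.44, 0.25], 10.0)

print(f"baseline revenue: {base:.2f}")
print(f"ratio with +10% traffic: {more_traffic / base:.3f}")
print(f"ratio with +10% mid-funnel advancement: {better_step / base:.3f}")
```

Both ratios come out the same because every term in the equation is a plain multiplier, so where in the funnel the 10% lands does not matter.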

Our basic funnel chart does not reflect these key attributes

The basic funnel shrinks deeper steps into smaller-and-smaller bars because the relative scaling of the chart is set for the total input into the funnel. In other words, the total input is always the denominator.

This also means that a 10% gain towards the end of the funnel is visually much smaller (fewer pixels) than a 10% gain near the top despite being equal in impact.

Another big issue is that changes in traffic compound. Because every bar shares one denominator (initial traffic), +10% traffic arriving at Step 2 should lead to +10% traffic arriving at Step 3. We really want to know if that doesn’t happen, because it means one of those [Step-X % adv] values has changed negatively, which is a valuable insight into the test’s performance.
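To make the pixel problem concrete, here is a rough sketch of how the basic funnel sizes its bars. The step counts are hypothetical, chosen only so the last bar lands at 13%, and `bar_lengths` is an illustrative helper rather than anything from a BI tool.

```python
def bar_lengths(step_counts):
    """Each step as a share of the top-of-funnel count (the shared denominator)."""
    top = step_counts[0]
    return [count / top for count in step_counts]


# Hypothetical step counts; the final step is 13% of the initial traffic.
counts = [1000, 500, 200, 130]
lengths = bar_lengths(counts)

# Apply the same +10% relative gain to every step and measure how far each bar moves.
gain = 0.10
growth = [length * gain for length in lengths]

for step, (length, delta) in enumerate(zip(lengths, growth), start=1):
    print(f"Step {step}: bar = {length:.1%}, a +10% gain moves it by {delta:.1%} of the axis")
```

Under these assumed counts, the same relative gain moves the second bar by 5% of the axis but the last bar by only 1.3%, even though both gains are worth the same in output.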

Moving to Pairwise

In the next chart I’ve implemented a “Pairwise” funnel of the same data. It’s no longer a literal funnel, but it shares the underlying concept of our website split into a series of discrete linear steps. Then it sticks these sequential steps together into pairs.

The Y-axis is now the percent of users who advanced between each pair of sequential steps.

Notice that we can now see a negative change in the final stage of our funnel. This could have been inferred from the earlier charts (up in the middle, flat at the bottom), but it was not visible.

The end of the funnel is no longer super small, and the gains in Pair 3 are not related to the gains in Pair 2 because they are using separate denominators.
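The pairwise calculation itself is small. The sketch below assumes step counts are available per test group; the counts and the `pairwise_rates` name are hypothetical and are not taken from the Mode dashboard.

```python
def pairwise_rates(step_counts):
    """Share of users who advance from each step to the next one."""
    return [nxt / prev for prev, nxt in zip(step_counts, step_counts[1:])]


# Hypothetical split-test counts: both groups finish at 13% of their traffic.
control = [1000, 500, 200, 130]
test = [1000, 550, 240, 130]

control_rates = pairwise_rates(control)
test_rates = pairwise_rates(test)

for pair, (c, t) in enumerate(zip(control_rates, test_rates), start=1):
    print(f"Pair {pair}: control {c:.1%}, test {t:.1%}")
```

Each pair uses the preceding step as its own denominator, so with these assumed counts the drop in the last pair shows up directly instead of hiding behind an unchanged bottom line.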

There is still the problem of relative scaling. A 10% increase in Pair 1 and a 10% increase in Pair 2 are not equal in pixels, though they are much closer than in the original attempt.

To render that key attribute precisely (equal % changes having equal impact), we need to make that impact the shared relative axis. The chart has a brother.

Some day a BI tool will automatically highlight Y=0 on floating Y-axis charts and instantly be worth $100B.

Woo! This chart actually leverages the discrete linear nature of the “ecommerce funnel” to isolate changes between two groups and scale them so their visual size is equal to their impact in the conversion equation.
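One plausible way to compute the values behind a chart like this is the relative change in each pair's advancement rate between the two groups. This is a sketch under that assumption, not necessarily the exact calculation in the dashboard, and the counts and function names are hypothetical.

```python
import math


def pairwise_rates(step_counts):
    """Share of users who advance from each step to the next one."""
    return [nxt / prev for prev, nxt in zip(step_counts, step_counts[1:])]


def relative_lifts(control_counts, test_counts):
    """Relative change in each pair's advancement rate, test versus control."""
    control_rates = pairwise_rates(control_counts)
    test_rates = pairwise_rates(test_counts)
    return [t / c - 1 for c, t in zip(control_rates, test_rates)]


# Hypothetical split-test counts: both groups finish at 13% of their traffic.
control = [1000, 500, 200, 130]
test = [1000, 550, 240, 130]

lifts = relative_lifts(control, test)

# Compounding the per-pair lifts reproduces the change in overall conversion.
overall_change = math.prod(1 + lift for lift in lifts) - 1

for pair, lift in enumerate(lifts, start=1):
    print(f"Pair {pair}: {lift:+.1%}")
print(f"Overall conversion change: {overall_change:+.1%}")
```

Because each lift is a multiplier in the conversion equation, equal bar heights mean equal impact. In this made-up data, the gains in the first two pairs are cancelled by the loss in the last pair, which is why the overall rate does not move.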

It’s also starting to reach an uncomfortable degree of abstraction. It shouldn’t be the first chart on a dashboard; it needs buddy-charts (like the preceding one) to provide more literal context and definitely an emotional-support big-number. In the right context, this chart can be a great go-to for A/B testing over a funnel. It creates the kind of confusion that encourages a deeper look rather than eroding trust or deflecting interest.

And look at those experiment numbers! Small gains throughout the funnel but a drop-off at checkout. It reads like an executive summary: we are seeing improvements mid-funnel but need to work on checkout to lock in those earlier gains. Is this influencing our abandon-cart campaign numbers?

Here are our two Pairwise Funnels side-by-side as they might be found in the wild. And a link to the dashboard in Mode if you’d like to read my SQL or steal my beautiful Italian High Chart CSS (the bare minimum to get the screengrabs).

I originally implemented these charts in Looker. I could tell you how I did it, but I’d have to kill you. Or my former employer would have to kill me? I don’t know, I should have read the NDA more closely.

The crux of doing this in Looker (without doing a whole new view) is to create a dimension with a bunch of fake case outputs that can be selected in tandem with ‘fill in missing values’ to generate rows (up to the # of cases or row limit).

Then you use their Excel-like Calculations to populate the pair labels into those rows and further Calculations to place the desired values (which you’ve selected the good old fashioned way) next to their labels.

Thanks for reading! Please hire me!

Credit to Susie Lu for these two comics that now inform my existence.


Data viz from a closet battle station in Boston, MA