Using Waterfall Charts in Python to analyze Programmatic Advertising Revenue change.

3 min readSep 7, 2021

When presenting to an executive audience, one of the most powerful ways to provide insights into what drives change in year-over-year revenue performance is the Waterfall (aka Bridge) Chart.

When building Waterfall Charts for ad revenue generated on news media websites, there were three key considerations:

1.) What are the key drivers behind year-over-year (YoY) revenue performance? In this case, it was Page Views (PV), Impressions per PV (Imp/PV), and CPM. It was also important to break out by device and content type.

2.) Finding the right equation to compare year-over-year change.

Total YoY Revenue change: rev change driven by PV + rev change driven by Imp/PV + revenue change driven by CPM (which ties.)
rev change driven by PV: (PV ty — PV last year) * last year revenue per PV
rev change driven by Imp/PV: (Imp/PV this year — Imp/PV last year) * CPM last year * PV this year/1000
rev change driven by CPM: (CPM this year — CPM last year) * impressions this year/1000

3.) Will this process need to be easily reproducible? As is often the case, the answer was yes! This report would need to be generated on a weekly basis and would need to be run across several websites. Hence, we went to Python Pandas and Plot.ly for a solution.

Coding

Coding the formulas for consideration 2 above:

#Assign calculated columns
    grpd['RPM'] = grpd.revenue/grpd.page_views*1000
    grpd['CPM'] = grpd.revenue/grpd.filled_impressions*1000
    grpd['imp/pv'] = grpd.filled_impressions/grpd.page_views#Calculate waterfall chart values
        pv_impact = (segment_ty.page_views.values - segment_ly.page_views.values)*segment_ly.RPM.values/1000
        impperpv_impact = (segment_ty['imp/pv'].values - segment_ly['imp/pv'].values)*segment_ly.CPM.values * segment_ty.page_views/1000
        cpm_impact = (segment_ty.CPM.values - segment_ly.CPM.values)*segment_ty.filled_impressions.values/1000
        chart_data[group] = [float(pv_impact), float(impperpv_impact), float(cpm_impact)]

Iteratively generate Plotly charts by site:

# Push data into plotly formatfor market in data.market.unique():
    #Create df for chart building
    subset = data[data.market==market]
    subset = subset[['year','device','is_slideshow','page_views','filled_impressions','revenue']]
    grpd = subset.groupby(['year','device','is_slideshow'], as_index=False)
    grpd = grpd.sum()
    
    #Assign calculated columns, **code above**#Create key column for iterating
    grpd['group'] = grpd.device + ' - ' + grpd.is_slideshow
    
    chart_data = {}
    for group in grpd.group:
        segment_changes = {}
        segment = grpd[grpd.group==group]
        segment_ly = segment[segment.year==ly]
        segment_ty = segment[segment.year==ty]#Calculate waterfall chart values, **code above**#turn results in DataFrame
    chart_df = pd.DataFrame(chart_data, index=['Page Views','Impressions Per Page View','CPM'])#calculate start and end total
    ly_rev, ty_rev = grpd.groupby('year').revenue.sum().values#flatten data so it can be plotted
    flat_data = list(chart_df.values.flatten())#add start and end totals to flattened data 
    flat_data.insert(0,ly_rev)
    flat_data.append(ty_rev)#format flattened data as currency
    flat_data_txt = ["${:,.0f}".format(x) for x in flat_data]#create waterfall chart
    fig = go.Figure(go.Waterfall(
        orientation = "v",
        measure = ['absolute','relative','relative','relative','relative','relative','relative','relative','relative','relative','relative','relative','relative','absolute'],
        x = [['Total','Page Views','Page Views','Page Views','Page Views','Impressions Per Pageview','Impressions Per Pageview','Impressions Per Pageview','Impressions Per Pageview','CPM','CPM','CPM','CPM', 'Total '],
            [ly,'Desktop Non-SS','Desktop-SS','Mobile Non-SS','Mobile SS','Desktop Non-SS','Desktop-SS','Mobile Non-SS','Mobile SS','Desktop Non-SS','Desktop-SS','Mobile Non-SS','Mobile SS',ty]],
        textposition = "outside",
        text = flat_data_txt,
        y = flat_data,
        connector = {"line":{"color":"rgb(63, 63, 63)"}},
    ))fig.update_layout(
        title = '{} YoY Revenue Change'.format(market),
        showlegend = False,
        width=850, height=600,
        uniformtext_minsize=9, uniformtext_mode='hide'
    )
    
    if not os.path.exists('Automated Waterfall Charts'):
        os.mkdir('Automated Waterfall Charts')
    fig.write_image('Automated Waterfall Charts/{}.pdf'.format(market))

Merge individual charts into a single pdf:

#Merge individual charts into single PDF
def merger(output_path, input_paths):
 pdf_writer = PdfFileWriter()
 
 for path in input_paths:
 pdf_reader = PdfFileReader(path)
 for page in range(pdf_reader.getNumPages()):
 if page == 0:
 pdf_writer.addPage(pdf_reader.getPage(page))
 
 with open(output_path, ‘wb’) as fh:
 pdf_writer.write(fh)paths = [‘Automated Waterfall Charts\\All Web Sites — Broken Out.pdf’,
 ‘Automated Waterfall Charts\\Site1.pdf’,
 ‘Automated Waterfall Charts\\Site2.pdf’,
 ‘Automated Waterfall Charts\\Site3.pdf’,
 ‘Automated Waterfall Charts\\Site4.pdf’,
 ‘Automated Waterfall Charts\\Site5.pdf’]
merger(‘Automated Waterfall Charts/{} MTD Waterfall Charts — Broken Out.pdf’.format(pd.to_datetime(‘now’).normalize().strftime(‘%Y-%m-%d’)), paths)

In the end, six different reports are autogenerated in PDF, one for each site as well as a single pdf containing all sites!

The example below shows that we can estimate an exact dollar amount to the change driven by each subcategory.

Looking at the breakdown of site 1, you can see that the losses in CPM were more than offset by the gains in page views and impressions per page view, resulting in a lift in total revenue. — Based on the waterfall chart above, we can determine that growth in revenue from page views and impressions per pageview were able to more than offset the steep declines in CPM.

For the full code and source data, please visit https://github.gatech.edu/jhopper6/waterfall_demo.

Using Waterfall Charts in Python to analyze Programmatic Advertising Revenue change.

Written by James Hopper