The Use Of Data Science For Optimizing Programmatic Advertising

7 min read · Aug 15, 2018

Sample BuzzFeed websites displaying programmatic ads

This is the 10th week of my data science internship at BuzzFeed, so I am writing this post to reflect on my summer. It has been a fruitful internship on the Ads Team, where I took my first step into advertising from the publisher’s side and applied data science techniques to optimize programmatic advertising revenue.
To summarize my internship, I decided to publish this technically focused post, which introduces the problem, the methodology, and the results, and discusses open issues in my project.

Now let’s start! 🚀🚀🚀

What Is The Goal Of This Project?

The ultimate goal of this project is to optimize BuzzFeed’s programmatic ads revenue💰. To achieve this goal, there are three questions to answer:

What are the key factors that drive ad value?
How can we make a basic prediction of ad prices?
How can we plan a business strategy to optimize ad revenue?

Problem Formulation

1. Conduct an exploratory analysis of the factors that drive the effective Cost Per Mille (eCPM) prices of programmatic ads. eCPM is the metric I used to value ads.
2. Based on the conclusions of the exploratory analysis, fit machine learning models that predict eCPM prices.
3. Based on the conclusions from (1) and the predictive models from (2), propose a plan to optimize programmatic revenue.

What Did I Do With the Programmatic Ads Data?

1. Study Programmatic Advertising

What Are Programmatic Ads?

“The use of software to purchase digital advertising, as opposed to the traditional process that involves RFPs, human negotiations, and manual insertion orders. It’s using machines to buy ads, basically.” — Definition from Digiday

Sample Ad placements on BuzzFeed’s pages (ref. to blog)

BuzzFeed, as a publisher, is on the seller’s side of the online advertising market. Ad slots on publishers’ pages are available for advertisers (buyers) to bid on through a real-time “online auction” system. In general, the advertiser with the highest bid wins the impression for an ad slot, and that advertiser’s campaign is then displayed to the audience through the webpage or app. Publishers get paid for running programmatic ads on their pages, and they want digital marketing buyers to pay higher prices for those ads.

2. Google Ad Exchange Data

Data Source & Location

In this analysis, the programmatic ads data is part of EBDA and was acquired from DFP Adx (Google Ad Exchange). The diagram below locates this data within BuzzFeed’s broader ads framework:

Location of the data in a partial picture of BuzzFeed’s Ads products

Data Facts

Data are time-sequenced by day
The feature space consists of dimensions and metrics selected from DFP. Dimensions include Ads types, Ads locations, Ads sources, Advertiser verticals, DFP Ad Units, Creative sizes, and Days. Metrics are Ad requests, Matched requests, and Ad eCPM ($), where Ad eCPM is calculated as Revenue / Ad impressions * 1000. Features were selected after interviewing professionals and reviewing the literature
The majority of features are categorical
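As a minimal sketch of the eCPM formula above, the snippet below computes it for a few rows with pandas. The column names and all of the numbers are made up for illustration and are not taken from the DFP data.

```python
import pandas as pd

# Hypothetical rows: column names and values are illustrative assumptions.
report = pd.DataFrame({
    "ad_type": ["Rich media", "Video", "Image"],
    "revenue": [120.0, 45.0, 9.0],            # USD
    "ad_impressions": [60000, 9000, 12000],
})


def ecpm(revenue, impressions):
    """Effective cost per mille: revenue earned per 1,000 impressions."""
    return revenue / impressions * 1000


report["ad_ecpm"] = ecpm(report["revenue"], report["ad_impressions"])
print(report)
```

The same function works on scalars and on whole columns, because pandas applies the arithmetic element-wise.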

DFP Data Limitations

The DFP API allows a maximum of 10 dimensions per query request
Metrics are auto-aggregated by the DFP API query tools, and the aggregation depends on which dimensions are selected. Hence, it is not possible to combine datasets to enlarge the feature space, which limits the number of dimensions that can be included in this analysis
Lack of data. Some features cannot be captured, such as ad density, audience features, and hourly seasonality

3. Data Cleaning, Feature Engineering & Diagnosis

Data Cleaning

Data downloaded from DFP is intended for basic business analysis in Excel, so using it for deeper analysis in Pandas requires data conversion and cleaning. For example, I converted currency-formatted values into numeric values and converted textual dates into datetime format. I also checked for missing values and found that the majority of the data in Ad Locations is ‘unknown’ (reasons).
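The conversions described above might look roughly like the following in pandas. This is only a sketch: the column names, the currency format, and the date format are assumptions, since the actual export layout is not shown here.

```python
import pandas as pd

# Hypothetical export rows: names and formats are assumptions for illustration.
raw = pd.DataFrame({
    "Days": ["Aug 01, 2018", "Aug 02, 2018", "Aug 03, 2018"],
    "Ad eCPM ($)": ["$1.25", "$1,030.50", "$0.98"],
    "Ad locations": ["unknown", "Above the fold", "unknown"],
})

clean = raw.copy()

# Strip the currency symbol and thousands separators, then cast to float.
clean["Ad eCPM ($)"] = (
    clean["Ad eCPM ($)"]
    .str.replace(r"[$,]", "", regex=True)
    .astype(float)
)

# Parse the textual dates into datetime values.
clean["Days"] = pd.to_datetime(clean["Days"], format="%b %d, %Y")

# Share of rows whose ad location is the placeholder 'unknown'.
unknown_share = (clean["Ad locations"] == "unknown").mean()

print(clean.dtypes)
print(f"unknown ad locations: {unknown_share:.0%}")
```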

Feature Engineering & Feature Diagnosis

Given the constraint imposed by the API, feature engineering becomes especially necessary in this analysis. I started with a feature space of 9 dimensions. After feature engineering, there were 62 features.

To make sure there were no redundant features in the feature set, I ran a feature diagnosis by calculating the correlation matrix, which helped eliminate 13 highly correlated features.
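One common way to run this kind of diagnosis is to scan the upper triangle of the absolute correlation matrix and drop one feature from each highly correlated pair. The sketch below assumes a threshold of 0.9 and uses synthetic columns with hypothetical names; the cutoff and procedure actually used in the project are not stated here.

```python
import numpy as np
import pandas as pd


def drop_highly_correlated(features, threshold=0.9):
    """Drop one feature from every pair whose |Pearson r| exceeds threshold."""
    corr = features.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return features.drop(columns=to_drop), to_drop


# Synthetic demo: the second column is a rescaled copy of the first.
rng = np.random.default_rng(0)
width = rng.uniform(100, 1000, size=200)
height = rng.uniform(50, 600, size=200)
demo = pd.DataFrame({
    "creative_width": width,
    "creative_width_scaled": width / 1000,
    "creative_height": height,
})

reduced, dropped = drop_highly_correlated(demo)
print("dropped:", dropped)
print("kept:", list(reduced.columns))
```

Because only the later column of each pair is dropped, the result depends on column order, so it is worth ordering columns so that the preferred feature comes first.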

4. Modeling

Training Data: After feature engineering and diagnosis, 49 variables remain in the feature set (the 62 engineered features minus the 13 eliminated). Target Variable: Ads eCPM prices 💰, a non-negative continuous variable.

Four models from sklearn, statsmodels, and Keras were applied to predict eCPM prices and identify the key factors driving ad prices:

Model 1 (baseline model): Ordinary Least Squares

Metric: MSE = 0.839 (terrible)
Comments: The features show collinearity, and their relationship with the target is nonlinear. There are also too many variables for a simple linear regression model, so regularization is necessary. Thus, I moved on to Lasso Regression.

Model 2: Lasso Regression

Metric: MSE = 0.832 (improved slightly, but still bad)
Comments: Even after regularization, the model still doesn’t perform well. This results from the nonlinearity in the data, so non-linear models should fit better. However, because fewer features end up in the model, training and parameter tuning took less time. Lasso’s L1 penalty shrinks some coefficients to exactly zero, so this model also helps with feature selection: the variables with non-zero coefficients have more influence on the target variable.
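To illustrate how Lasso doubles as a feature selector, here is a small sketch on synthetic data using scikit-learn. The data, the `alpha` value, and the train/test split are all assumptions chosen for the demo; they are not the project’s settings, and the printed MSE values will not match the ones reported above.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data: only the first two of ten features affect the target.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

ols = LinearRegression().fit(X_train, y_train)
lasso = Lasso(alpha=0.1).fit(X_train, y_train)  # alpha is an assumed value

print("OLS MSE:  ", mean_squared_error(y_test, ols.predict(X_test)))
print("Lasso MSE:", mean_squared_error(y_test, lasso.predict(X_test)))

# Features whose coefficients survived the L1 penalty.
selected = np.flatnonzero(lasso.coef_)
print("features kept by Lasso:", selected)
```

In practice `alpha` would be tuned, for example with cross-validation, because it controls how many coefficients are pushed to zero.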

Model 3: Random Forest Regressor

Metric: MSE = 0.678 (improved, but not good enough)
Comments: This model can deal with non-linear relationships in the data. However, it took a long time to train and its hyper-parameters were difficult to tune. The best thing about tree-based models is feature importance, which makes it easy to figure out which factors are key.
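Reading feature importances from a fitted random forest can be sketched as follows with scikit-learn. The feature names, the toy target, and the hyper-parameters are hypothetical and only serve to show the mechanics.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data with hypothetical feature names.
rng = np.random.default_rng(7)
n = 600
feature_names = ["creative_area", "month", "noise"]
X = np.column_stack([
    rng.uniform(0, 1, n),      # scaled creative area
    rng.integers(1, 13, n),    # month of year, 1 to 12
    rng.normal(size=n),        # feature unrelated to the target
])

# Non-linear toy target: quadratic in area, with a bump in December.
y = 4.0 * X[:, 0] ** 2 + 1.0 * (X[:, 1] == 12) + rng.normal(scale=0.1, size=n)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; a larger value means the feature
# contributed more to reducing prediction error across the trees.
ranking = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name:15s} {importance:.3f}")
```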

Model 4: LSTM

Metric: MSE = 0.003 (nice)

Predicted Values (orange) and Observational Values (blue)

Comments: The data is a time series, and the models above are not good at capturing time-sequenced patterns. I therefore fitted a two-layer LSTM network with Keras, using the prior 50 days of historical eCPM data as features to predict the next day’s ad price. The model performed very well (MSE = 0.003). However, it cannot be applied to event-based prediction. In an event-based scenario, given a set of features such as creative size, advertiser vertical, and ad format, the model should output a prediction for an individual ad. Still, the LSTM’s prediction could be used as one feature, together with other ad features, to make more precise predictions with other machine learning algorithms.
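The “prior 50 days predict the next day” setup amounts to slicing the daily series into overlapping windows. The sketch below shows only that data preparation step with NumPy, on a synthetic series; the function name is hypothetical and the network itself is not reproduced here. The output is shaped (samples, timesteps, features), which is the input layout Keras recurrent layers expect.

```python
import numpy as np


def make_windows(series, lookback=50):
    """Build (samples, lookback, 1) inputs and next-day targets from a 1-D series."""
    series = np.asarray(series, dtype=float)
    if series.ndim != 1 or len(series) <= lookback:
        raise ValueError("need a 1-D series longer than the lookback window")
    n_samples = len(series) - lookback
    # Row i holds days i .. i+lookback-1; its target is day i+lookback.
    X = np.stack([series[i:i + lookback] for i in range(n_samples)])
    y = series[lookback:]
    return X[..., np.newaxis], y


# Synthetic stand-in for daily eCPM with a weekly cycle.
days = np.arange(120)
daily_ecpm = 2.0 + 0.3 * np.sin(2 * np.pi * days / 7)

X, y = make_windows(daily_ecpm, lookback=50)
print(X.shape, y.shape)  # (70, 50, 1) (70,)
```

When evaluating such a model, splitting the windows chronologically rather than randomly avoids leaking future values into training.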

What Are Key Factors?

From both the Lasso Regression and Random Forest models, we can identify the key factors that affect the eCPM prices of programmatic ads:

Important Conclusion: Both models give the same result: creative area (size), Month (time), and Ad Types are the key factors driving ad prices.

The rest of my analysis is based on this result.

5. Dive into Key Factors & Ad Units Placement

Knowing the key factors driving programmatic ad eCPM prices, I conducted a deeper analysis to see how ad prices vary as each key factor changes. Some of my results mirror known facts about the US online advertising market. Below are general rules for publishers optimizing ads:

Historical eCPM data shows that advertisers tend to pay more for ads toward the end of the month and on US holidays. eCPM prices are also slightly higher (about 5–7%) on Fridays, Saturdays, and Sundays. Hence, publishers should act accordingly and raise impressions on high-CPM days to maximize programmatic revenue.
Rich media is the major ad format on BuzzFeed’s products: more than 1/3 of programmatic ads are in this format. From historical deals, I found that rich media and video ads are priced higher than other ad formats. Hence, rich media contributes a large portion of programmatic revenue thanks to both its volume and its high eCPM prices.
In terms of creative size, 300 * 250 is the major ad slot size on BuzzFeed. The figure below shows the variation of average eCPM prices for different ad formats at size 300 * 250. Video is priced much higher than other ad formats at this creative size. Flash and rich media are similar formats, so their eCPM prices are close. Hence, it is wise to bring more video deals to the 300 * 250 ad slots on BuzzFeed’s pages.
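A weekday comparison like the Friday to Sunday uplift mentioned in the rules above can be computed by grouping daily totals and recomputing eCPM per group. In this sketch the column names are assumptions, and the numbers are invented so that the uplift lands inside the reported 5–7% range; they are not real data.

```python
import pandas as pd

# Two synthetic weeks starting on Monday, 2018-07-02. Values are made up.
daily = pd.DataFrame({
    "date": pd.date_range("2018-07-02", periods=14, freq="D"),
    "revenue": [200.0, 200.0, 200.0, 200.0, 212.0, 212.0, 212.0] * 2,
    "ad_impressions": [100_000] * 14,
})

# dayofweek: Monday=0 ... Sunday=6, so Friday to Sunday is >= 4.
is_late_week = daily["date"].dt.dayofweek >= 4
daily["day_group"] = is_late_week.map({True: "Fri-Sun", False: "Mon-Thu"})

# Sum revenue and impressions first, then compute eCPM, so that each
# impression carries equal weight.
totals = daily.groupby("day_group")[["revenue", "ad_impressions"]].sum()
group_ecpm = totals["revenue"] / totals["ad_impressions"] * 1000

uplift = group_ecpm["Fri-Sun"] / group_ecpm["Mon-Thu"] - 1
print(group_ecpm)
print(f"Fri-Sun uplift: {uplift:.1%}")
```

Averaging the daily eCPM values directly would give the same answer here only because every day has the same impression count; with uneven traffic, the aggregated version is the safer choice.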

In addition to the summary above, I have more interesting results that are specific to BuzzFeed, and I conducted an independent analysis of BuzzFeed’s branch products, such as Tasty.co, based on ad unit placement.

6. Research + Conclusions + Optimization Recommendation = Proposal

By the end of my internship, I wrapped up all my findings and delivered a report to the team. It covers the content introduced in this post in more detail, plus instructions on how to conduct A/B testing to optimize programmatic ad revenue by adding or replacing ad slots on BuzzFeed’s pages.

7. Future Work

I did a lot of research on programmatic advertising and found that the majority of studies in this domain take the advertiser’s side. Very few published papers propose optimization strategies for publishers, even though optimizing profit margins is an interesting problem for publishers to solve.

In this project, I used machine learning algorithms to identify three key factors driving ad prices and to predict ad prices. To get more comprehensive results, on the one hand, I believe causal inference is a good way to go. A good causal analysis could explain the cause-effect relationships between variables. However, putting this method into practice is challenging, because it requires statisticians and data scientists to have enough domain knowledge of online advertising to propose reasonable hypotheses, and that knowledge largely comes from work experience. On the other hand, no matter how well a model could solve the problem, adequate data is a must. Hence, publishers should start building their data collection pipelines early.