Publishing Facebook ad data (redux)

We are making publicly available volunteered information on how advertisers targeted political ads in the 2020 elections

Laura Edelson
Online Political Transparency Project
Nov 13, 2020

--

Independently collected data about Facebook ads, and how they’re targeted to individuals, is finally available again. We’re publicly releasing the data collected by Ad Observer so journalists, researchers, and watchdogs can study it. We also make portions of this data available in visualizations on Ad Observatory, our public website showing trends in Facebook political advertising in the 2020 elections.

Facebook does not want us to make this dataset available to the public. In October, Facebook sent a cease and desist letter to block our project. More than 50 consumer, civic technology, and journalism organizations, as well as nearly 30 individual journalism professors, researchers, and leaders, signed a letter demanding Facebook withdraw the letter and work with our team and other researchers to bring more transparency to political ad data. Members of the House Energy & Commerce Committee have likewise sent Facebook a letter urging them to work with us. Facebook has not responded.

Facebook’s history of blocking independent collection of ad data

In January 2019, Facebook blocked the previous version of Ad Observer, ProPublica’s Facebook Ad Collector, using a quiet technical approach that broke the tool. At NYU, we had been using the data generated by this project, finding particular value in connecting ad targeting data to the impression data made public by Facebook through the Ad Library API.

Facebook’s actions to block ProPublica’s browser extension received some public attention. However, Facebook did not back down, and at the time, there was nothing those of us who relied on the ad data could do. But in my heart, I’m an engineer — I don’t want to just complain about a problem, I want to solve it.

Along with several other researchers, particularly Jason Chuang, Cameron Hickey, and Jeremy Merrill, I worked to build a new browser extension platform. Ad Observer is a “fork” of that platform, building off that collaborative code. It took our team about a year from the first discussions of the project to the launch of Ad Observer in June 2020. Along the way, we brought over users of ProPublica’s Political Ad Collector to Ad Observer.

After all that work — and despite Facebook’s attempts to interfere with independent research into political ads — we’re finally getting back to the place the research community was in 2018. Facebook ad data, complete with targeting, will be publicly available. Here’s what’s in this dataset, along with an explanation of its limitations and possibilities.

How we collected Facebook ad data

The ads in this dataset were observed by users of our browser extension, Ad Observer. This method of collection has several limitations you should be aware of if you want to use this data for research:

  1. Data collected via web browser only — not mobile. Ad Observer collects data from users who access Facebook from a Chrome or Firefox browser on a computer, as opposed to a mobile device. This means that we will have no observations of ads that were only shown on mobile devices.
  2. (Relatively) small number of volunteers. As of November 2020, we have about 16,000 browser extension volunteers. Based on the ads they are observing, we think many of our users are in the United States, but because we don’t collect any personal information about our users, we’re not sure. That is also a tiny percentage of overall Facebook users (some 190 million in the U.S. alone) and nowhere near enough to form a clear overall picture of Facebook advertising, or even of Facebook political advertising.
  3. Not a representative sample. We have no reason to believe that the volunteers who choose to install our browser extension and share their ad observations with us are a representative sample of social media users, or of the general public in any country. We have a lot of reasons to think that they are not. This means that these ads are also not a representative sample of political ads shown on Facebook. In particular, we see a skew towards ads from Democratic candidates shown in blue-state urban areas.
  4. Incomplete ad information. Ad observations are not complete information about a particular ad. The more observations we get of a particular ad, the more complete a picture we can get, but remember that many ads have multiple images and texts associated with them. Each user is usually only shown one combination, but there may be others. Also, Facebook’s ad targeting explanations shown to users are rarely complete.

How we identify non-disclosed Facebook political ads

One benefit of Ad Observer data is that we can find ads that Facebook did not label as political ads. We can identify political ads that meet both of the following conditions:

  1. they were not disclosed by the advertiser as political at the time of ad creation, and
  2. they were not caught by Facebook’s own detection mechanisms.

How do we do this? First, we developed a regression model for ads that may be considered political according to Facebook’s own policies. We didn’t want to get into the question of what the appropriate definition for a political ad on the platform would be, though we are researching that separately. Instead, we designed a model that evaluates the text in observed ads to find similarity to the text of ads that Facebook has designated as ads on “Social Issues, Elections, or Politics”. We then gathered all ads to which the model assigns a confidence greater than 0.9.
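To make the idea concrete, here is a minimal sketch of a text classifier with a 0.9 cutoff. It is not our production model: the bag-of-words features, the training loop, the function names, and the training texts are all assumptions made up for illustration. Only the idea of learning from ads already designated as political, and the 0.9 threshold, come from the description above.

```python
import math
import re

THRESHOLD = 0.9  # confidence cutoff stated in the post


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_tokens(text):
    """Return a dict mapping each token to its count in the text."""
    counts = {}
    for tok in tokenize(text):
        counts[tok] = counts.get(tok, 0) + 1
    return counts


def predict(weights, bias, text):
    """Return the model's confidence (0..1) that the text is political."""
    z = bias + sum(weights.get(tok, 0.0) * n for tok, n in count_tokens(text).items())
    z = max(-30.0, min(30.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5):
    """Fit bag-of-words logistic regression with stochastic gradient descent.

    examples: list of (text, label) pairs; 1 = designated political, 0 = not.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            step = lr * (label - predict(weights, bias, text))
            bias += step
            for tok, n in count_tokens(text).items():
                weights[tok] = weights.get(tok, 0.0) + step * n
    return weights, bias


# Made-up training texts, for illustration only (not real ads).
LABELED = [
    ("Vote for Smith for Senate on November 3", 1),
    ("Tell Congress to protect voting rights", 1),
    ("Donate to help elect our candidate for governor", 1),
    ("Big sale on running shoes this weekend", 0),
    ("Try our new pizza delivery app today", 0),
    ("Free shipping on all furniture orders", 0),
]

weights, bias = train(LABELED)

# Keep only the observed ads whose confidence clears the cutoff.
observed = ["Vote for our candidate for Senate", "Free shipping on running shoes"]
flagged = [ad for ad in observed if predict(weights, bias, ad) > THRESHOLD]
```

A high cutoff like this trades recall for precision: fewer ads are flagged, but those that are flagged look strongly like ads already designated as political.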

These ads — which we are calling the “Observed Political Ads” dataset — fall into three main categories:

  1. Ads exempted from disclosure. Our model looks for text similarity, but does not categorize ads that may be exempted from disclosure by Facebook for other reasons. For U.S. ads, the most common reason for this is the news publisher exemption. Facebook doesn’t make the list of advertisers who have qualified for this exemption public, so we couldn’t even exclude them if we wanted to.
  2. Ads similar to unnecessarily disclosed ads. Some advertisers classify their ads as political out of an abundance of caution, even if such ads would not trigger Facebook’s own algorithms. However, these ads are then included in the Facebook Ad Library as political. Because our model is based on ads included in the Ad Library as political, we may also capture ads that Facebook would not have designated as such through its own detection methods.
  3. Ads Facebook missed. Sometimes Facebook misses ads that were neither labeled as political by advertisers nor eligible for a Facebook-defined exemption. Our dataset includes as much of this third category as possible, with the understanding that this means we can’t totally avoid the first and the second.

What you’ll find in this Facebook ad data set

We provide data in the form of a CSV file, where each row is an ad that was observed one or more times through our Ad Observer browser extension. Here are descriptions of the columns and information about the data.

  • Ad Id — Facebook Ad Id.
  • Page name — Page name at the time of first observation.
  • Political value — Political confidence value from the Ad Observatory classification model.
  • Paid for by — Who paid for the political ad.
  • Ad Text — Text of the ad.
  • Links — Links that the advertiser provided to Facebook.
  • Language — Language of the ad.
  • Targeting — The targeting parameters for that ad that were observed by our users.

The CSV file contains two different types of observed ads. The first type is ads that are potentially political but were not disclosed as such by the advertiser when they were observed by our users; these ads therefore do not contain a disclosure string. For these ads, the political_val field contains the confidence rating that the model provides as an indication that the ad may be political. (The values we are reporting are all greater than 0.9.) The second type is disclosed political ads. The disclosure string provided by the advertiser for each of these ads can be found in the paid for by field in our dataset. Ads that contain disclosures are also indicated by a political_val of “-1”.
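As a starting point, here is a minimal sketch of separating the two types of ads after loading the CSV. The exact header names (`ad_id`, `page_name`, `political_val`, `paid_for_by`) are assumptions and the sample rows are made up, so check them against the header row of the real file before relying on this.

```python
import csv
import io


def split_ads(csv_file, val_col="political_val"):
    """Split rows into (disclosed, undisclosed) lists using political_val.

    The column name is an assumption; adjust it to match the file's header.
    """
    disclosed, undisclosed = [], []
    for row in csv.DictReader(csv_file):
        value = float(row[val_col])
        if value == -1:
            disclosed.append(row)    # advertiser supplied a disclosure string
        elif value > 0.9:
            undisclosed.append(row)  # flagged by the model, no disclosure
    return disclosed, undisclosed


# Made-up rows for illustration; the real file has more columns.
SAMPLE = """ad_id,page_name,political_val,paid_for_by
1,Example Campaign,-1,Example Campaign Committee
2,Example Store,0.97,
"""

disclosed, undisclosed = split_ads(io.StringIO(SAMPLE))
```

With the real file, you would pass an open file handle (for example `open(path, newline="", encoding="utf-8")`) instead of the in-memory sample.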

We will update this file daily with the previous day’s observed ads. We may provide this information more or less frequently in the coming days and weeks, as we monitor the volume of ads collected by the Ad Observer extension. We will also add to the types of information we provide and will provide explanatory updates in this post.

Where can I find the Facebook ad data?

Download the file of observed political ads from this public Google Cloud Bucket here. We will update the file daily, appending additional new ads to those that are already in the file.
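Because new ads are appended to the same file, you may want to work only with the rows added since your last download. The sketch below compares two copies of the file by ad ID. The `ad_id` header name is an assumption, as is the premise that each ad ID appears in only one row; the sample data is made up.

```python
import csv
import io


def new_ads(previous_file, current_file, id_col="ad_id"):
    """Return rows in current_file whose ID does not appear in previous_file."""
    seen = {row[id_col] for row in csv.DictReader(previous_file)}
    return [row for row in csv.DictReader(current_file) if row[id_col] not in seen]


# Made-up copies of the file from two consecutive days.
YESTERDAY = "ad_id,page_name\n1,Example Campaign\n"
TODAY = "ad_id,page_name\n1,Example Campaign\n2,Example Store\n"

added = new_ads(io.StringIO(YESTERDAY), io.StringIO(TODAY))
```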

Ideas and examples for journalists using Facebook ad data

  • Dozens of news organizations have used Ad Observatory data to report on political ads.
  • This piece by Jeremy Merrill explains how targeting data is particularly useful for reporting stories on political races.
  • Read Craig Silverman from BuzzFeed on political ads Facebook missed in the lead-up to the 2020 elections.
  • Here’s Merrill in The Markup using Ad Observer data to show how Facebook still runs discriminatory ads, despite promising to cut down on this practice.
Example of ad targeting data collected by Ad Observer displayed on Ad Observatory in the lead-up to the elections, showing one of the Georgia Senate races that is heading for a runoff in January 2021.

Tell us about your Facebook ad data analysis!

We want to hear from you about what you find in this dataset and what you do with it, whether it results in an article, a study, or some other analysis that helps us all understand Facebook political ads in the 2020 elections and beyond. Questions? Please email us at info@adobservatory.org.

Sign the letter in support of the Ad Observatory project

Want to join the more than 50 consumer, civic technology, and journalism organizations, as well as nearly 30 individual journalism professors, researchers, and leaders, who signed a letter demanding Facebook withdraw its cease and desist letter? Contact us at info@adobservatory.org.

We will continue to write about technical problems with Facebook political ad disclosure. Check out Ad Observatory, a searchable site revealing trends in Facebook political advertising in the 2020 elections, and download the Ad Observer plug-in tool to safely volunteer information for researchers and journalists on what ads you are seeing on Facebook.
