TL;DR: The Ad Observer team discovered that a code formatting change by Facebook led Ad Observer to inadvertently collect the top public comments on approximately 200 Facebook ads. None of these comments were seen by anyone other than NYU researchers. The researchers deleted all comments and related ad text from the database and backups within an hour of discovery.
On October 4, in the normal course of work, an NYU engineer noticed that a very small number of ads collected by Ad Observer since September 13 also included the top public comments on those ads, which Facebook identifies as "Most Relevant." The engineer immediately suspended Ad Observer's collection of Facebook ads so that no additional ads or comments would be collected. The engineering team then removed all comment and ad text data from the September 13–October 4 period from our systems.
Our ability to fully assess this situation is limited because the issue occurred in less than 0.001% of ads collected by Ad Observer, and because we immediately deleted the comments from the small number of ads affected. Our initial assessment is that a very small number of ads (approximately 200 over a three-week period) were formatted differently by Facebook, causing our browser extension to treat the top comments accompanying those ads as part of the ad and to collect those top public comments. This may or may not be related to Document Object Model (DOM) changes on Facebook's part that originated between September 4 and September 13. A DOM change on September 4 broke the collection of ads by our browser extension, as well as by other services, including screen readers. We released a code update on September 13 to adjust to that change by Facebook.
Although we determined that collection of top comments on ads was an extremely rare event, out of an abundance of caution we deleted all ad text data from that period (September 13–October 4) from all of our systems, including backups. We notified NYU's Institutional Review Board of this incident, because the data we collected exceeded the scope we defined for this project, and we are grateful for their guidance on data handling and responsible public disclosure of this event.
Is the issue fixed?
Yes. On October 17 we published an update to our browser extension that accounts for recent changes we have observed to Facebook's DOM and that explicitly filters out improperly coded ad comments so they are not inadvertently collected. We have carefully monitored content sent back to our servers since that release and are confident that our new safeguards are working. Data collection was disabled for much of the last month out of an abundance of caution, so users of our data will observe a gap during this period. Data collection is now turned on.
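The defensive filtering described above can be sketched as follows. This is a minimal, hypothetical illustration, not Ad Observer's actual code: the node structure and the `role` marker are assumptions standing in for whatever signals the real extension uses to recognize comment regions in Facebook's DOM.

```typescript
// Hypothetical sketch: a scraped ad is represented as a tree of nodes.
// collectAdText walks the tree and gathers text, but skips any subtree
// recognized as a comment region, so that even if a markup change moves
// comments inside the ad container, comment text is never collected.

interface AdNode {
  text?: string;
  role?: string;       // hypothetical marker, e.g. "comment"
  children?: AdNode[];
}

// Returns true if a node looks like a comment container.
function isCommentRegion(node: AdNode): boolean {
  return node.role === "comment";
}

// Collect ad text while explicitly filtering out comment subtrees.
function collectAdText(node: AdNode): string[] {
  if (isCommentRegion(node)) {
    return []; // drop the whole subtree, not just this node's text
  }
  const parts: string[] = node.text ? [node.text] : [];
  for (const child of node.children ?? []) {
    parts.push(...collectAdText(child));
  }
  return parts;
}
```

With a filter like this, an ad whose changed markup nests a "Most Relevant" comment inside the ad container still yields only the ad's own text; the design choice is to exclude anything positively identified as a comment rather than rely on comments staying outside the ad's subtree.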
We conducted a post-mortem to analyze this incident further and identified some areas we want to improve. We want to shorten the time it takes for us to identify changes to Facebook’s DOM that might affect Ad Observer’s collection of data. We also want to find ways to make our code even more robust in adapting to changes and coding errors made by the ad platforms we monitor.
Why are we reporting this incident?
At NYU Cybersecurity for Democracy, we are dedicated to principles of transparency as a means toward improving online security via accountability. Before publishing this post, we disclosed this incident to NYU’s Institutional Review Board, and are now doing so to the public. We believe that trust is earned and collective knowledge broadened when parties provide transparency about such incidents.
Would you like more information on our work? Visit Cybersecurity for Democracy online and see how tools, data, investigations, and analysis are fueling efforts toward platform accountability. You can: