Who Hacked the Election? Ad Tech did. Through “Fake News,” Identity Resolution and Hyper-Personalization

Jonathan Albright
Tow Center
Published Jul 31, 2017
Complete graph of ad tech infrastructure, tracking, unique ID, and server technologies from earlier group of “fake news” sites

Identifying the Identifiers:

Several months ago, I captured hundreds of trackers, scripts, and “ad tech” resources that loaded onto my computer as I visited a group of 110 hyper-partisan, parody, hoax, pseudoscience, and propaganda (i.e., “fake news”) sites. These sites form part of what I call the “micro-propaganda machine.”

Since the issue is still at the center of the “election hacking” and voter “micro-targeting” debates, to better understand the role of this weaponized tracking infrastructure in the news ecosystem, I spent some time filling this network out with more complete data. To do this, I collected an extensive list of all the software, companies, and services that these scripts, cookies, APIs, unique identifiers, content customization services, business intelligence services, and ID resources were “calling home” to from my earlier ad tech “scrape” of the same sites.

This time around, using a set of tools including Threatcrowd, Maltego, and Gephi, along with some advanced spreadsheet and data viz work, I revisited this group, adding the deep layer of ad tech, content customization and targeting technologies, and A/B testing platforms that this “fake news” behavioral tracking infrastructure is meant to “deliver on.”
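To make the “calling home” idea concrete, here is a minimal sketch of how a log of requests from one page load could be reduced to site-to-third-party edges and written out as a Source,Target CSV of the kind Gephi can import. This is not the pipeline I used; the function names, the `example.com` site, and the sample request list are all hypothetical stand-ins.

```python
import csv
import io
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Return the lowercase hostname of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def third_party_edges(site_url: str, requested_urls: list[str]) -> list[tuple[str, str]]:
    """Pair a visited site with each distinct external host it requested."""
    site = host_of(site_url)
    edges = set()
    for url in requested_urls:
        target = host_of(url)
        # Skip empty hosts and first-party requests (same host or a subdomain).
        if not target or target == site or target.endswith("." + site):
            continue
        edges.add((site, target))
    return sorted(edges)


def to_edge_csv(edges: list[tuple[str, str]]) -> str:
    """Serialize edges as a Source,Target CSV edge list."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buf.getvalue()


# Hypothetical request log for one page load (example.com is a placeholder site).
sample_requests = [
    "https://www.example.com/style.css",
    "https://cdn.example.com/app.js",
    "https://connect.facebook.net/en_US/fbevents.js",
    "https://www.google-analytics.com/analytics.js",
    "https://www.google-analytics.com/collect?v=1",
]
edges = third_party_edges("https://www.example.com/", sample_requests)
print(to_edge_csv(edges))
```

Repeating this for every site in the list and concatenating the edges gives a graph in which the most widely shared third-party hosts end up near the center.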

The data I present here suggests that before we keep pointing fingers at specific countries and tweeting about companies “hacking the election,” and before we try to solve the scourge of “fake news,” it might be good to look inward. By this, I mean we should start the quest for transparency in politics with a few firms based in New York City and Silicon Valley.

The Sources

Before the “ad tech is everywhere” and “voter targeting is nothing new” arguments come up, remember: I’m not talking about Slate, Buzzfeed, the National Review, or even #12 top tracker awardee and shell site the Drudge Report, but about a highly coordinated campaign to drive traffic to a list of players such as:

Chicks on the Right: A site without a header (title) on a mobile-unoptimized site front page?

Conspiracy Planet: The lead is a Podesta Pizzagate YouTube video embed

I Have the Truth, dot.com. Keyword domain bulk purchase discount, anyone?

Knowledge of Today: Links to FREE BOOKS…in cardboard boxes. And 921k Google Plus followers and a hundred or so thousand Facebook fans, too.

KNOWLEDGE OF TODAY

I hope some of the detailed data I’m sharing below can explain how a Wordpress 2.0-esque site like “Knowledge of Today” (see directly above) could have 921,000 followers on Google Plus, plus 29,000 and 58,000 Facebook likes for its “FREE BOOKS: 100 Legal Sites to Download Literature” and “List of 40 FREE Educational Websites” posts, respectively.

Ad tech infrastructure graph for “I Have the Truth.com”

On the surface, most of these sites are simple. But what’s behind many of these hyper-partisan, conspiracy, SEO spamming, and “I’ll go-there-once just to see if that video is really THAT bad” sites, in terms of behavioral tracking and pervasive voter identification and message personalization, is a completely different story. Keeping in mind that I didn’t include any of the more reputable third-party sites “discovered” in my larger hyperlink networks, I found:

132 unique identifiers*

Unique identifiers range from standard fare “cookies” based on Google ad IDs and Facebook pixels/super-cookies to more pervasive (borderline malware) trackers that record browsing behaviors, relay specific behavioral events (mouse “hovering,” etc.) and an unusual “text-to-speech” technology. Here’s a text file with the full list of 123 tracking IDs with the codes and source information.
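As a rough illustration of what “identifying the identifiers” can look like in practice, the sketch below pulls a few common ID formats out of a page’s HTML with regular expressions. It is not the method behind my list: the three patterns (a Google Analytics `UA-` property ID, a Facebook Pixel `fbq('init', ...)` call, and an AdSense `ca-pub-` publisher ID) are simplified assumptions, and the sample HTML and ID values are invented.

```python
import re

# Illustrative patterns only; real pages embed IDs in many other ways.
PATTERNS = {
    "google_analytics": re.compile(r"\bUA-\d{4,10}-\d{1,4}\b"),
    "facebook_pixel": re.compile(
        r"fbq\(\s*['\"]init['\"]\s*,\s*['\"](\d{5,20})['\"]"
    ),
    "adsense_publisher": re.compile(r"\bca-pub-\d{10,20}\b"),
}


def find_tracking_ids(html: str) -> dict[str, list[str]]:
    """Return the distinct IDs found in the HTML, grouped by pattern name."""
    found = {}
    for name, pattern in PATTERNS.items():
        matches = sorted(set(pattern.findall(html)))
        if matches:
            found[name] = matches
    return found


# Invented page fragment with made-up ID values.
sample_html = """
<script>ga('create', 'UA-1234567-1', 'auto');</script>
<script>fbq('init', '123456789012345');</script>
<ins data-ad-client="ca-pub-1234567890123456"></ins>
"""
ids = find_tracking_ids(sample_html)
for kind, values in ids.items():
    print(kind, values)
```

Because the same ID tends to be reused across every property an operator runs, matching IDs across sites is one way seemingly unrelated domains can be tied back to a common owner.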

362 Server-side technologies

While approximately half of the server technologies and software infrastructure I found represent commonplace software packages and application kits such as Wordpress, Drupal, jQuery, and font/javascript APIs, it was alarming to find highly sophisticated tracking technologies, advanced programmatic ad delivery, AI content optimization, cross-platform personalization, and SDK social integration as part of the technology woven into these unsophisticated “fake news” sites. In a couple of instances, there were outbound links to known malware and botnet-associated servers (NJINX).

Like to read? Here’s a 523-page PDF with a breakdown of every single piece of technology that played a role in the ad tech infrastructure behind the group of “fake news” sites from my earlier exploration.

Advanced Social Integration

To name a few of the more powerful application integrations, at the relative center of this conspiracy/hoax/propaganda ad tech network was … Facebook Custom Audiences, Facebook Domain Insights, Facebook SDK, Facebook Pixel, Google Universal Analytics, Adblade, and Wordpress automation, aka WP Daily Activity. So…what does it mean (.com)?

Facebook Custom Audiences and Friends

This means that scores of highly sophisticated technology providers, mostly US-based companies that specialize in building advanced solutions for audience “identity resolution,” content tailoring and personalization, cross-platform targeting, and A/B message testing and optimization, are running the data show behind the worst of these “fake news” sites. A few of the players behind this network of propaganda, which command millions in revenue from their services, include companies such as:

1. Acxiom LiveRamp, a retargeting CRM and “identity resolution provider” that specializes in “tying marketing data back to real people”; linked through ad tech to conspiracyplanet (see above example with YouTube Podesta video), truthrevolt, and truthandaction;
Acxiom LiveRamp: $310 million ad tech behind conspiracyplanet (see above with Podesta video), truthrevolt, and truthandaction
LiveRamp

2. Genesis Media, a large data science firm which claims to “analyze how each piece of content is consumed across 28 million publisher pages,” including collecting data on “time spent and interaction with articles”;

Genesis Media: behind thepostemail and theamericanmirror
Genesis Platform (UK and US) snippet

3. Dstillery, a Media6Degrees firm that made its name by “tapping the social graph of MySpace,” and using this data to segment and target prospective audiences, linked via ad tech to globalresearch dot ca;

dstillery: behind globalresearch dot ca
dstillery snippet

4. Dynamic Yield, which recently won a $22 million funding round, provides “personalization, recommendations, behavioral messaging, A/B testing, and optimization in a single platform,” linked to weeklystandard;

Dynamic Yield Omnichannel: behind weeklystandard
Omnichannel Dynamic Yield snippet

5. EZOIC, an “intelligent testing and A.I. optimization platform” for automatically testing content, layout, and ads — founded by “the former CEO of the first Facebook advertising network,” linked to truthandaction and davidwolfe dot com;

Ezoic: Machine Learning AI tech behind truthandaction and davidwolfe
Ezoic snippet

6. “Magnet” by Klangoo, a “cross-lingual contextual-based Audience Engagement Solutions” firm that specializes in content personalization, app integration, and analytics, linked to globalresearch dot ca;

Klangoo: behind globalresearch dot ca
Klangoo snippet

7. Integral Ad Science, a company that claims to “empower the ad industry to effectively influence consumers everywhere, on every device.” Interesting that this company, which claims to be a “Google Ad Partner,” is linked to southfront dot org, a known hate site.

Integral Ad Science: ad tracking tech and Google Ad Partner behind southfront
Integral Ad Science snippet

8. Vibrant “Intellitxt,” a real-time keyword tracking advertising platform that works via a loaded script to “highlight words and phrases of interest” in real-time through pop-ups with “relevant advertiser information,” is linked to eutimes dot net;

Vibrant Media IntelliTXT: keyword script-based ad tech behind eutimes
Vibrant Intellitxt snippet

Freewheel StickyAds.tv, a Comcast-owned “multiscreen video advertisement technology company that provides programmatic video solutions for publishers and advertisers,” is linked to debunkingskeptics dot com;

Stickyads.tv — an ad tech firm acquired by Comcast through Freewheel: behind debunkingskeptics
StickyAds.tv programmatic video snippet

9. Specific Media’s Viant Advertising Cloud, which offers “original programming, cross-channel distribution, and addressable advertising” to drive audience viewership, is linked to globalresearch dot ca;

Specific Media Viant Ad Cloud: programmatic/ID-based content delivery behind globalresearch dot ca
Viant Ad Cloud snippet

10. Zeta Global’s “Boomtrain,” a firm reported to use machine learning and artificial intelligence to send “personalized messages from brands to consumers” with an “acute degree of personalization,” is linked to angrypatriotmovement dot com and other religious fringe sites;

Zeta Global “Boomtrain”: $40M tech behind angrypatriotmovement, christianpost, and proamericanews
“Boomtrain”? … really?
All aboard: the Boomtrain is leaving with your personal information

The Bigger Picture

The expansive map at the beginning of this article is a version of the “shadow tracking” graph I posted months ago, but with all of the server technologies, ad tech companies, server infrastructure, and external API calls and unique identifiers added. That’s from just over 100 sites. Can you imagine what a list of 500 misinformation and propaganda sites looks like?

The conclusion: While this set of “fake news” sites might not have the sheer quantity of ad tech that, say, the Alexa 500 have, the behavioral targeting and identity resolution technologies associated with many of these conspiracy, hyper-partisan, and propaganda sites are as sophisticated as it gets.

Facebook Custom Audiences, near the center of the graph above, can for example be used to easily target voters in real life based on curated lists from something as simple as an Excel workbook. But most often this is done professionally through a “trusted data partner” like Acxiom (alarming, since example #1, the “LiveRamp” tracker above, is part of the same company).

https://www.facebook.com/business/help/community/question/?id=10154339555249182
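To show why a plain spreadsheet of contacts is enough, here is a minimal sketch of list-based audience matching. It assumes, as I understand is typical for this kind of upload, that identifiers such as email addresses are trimmed, lowercased, and SHA-256 hashed before being compared against the platform’s own hashed records. It does not call any Facebook API; the function names, the email addresses, and the user IDs are all hypothetical.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so equivalent addresses hash identically."""
    return email.strip().lower()


def hash_identifier(value: str) -> str:
    """Return the SHA-256 hex digest of a normalized identifier."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def match_audience(uploaded_emails: list[str], platform_hashes: dict[str, str]) -> list[str]:
    """Return platform user IDs whose stored hash appears in the uploaded list."""
    uploaded = {hash_identifier(normalize_email(e)) for e in uploaded_emails}
    return sorted(uid for uid, h in platform_hashes.items() if h in uploaded)


# Hypothetical rows, as they might come out of one column of a workbook.
voter_list = ["  Jane.Doe@Example.com", "sam@example.org "]

# Hypothetical platform-side table: user ID -> hash of that user's email.
platform_hashes = {
    "user_1": hash_identifier("jane.doe@example.com"),
    "user_2": hash_identifier("someone.else@example.net"),
    "user_3": hash_identifier("sam@example.org"),
}

matched = match_audience(voter_list, platform_hashes)
print(matched)
```

Hashing keeps the raw addresses out of the upload, but it does nothing to stop the match itself: anyone holding the same email, hashed the same way, can link the list to a real account.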

This set of data provides irrefutable evidence to show that the tech behind these often “fake” sites is very real. And through acquisitions and $20 million funding rounds, the field as a political weapon is evolving rapidly. It includes data science startups as well as global media players which build services that use — and train — prediction models through advanced machine learning, natural language processing, and pervasive cross-platform personal data collection. Not just to collect information, but to create and deliver it.

Facebook’s Custom Audiences (see graph above) is really just the first step in a long chain of “real” behavioral manipulation. These firms and the technologies they design specialize in creating our life bubbles, and it’s not just based on a few links we click on or an article we share. It’s based on everything we buy, what we talk about, when we talk about it, everywhere we go, what time we eat lunch, what size of Starbucks we order, the GPS location where we read a political news story, the way we hold our phones, and the people we associate with. Online and offline. It’s based on the way we live.

The ten ad tech examples featured above are one thing. But when thousands of these companies and technologies are merged across Facebook and Google data, shopping loyalty cards, rogue social quizzes, identity dumps from hacked sites, Rentrak/Comscore data (their trackers are in this set, see the full report), voting rolls, and the ad tech juggernaut outlined here (and in my 523-page report), this takes the gravitational “pull” messaging and emotional hyper-targeting to the next level.

The firms I’ve shown specialize in connecting online profiles and behavior to a real person’s purchases, shopping habits, credit profiles, and offline behavior. Only this time, it’s voting.

Domestic and international companies are being paid to identify American voters in real time, and expose them to highly tailored messages at specific times, specific places, and while certain friends, family, and emotional cues (e.g., television shows or breaking news) are present. These aren’t just any other sites: these are known purveyors of propaganda, lies, hoaxes, malware, and misinformation. Where do we even begin to draw the line?

Jonathan Albright
Tow Center

Professor/researcher. Award-nominated data journalist. Media, data, & tech frmly #columbiajournalism #towcenter #berkmanklein #elonuniversity