Eating In The Dark — #1 — Adtech & Adfraud Research

Shailin Dhar
10 min read · Jun 19, 2024

--

I ran the content-site-monetization business for two of the largest MFA-traffic networks from 2013–2015. I have invented bot-identification solutions and led research that contributed to the current shift from probabilistic toward deterministic detection. I also wrote a book to help advertisers and agencies navigate digital advertising with transparency on metrics and money. Having spent time in boardrooms, C-suites, trade groups, and many other influential spaces in digital advertising, I have gotten to know my way around the intersections of technology, media, and telecommunications. I now focus primarily on the wider TMT sector, but with my background in ad technology, certain developments are difficult to ignore. This week (June 17–21, 2024), I will provide a five-part overview, in writing and images, of how that worked and how it could still work today. This first piece sets a tone for how to assess research and where the current state of research continues to fall short.

Introduction to the Series

Welcome to the first week of a new series, “Eating in the Dark.” This series delves into the mechanics, economics, and processes behind the opaque practices in media, tech, and telecom infrastructure. This first installment explores the practices leading to a lack of transparency in media technology and digital advertising, the current state of ad-tech transparency, and related research. While I can’t commit to a specific posting schedule, I will strive to share valuable insights and research findings regularly as they emerge from our work at Futureproof TMT.

NOW LET’S GET INTO IT

Early Days of Adfraud Research

“Adfraud research” was practically non-existent in the advertising marketplace before 2014/2015. A few pioneers pursued this thankless task, often viewed as stray dogs howling at the moon. Occasionally, someone would pause to feed us or join in, seeking temporary satisfaction.

In 2016, when advocating for new industry guidelines, I was often labeled an alarmist for claiming that over 50% of web traffic was non-human, predominantly originating from virtual machines in data centers. Industry guidelines and accreditations have been outdated, serving commercial interests more than technical veracity. Today, that perception has shifted. The marketplace and the technical tools available must start to address different needs than those of 8–10 years ago.

Here’s some of my early work from 2016 if you’d like to familiarize yourself with it:

Link to my report on Mystery Shopping in the Ad-Verification Space

Link to my report on eZanga and how it sells traffic designed to pass verification filters.

Challenges in Ad-Tech Research

Ad-tech research is fraught with challenges, especially regarding sample sizes. Much like dipping a stick in the ocean, our sample sizes are often non-representative. Advertising, too, is built on such samples. (Stay tuned for next week’s piece on Nielsen Panels!)
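To make the “dipstick in the ocean” point concrete, here is a minimal Python sketch. All numbers, tier names, and rates are invented for illustration; the point is only that a convenience sample drawn from an unrepresentative slice of traffic produces a wildly distorted extrapolation:

```python
import random

random.seed(0)

# Hypothetical market: 200,000 impressions across two inventory tiers
# with very different invalid-traffic (IVT) rates. All figures invented.
def make_impression():
    if random.random() < 0.80:  # 80% of the market is premium inventory
        return {"tier": "premium", "invalid": random.random() < 0.02}
    return {"tier": "longtail", "invalid": random.random() < 0.25}

population = [make_impression() for _ in range(200_000)]
true_rate = sum(i["invalid"] for i in population) / len(population)

# Convenience sample: a measurement panel that mostly observes
# long-tail inventory, then extrapolates its rate to the whole market.
panel = [i for i in population if i["tier"] == "longtail"][:5_000]
panel_rate = sum(i["invalid"] for i in panel) / len(panel)

print(f"true market IVT rate:        {true_rate:.1%}")
print(f"extrapolated from the panel: {panel_rate:.1%}")
```

With these made-up numbers the panel reports roughly a 25% invalid rate against a true market rate near 6–7%: same data-generating world, wrong sampling frame, headline-grade conclusion.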

We also see the problem of research being released at a rapid pace, driven by the long-tail effect: it goes out regardless of whether a publication is there to break it. This leads to a deluge of claims that outrun their stated caveats. (Or research reports with more caveats than claims; what is the point, then?)

The Purpose of Research

An essential question arises: “Are you doing this to facilitate growth or to become famous?” The great lyricist Jermaine Cole alludes to this question, drawn from the audiobook of The Tao of Leadership by John Heider, in his track “The Climb Back.” Mr. Cole also spoke one of the most poignant lines in a rap verse for my bot-obsessed mind on 21 Savage’s track “a lot,” where he is featured: “Artists be faking their streams, getting their plays from machines. I can see behind the smoke and mirrors, artists ain’t really big as they seem.”

There’s a delicate balance between producing meaningful research and seeking recognition. The integrity of data aggregation on supposed technology platforms is often lost in the rush for publicity. Media coverage of this research is another problem, as reporters may lack the qualifications to verify the findings.

Does your sample serve your extrapolation process? (Do you sample in a way that serves the extrapolation?)

Or does your extrapolation serve your sample convenience? (Do you extrapolate in a way that serves the sample?)

Let’s look at a recent example of Adalytics’ supposed research on Direct Digital Holdings’ Colossus SSP. While it was maybe entertaining to watch an “outsider” throw rocks at Google, I was very uncomfortable watching the mob of unqualified critics that had built up by the time Adalytics threw mud at the walls of DDH.

While I didn’t know Adalytics or Colossus directly when this hit-piece masked as a research report was published, I had previously had conversations about bid-stream mechanics with DDH’s CTO, whom I found to be exceptionally honest, sharp, and sincere. I spoke to Adalytics later in May and was underwhelmed by the lack of clarity and consistency in how this magical “log-file merging” transparency tech supposedly works. If you aggregate the data to merge it, you’re not merging log-level events… but who cares about words meaning what they mean in this industry. Language seems to be the most fungible commodity in advertising and marketing; hyperbole is the backbone of copywriting, in my view. (**Side note: If you are out in the market talking about how you connect log files from the ad-tech supply chain, I laugh at your performative untruths. You either know it doesn’t work or have a data team that lies to you. If you are confused about why, reach out to me!** ^cough, mag.mrust.net, miducia cough^)

The coverage of and public response to the Colossus SSP allegations made me uncomfortable first because many non-minority-owned, non-multicultural-focused ad platforms have been revealed to have deep-rooted, massive-scale issues without facing scrutiny that extrapolates a small, misinformed observation into a core-competency scandal. I can’t overstate how watching the press headlines and sinister LinkedIn comments made my stomach turn.

I was in disbelief at the swiftness with which companies like Bidswitch and The Trade Desk distanced themselves from DDH/Colossus rather than spend a basic amount of time reading beyond the headlines of a pseudo-researcher and really thinking about how the alleged practice would even work. Watching a company led by smart and determined melanated people who were building a publicly traded (investable) business to help monetize underrepresented media and audiences was inspiring for me. I am still inspired by their resilience and commitment to correcting this public perception while not complaining about the unfairness of a swift and incorrect judgment by many in this industry. I guess that’s just the way it is still, for now.

Here are a few issues I have with the May 10th Adalytics release about Colossus SSP:

Sample size: “dipstick in the ocean.” Adalytics stated repeatedly in its report that it used a convenience sample in its analysis of Colossus SSP. If you believe that was good enough work, you’ll probably also believe that Nielsen’s 40,000 American households are demographically representative of the American people and their complex content-consumption behaviors. I vehemently disagree.

How is a convenience sample justified in alleging widespread fraudulent practices?

No real peer review. I’m not sure whether any of the reporters who co-opted this nonsense understood the reality of Adalytics’ peer-review process, but peers must actually read and respond for it to be considered reviewed! Peer review doesn’t mean sending a document to a mailing list. And this is market research; nobody needed this specific facade of “scientific process” when all the other components were not followed. Also on this point: who are these peers qualified to review research on bid-stream mechanics and the parameters that facilitate auctions? If the people I know to be qualified were part of this peer review, it sounds like a conflict of interest to me.

Feasibility of ID swapping: The core allegation of swapping user IDs for a higher-CPM bid was so stupid. If you believe a small SSP was doing that instead of selling that capability to other SSPs, you’re silly too. See the expanded point below.

No public efforts were made to ease or soften the public aggression towards Colossus/DDH, despite motivated parties going beyond the words of the already misguided and inaccurate report. I would just retract the report and move on, if I were him. But I’m not, and I would never have published something this nonsensical to begin with. Litigation can be like an expensive colonoscopy with a public record; it is not good.

Adalytics states repeatedly that it is not a research company but a “tech platform” that gives advertisers transparency by connecting log-level data. That is not a real technology or a business. Data sets with disparate URNs, malformed records, and missing parameters are what plague the practice of using log-level data in ad-tech systems. I will happily admit I am wrong if someone can show me how this sorcery of matching disparate event tables without a join key would work. If you tell me it’s based on timestamps, I will understand that you have never actually done this yourself.
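Here is a minimal Python sketch of why a timestamp-only “join” across two sides of the supply chain is a guess rather than a merge. The records, IDs, field names, and tolerance window are all invented for illustration; the structural problem they show is real: concurrent events within any plausible clock tolerance match many-to-many.

```python
from datetime import datetime, timedelta

# Hypothetical buy-side and sell-side logs for the same slice of traffic.
# Neither log exposes a shared transaction ID, so timestamps are the
# only available "key" -- and they are ambiguous by construction.
base = datetime(2024, 5, 10, 12, 0, 0)

dsp_events = [
    {"ts": base + timedelta(milliseconds=120), "dsp_bid_id": "d-1"},
    {"ts": base + timedelta(milliseconds=125), "dsp_bid_id": "d-2"},
    {"ts": base + timedelta(milliseconds=131), "dsp_bid_id": "d-3"},
]
ssp_events = [
    {"ts": base + timedelta(milliseconds=118), "ssp_auction_id": "s-1"},
    {"ts": base + timedelta(milliseconds=127), "ssp_auction_id": "s-2"},
]

def timestamp_join(dsp, ssp, tolerance_ms=15):
    """Pair every DSP event with every SSP event inside the tolerance."""
    matches = []
    for d in dsp:
        for s in ssp:
            delta_ms = abs((d["ts"] - s["ts"]).total_seconds()) * 1000
            if delta_ms <= tolerance_ms:
                matches.append((d["dsp_bid_id"], s["ssp_auction_id"]))
    return matches

pairs = timestamp_join(dsp_events, ssp_events)
# Three DSP events and two SSP events yield six candidate pairs:
# every event on one side "matches" every event on the other.
print(pairs)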

ID Mismatch:

  • Allegation: Colossus SSP intentionally and systemically modifies user IDs.
  • Implied logic of the perpetrator: Colossus/DDH maintain a database of user IDs with recorded CPM/bid values, and swap out a lower-value ID for a higher-value ID when passing the bid request from the publisher to the Bidswitch SSP exchange. (SD sidenote: This is a hilarious assertion to anyone who understands that anyone with this capability would operate a DMP instead of an SSP.)

Peer review must be genuine, involving actual peers and thorough review processes. My experience in conducting, writing, and publishing research has taught me that most work, even good work, never reaches the public sphere.

Be wary of researchers, individuals and firms alike, that push out inflammatory research more often than they release solutions or offer guidance. My conscience is clear on this, as I always make a point of suggesting 2–3 autonomously actionable solutions to people who ask, before mentioning how my technology offering could make that job easier. (There are so many examples with trade groups like the ANA, which continues to disappoint in the seriousness with which it facilitates helpful guidance for its members about adfraud and digital waste.)

Be wary of reporters (and publications) that repeatedly use “bombshell,” “explosive,” “unveiled,” or other hyperbolic descriptors for the research they cover. (*Think back to 2016 US election news coverage.*) Too many “bombshells” without any change in the state of the system sounds like irresponsible reporting to me. It’s almost as if the news coverage, even in ad-tech trade press, is “made for traffic” (replace traffic with words like clicks, ads, views). Very meta (pun intended).

Every project teaches us something, but not all results are notable. If everything is notable, then nothing is notable.

Personal Reflections and Experiences:

Reflecting on my journey, I remember the chaos of September 2017, when an advertiser audit of Criteo’s PPC offering was leaked by a short-seller website. My day was consumed with calls from hedge funds, PE firms, and reporters. One piece of coverage is linked here. It was a terrible day, my professional nightmares come to life. I did not like how it felt to see my work co-opted for other purposes that would never lead to developing solutions. I believe in improving the durability and strength of our methods rather than merely rearranging old tactics: address the system, not just its components.

Approaching Adfraud with the right mindset:

Neither Adalytics nor Fou really understands “how” adfraud is committed or activated by motivated parties. So both jump to thinking that anything beyond their comprehension of under-exposed wisdom should be labeled fraud. It gets a lot easier to do so when there is even a small group of people anticipating your next statement so they know where to throw their rocks. Also, as a transparency FYI: Fou blocked me on LinkedIn/Twitter in 2021 because he didn’t like my comments. I don’t consider him a practitioner of science, and his doctorate has absolutely nothing to do with web technology, so I find he accomplishes very little except making me roll my eyes when I get sent things he says about me, knowing he has blocked me. (I haven’t blocked him, so hopefully he reads this one day.) Adalytics is equally, probably more, goofy in its approach of claiming to be a scientist while skipping that whole pesky scientific-process thing. I didn’t like the direction of Fou’s posts and the kind of suggestive language he was using, so I commented as such. He blocked me that day, about three years ago now. I will say, my life has been better in those years. Is that correlation or causation? The above-mentioned “researchers” might not be able to explain the difference to you, unfortunately. (To anyone who believes I am being harsh: I welcome them to arrange a panel of adfraud researchers and see what ensues.)

Adfraud should be approached as a counterfeit problem, not a cybersecurity one. I will be writing about this further in the coming weeks.

To help you think about what can happen when counterfeit is done at scale, look up Operation Bernhard from WW2. What happens when counterfeit currency is 51%+ of the total cash in circulation?

(For more on this, see this article from March 2017 in the Financial Times by David Eagleman, pdf link here. Email info@fit-holdings.com for a PDF version if you cannot access it.)

Follow ups & Reading:

Let’s shine a light on these hidden practices and work towards a more transparent and ethical ad-tech industry. While I welcome all feedback, please come correct. If you are not knowledgeable on the topic, it will generally be clear based on your comment or question. I learned, I think, to stop engaging with trolls many years ago online. My hope is that continues into my future.

If you have genuine suggestions on things I can read, watch, listen to, to further my understanding of any topics relevant to understanding how web traffic is monetized, please do share! I love to read more than I like to write. You should do the same.


Shailin Dhar

I’m a media, tech, and telecom expert. Invented first deterministic device/gpu based bot detection process in 2017. Data center obsessed. Decolonizing Himalaya!