What Is a Data Clean Room?

Hightouch
The Data Activation Blog
Oct 27, 2022
Learn what data clean rooms are, how they work, and why they are paramount in the future of how marketers activate, analyze, and measure campaigns.

What Is a Data Clean Room?

A data clean room is a secure environment where organizations can collect data from multiple sources and combine it with their first-party data. Doing so allows marketers to leverage large, aggregated datasets of consumer behavior to gain insight into critical factors such as performance, demographics, and campaigns.

Data clean rooms allow companies to extract value from aggregate datasets sourced from multiple parties while prioritizing user privacy and maintaining strict security measures.

The intended goal of a data clean room is to create a privacy-preserving space where customer data can be joined and utilized in a collaborative manner. Through a data clean room, two parties (usually, but not always, an advertiser and a publisher) can securely share data while controlling what, how, when, and where data is shared. Once the data sets are joined, marketers can extract value through profile enrichment, audience analysis, attribution, and activation.

Fig 1. Data Clean Room Architecture

Upon finishing this blog, you’ll know what data clean rooms are, how they work, and why they are paramount in the future of how marketers activate, analyze, and measure campaigns.

Why Do Data Clean Rooms Matter?

The rise in popularity of data clean rooms can be attributed to two main drivers: the deprecation of cross-domain tracking and an overall tightening privacy landscape. Thanks to the emergence of new privacy laws and regulations, along with restrictions on third-party cookies, marketers are continuously dealing with tracking and signal loss, which hinders their ability to fully understand the effectiveness of their marketing activities.

To understand cross-channel behavior, marketers of the past used relatively simple (yet privacy-naive) solutions that relied on third-party cookies to track users across the web. On mobile, device-level identifiers like Apple’s IDFA (Identifier for Advertisers) or Android’s GAID (Google Advertising ID) played the same role, allowing advertisers to quickly and efficiently track users across apps and leverage those insights to power marketing, measurement, attribution, and everything in between.

Fast forward to today: major web browsers like Safari and Firefox block third-party cookies by default, and Google Chrome has announced plans to phase them out. On mobile, updates like iOS 14.5, whose App Tracking Transparency prompt made access to the IDFA opt-in, have left mobile advertising IDs (MAIDs) far less useful. As a result, marketers have little ability to understand user behavior across channels and domains, and this is precisely why data clean rooms are so important.

By securely aggregating and joining customer data sets in a clean room environment, marketing teams can derive insights from a wide(r) range of data sources to better understand their customers while also meeting all consumer data privacy requirements.

How Do Data Clean Rooms Work?

The first step in enabling a data clean room is for two or more parties (again, typically a publisher and an advertiser) to collect, compile, and aggregate their first-party data at the user level.

These user-level datasets do NOT need to contain the same fields across the involved parties, but they DO need a common means of matching. Once in the clean room, the datasets can be matched on hashed versions of shared identifiers like email addresses, phone numbers, or user IDs.
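To make the matching step concrete, here is a minimal sketch in Python. It assumes both parties agree to normalize emails the same way and hash them with SHA-256; the records, field names, and helper functions are all hypothetical and are not taken from any particular clean room product.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so both parties hash identical strings."""
    return email.strip().lower()


def hash_identifier(value: str) -> str:
    """Return the SHA-256 hex digest of an already-normalized identifier."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def key_by_hash(rows):
    """Drop the raw email from each row and index the rows by its hash."""
    keyed = {}
    for row in rows:
        hashed = hash_identifier(normalize_email(row["email"]))
        keyed[hashed] = {k: v for k, v in row.items() if k != "email"}
    return keyed


# Hypothetical user-level records; note the two parties hold different fields.
advertiser_rows = [
    {"email": "Ana@Example.com ", "lifetime_value": 420.0},
    {"email": "bo@example.com", "lifetime_value": 75.5},
]
publisher_rows = [
    {"email": "ana@example.com", "articles_read": 31},
    {"email": "cy@example.com", "articles_read": 4},
]

advertiser = key_by_hash(advertiser_rows)
publisher = key_by_hash(publisher_rows)

# Only hashed keys that appear on both sides can be matched.
matched_keys = advertiser.keys() & publisher.keys()
print(len(matched_keys))
```

Normalizing before hashing matters here: without it, "Ana@Example.com " and "ana@example.com" would produce different digests and never match. Keep in mind that a plain hash of an email is pseudonymous rather than anonymous, which is one reason the surrounding clean room controls still matter.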

After the data is assembled, it can then be loaded into a secure environment, AKA a “data clean room,” as long as it adheres to all of the predetermined agreements between the parties. Any type of user-level data (e.g., transactions, household information, behavioral events, etc.) can be loaded into this secure environment.

Once data has been loaded, the information is matched and cleaned. Depending on the agreement between the parties, this often includes encryption, hashing, pseudonymization, access restriction, and obfuscation. When configuring the clean room, rules are applied to ensure that each party ONLY has access to its own customer data and to the new, enriched datasets outlined in the original agreement.
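One way to picture these access rules is as a per-party allow-list of columns over the joined data. The sketch below is purely illustrative: the column names, the `ACCESS_RULES` mapping, and the `view_for` helper are assumptions made for this example, and real clean rooms typically enforce such rules at the platform level rather than in application code.

```python
# Hypothetical joined rows inside the clean room, keyed by a hashed ID.
joined_rows = [
    {"hashed_id": "a1", "adv_ltv": 420.0, "pub_articles_read": 31},
    {"hashed_id": "b2", "adv_ltv": 75.5, "pub_articles_read": 4},
]

# Assumed agreement: the columns each party is allowed to read.
ACCESS_RULES = {
    "advertiser": {"hashed_id", "adv_ltv", "pub_articles_read"},
    "publisher": {"hashed_id", "pub_articles_read"},
}


def view_for(party, rows):
    """Return the rows restricted to the columns granted to `party`."""
    if party not in ACCESS_RULES:
        raise PermissionError(f"{party!r} is not part of the agreement")
    allowed = ACCESS_RULES[party]
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]


publisher_view = view_for("publisher", joined_rows)
advertiser_view = view_for("advertiser", joined_rows)
```

In this sketch the publisher never sees the advertiser's lifetime-value column, while the advertiser receives the enriched field that the (assumed) agreement grants it. A party that is not named in the agreement gets an error instead of data.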

Data Clean Room Steps

Data Clean Room Use Cases

Although data clean rooms have a wide range of use cases, they can typically be bucketed into three main categories: Profile Enrichment, Audience Analysis, and Measurement and Attribution.

Profile Enrichment

One of the primary use cases of clean rooms is to “enrich” existing customer data with data sourced from other parties. Doing so helps marketing and sales teams better understand their users by leveraging strategic data points that would otherwise not have been available to them.

For example, a credit card provider may want to enter into a data clean room agreement with Experian to enrich its customer profiles with psychographic data like shopping behavior. Using the newly enriched data points for segmentation, the credit card provider can then build custom audiences and target them with personalized campaigns across channels.
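Conceptually, enrichment is a left join: every first-party profile is kept, and partner attributes are added wherever a match exists. Here is a small Python sketch of that idea, assuming the profiles are already keyed by a hashed identifier; the field names (`card_tier`, `shopping_segment`) and values are invented for illustration.

```python
# Hypothetical data, keyed by an already-hashed customer identifier.
card_customers = {
    "h1": {"card_tier": "gold"},
    "h2": {"card_tier": "basic"},
    "h3": {"card_tier": "gold"},
}
partner_attributes = {
    "h1": {"shopping_segment": "frequent traveler"},
    "h3": {"shopping_segment": "home improvement"},
}


def enrich(profiles, attributes):
    """Left-join partner attributes onto first-party profiles."""
    return {
        key: {**profile, **attributes.get(key, {})}
        for key, profile in profiles.items()
    }


enriched = enrich(card_customers, partner_attributes)

# Build a custom audience from the enriched field; customers without a
# match simply lack the field and are skipped.
travel_audience = sorted(
    key
    for key, profile in enriched.items()
    if profile.get("shopping_segment") == "frequent traveler"
)
print(travel_audience)
```

Customers with no partner match (here, "h2") stay in the dataset unchanged, so enrichment never shrinks the first-party customer base.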

Audience Analysis

Another major use case of data clean rooms is analyzing two separate customer data sets to drive actionable insights for one or both of the involved parties. Leveraging first-party data from multiple datasets, the involved organizations can perform deeper analyses to uncover insights like audience overlap to supercharge marketing and sales efforts.

For example, if a hotel and an airline wanted to run co-marketing campaigns, they could stand up a data clean room solution that returned a list of customers who “overlapped” (i.e., customers of both parties). The two companies could then use this list of customers to determine the opportunity size of future campaigns or to personalize and co-brand current messaging.
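At its core, the overlap in this example is a set intersection over matched identifiers. The short Python sketch below uses made-up hashed IDs to show how the overlap and a simple opportunity-size figure could be computed; the variable names are hypothetical.

```python
# Hypothetical sets of hashed customer IDs contributed by each party.
hotel_guests = {"h1", "h2", "h3", "h4"}
airline_flyers = {"h3", "h4", "h5"}

# Customers of both companies.
overlap = hotel_guests & airline_flyers

# Share of the hotel's customer base that a co-marketing campaign could reach.
overlap_share_of_hotel = len(overlap) / len(hotel_guests)

print(sorted(overlap), overlap_share_of_hotel)
```

Depending on the agreement, a clean room might return only the counts and shares rather than the list of IDs itself.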

Measurement And Attribution

Perhaps the most well-known use case of data clean rooms is attribution. To compete with advertising “walled gardens” like Facebook and Google, major publishers turn to clean rooms to provide the same measurement and attribution features that the social and search giants have at their disposal. By analyzing the overlap between users who converted for an advertiser and users who were served impressions within a publisher’s ad network, publishers can provide closed-loop measurement features to help advertisers understand their return on advertising spend (ROAS).

For example, if a direct-to-consumer skincare brand purchased ten million impressions from a publisher like The New York Times, it could leverage a clean room environment to look at the overlap between customers who converted on its site and New York Times users who saw an ad impression. These insights could then be fed into the brand’s specific attribution model to calculate the overall campaign performance.
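As a rough illustration, here is how that overlap could feed a ROAS calculation in Python. This sketch assumes a deliberately simple attribution rule, where any converter who saw at least one impression is credited to the campaign; real attribution models are usually more nuanced, and all IDs and figures below are invented.

```python
# Hypothetical inputs, keyed by hashed user ID.
exposed_users = {"h1", "h2", "h3", "h4"}            # saw an ad impression
conversions = {"h2": 60.0, "h4": 45.0, "h9": 80.0}  # order value per converter
ad_spend = 50.0                                     # campaign cost

# Assumed rule: a conversion is attributed if the converter was exposed.
attributed = {
    user_id: value
    for user_id, value in conversions.items()
    if user_id in exposed_users
}

attributed_revenue = sum(attributed.values())

# ROAS = revenue attributed to the campaign / advertising spend.
roas = attributed_revenue / ad_spend

print(f"Attributed revenue: {attributed_revenue:.2f}, ROAS: {roas:.2f}")
```

Here the converter "h9" never saw an impression, so that order is excluded from the campaign's attributed revenue.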

Types of Data Clean Rooms

There are three primary types of clean rooms: “Media Clean Rooms,” “Private Clean Rooms,” and “Clean Rooms as a Service.” While all three rest on similar basic concepts, it is essential to understand the differences between them.

Media Clean Rooms

The first and most well-known clean room type is provided by the so-called walled gardens of AdTech (e.g., platforms like Meta and Google). Known as “Media Clean Rooms,” these platforms have access to vast amounts of customer data and audience graphs from the largest advertising ecosystems on the planet. Media clean rooms like Google Ads Data Hub or Amazon Marketing Cloud each have full access to proprietary datasets that can be matched to first-party lists uploaded by advertisers. Aggregate reports can then be made available to the advertisers to help them understand their campaign performance.

Private Clean Rooms

“Private Clean Rooms” are provided by companies with massive amounts of customer data and content. Most of the time, this includes large publishers of content like Disney, platforms like Spotify, or even companies with an extensive portfolio of brands like Dotdash Meredith. All of these companies have one common factor: they have their own private clean room technologies to monetize their customer base and act as their own walled garden advertising ecosystem.

Clean Rooms as a Service

“Clean Rooms as a Service” are provided and sold by AdTech and software vendors. These clean rooms enable any two partner companies to put their first-party data into a neutral environment where it can be safely and securely shared with one another. Clean room vendors like Habu, Snowflake, and Databricks are leading the market here, offering a “bring your own data” model to any organization looking to stand up its own clean room deployment.

Snowflake Data Clean Room Example

Depending on your specific needs, you may need one, many, or none of the above clean room types. Either way, it’s important to understand how and when to best leverage each option, as clean rooms are by no means a small investment, and they can become a significant privacy concern if not resourced properly.

What’s Next for Data Clean Rooms?

With the further deprecation of third-party cookies and new consumer privacy regulations on the rise, data clean rooms will become an essential tool in the advertising and marketing technology ecosystem. Gartner, a leading technology research and consulting firm, predicts that by 2023, 80% of advertisers with media budgets of $1 billion or more will utilize data clean rooms.

As the adoption curve makes a beeline up and to the right, clean room technology will need to evolve with it. Cloud providers like Snowflake and Databricks will rapidly evolve their product offerings to capture the rising demand, making their products more accessible, efficient, and secure.

Finally, clean room providers will need to integrate more comprehensive activation features to stay relevant and competitive. Consumer brands, publishers, and startups alike will embrace clean room technologies to derive deep customer insights and supercharge their marketing efforts. Once they do, they will want to seamlessly sync those insights back into the operational tools, marketing applications, and advertising platforms their business teams utilize every day.

Data Activation will become a key component of data clean rooms to help ensure that customers are maximizing their investments in this technology.
