An Introduction to PETs for Attribution and Reporting

Published in Criteo R&D Blog · 10 min read · Apr 25, 2023

Introduction

Context. To avoid direct cross-site tracking, several browser vendors (e.g. Mozilla, Google Chrome, and Microsoft) are developing attribution and reporting proposals that rely on private computation coordinated by trusted helper servers. Private computation aims to perform attribution and evaluate a pre-defined function (e.g. counting trigger events such as clicks) without revealing to any party anything beyond the output of that function.

Currently, browsers’ proposals consider two types of private computation systems based on privacy-enhancing technologies (PETs). The first, suggested in the Interoperable Private Attribution (IPA) proposal by Meta/Mozilla, is secure multi-party computation (MPC) [1]. The second, proposed in the Attribution Reporting API by Google, relies on trusted execution environments (TEEs) [2].

Objective. In this article, we will guide you through these two PETs. Our objective is to provide a high-level understanding of the technologies underlying current proposals for attribution and reporting use cases, discuss their benefits but also their limitations, and foster cross-collaboration to address the areas of improvement.

Related Works. As a leading AdTech company that drives commerce outcomes for media owners and marketers, Criteo is committed to evaluating proposals that might affect the way we will perform attribution and reporting in the future. Criteo has already participated in testing and providing feedback on browser proposals such as the Privacy Sandbox; see our previous two articles on the Topics API and FLEDGE.

This article is the first, introductory post in a series of articles focused on evaluating MPC and TEEs, notably for attribution and reporting.

TL;DR

Using MPC or a TEE to perform private computation implies two different trust paradigms. MPC involves multiple parties, and we have to trust them not to share their private inputs among themselves. In contrast, a hardware-based TEE avoids the need for multi-party computation, but trust is delegated to the hardware manufacturer in most cases.
A TEE offers a remote attestation feature that enables any party to audit the code (e.g. the aggregation procedure for reporting) running inside it before sending data, which helps ensure the correctness of the private computation. This feature can be made available to MPC protocols if they are run inside a TEE.
Regarding the accuracy of the private computation, TEEs appear to be a better alternative than MPC. The scalability of such PET-based private computation remains a concern for AdTech: it is not yet clear whether TEEs and MPC yield reasonable computational and communication costs for standard use cases. This will be the focus of companion articles within this series.

High-level Design

Browser vendors’ proposals such as the Interoperable Private Attribution (IPA) by Meta/Mozilla or the Attribution Reporting API by Google Chrome partially rely on so-called helper parties to instantiate a private computation procedure.

Delegating computation to these helper parties assumes that we can trust them to guarantee at least the following two properties, as proposed by the PATCG of the W3C [3]:

Private computation: Helper parties cannot access raw user-level data.
Correctness: Helper parties execute the prescribed protocol correctly.

In Figure 1 below, we illustrate the main differences between helper parties leveraging a TEE and those instantiating an MPC protocol.
With TEEs, only one helper party is involved. Encrypted reports gathered by AdTech are sent to the TEE, which decrypts them in a secure environment and performs attribution and reporting on plaintext data.
With MPC protocols, several helper parties are involved. Encrypted reports are split into several random pieces called shares (see paragraph on Secure Multi-Party Computation) and then sent to several helper parties which decrypt them and perform computation on random pieces of data.

Figure 1. Private Computation using PET-based helper parties.

In the following paragraphs, we introduce MPC and TEEs and explain how they are proposed to be leveraged to meet the private computation and correctness properties.

Trusted Execution Environment

A trusted execution environment (TEE) provides trust and confidentiality guarantees for data in use, similar to what transport layer security provides for data in transit or encryption algorithms for data at rest. A TEE is an isolated processing environment, based on specific hardware, in which applications can be securely executed in isolation from the operating system or hypervisor, as illustrated in Figure 2.

Thanks to the isolation property provided by the TEE, the code can operate on plaintext data inside the TEE.

Loosely speaking, TEEs can be divided into two categories depending on what the TEE executes:

Process-based TEEs that only run a specific piece of code associated with an application in a secure manner;
VM-based TEEs that run the whole code of an application.

Figure 2. Illustration of the two main categories of TEEs (last two columns). The first column stands for a standard processor. “Trusted, but verified” means that trust is ensured via a verification process, called attestation.

Let’s now go back to the two PATCG properties: private computation and correctness.

Private computation. This property is achieved by guaranteeing that parties running computations in a TEE can neither tamper with nor access the code and data loaded by the TEE. In practice, it is partially ensured through two means:

reducing the communication interface of the TEE with the rest of the system (referred to as the trusted computing base), so that the attack surface is significantly reduced;
encrypting values in memory.

Despite these safeguards, leakage might still be possible through side-channel attacks that exploit, for instance, access patterns. This will be further detailed in an upcoming article focusing on TEEs.

Correctness. Correctness, on the other hand, is achieved via a remote attestation feature which ensures (1) that the code being executed runs inside a genuine TEE instance, i.e. one created on specific hardware with confidential computing capabilities such as Intel SGX or AMD SEV; and (2) that the code is the expected one. Remote attestation is generally achieved by having a measurement of the code (typically a cryptographic hash) digitally signed with a manufacturer-derived private key. Any party can then authenticate the code and verify that it was not tampered with, using the corresponding public key made available through a public key infrastructure.
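
To make the two checks above more concrete, here is a toy sketch of the attestation flow. It is not how Intel SGX or AMD SEV actually implement attestation: all names (MANUFACTURER_KEY, make_quote, verify_quote) are hypothetical, and an HMAC with a shared key stands in for the manufacturer's asymmetric signature and certificate chain, simply because it is available in the Python standard library.

```python
import hashlib
import hmac

# ASSUMPTION: a shared HMAC key stands in for the manufacturer-derived
# private key. Real TEEs use asymmetric signatures checked through a PKI.
MANUFACTURER_KEY = b"hypothetical-manufacturer-key"


def measure(code: bytes) -> bytes:
    """Measurement of the code: here, simply its SHA-256 hash."""
    return hashlib.sha256(code).digest()


def make_quote(loaded_code: bytes) -> tuple[bytes, bytes]:
    """Simulated hardware step: measure the loaded code and sign the result."""
    measurement = measure(loaded_code)
    signature = hmac.new(MANUFACTURER_KEY, measurement, hashlib.sha256).digest()
    return measurement, signature


def verify_quote(quote: tuple[bytes, bytes], expected_code: bytes) -> bool:
    """Verifier step: (1) is the quote genuine, (2) is the code the expected one?"""
    measurement, signature = quote
    expected_sig = hmac.new(MANUFACTURER_KEY, measurement, hashlib.sha256).digest()
    genuine = hmac.compare_digest(signature, expected_sig)
    expected = hmac.compare_digest(measurement, measure(expected_code))
    return genuine and expected


audited_code = b"SELECT campaign, COUNT(*) FROM reports GROUP BY campaign"
print(verify_quote(make_quote(audited_code), audited_code))
print(verify_quote(make_quote(b"SELECT * FROM reports"), audited_code))
```

The first call accepts the quote because the measured code matches the audited one; the second is rejected because a different piece of code was loaded.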

Trust. As outlined in Figure 2, when using a TEE, trust is delegated to the hardware manufacturer in most cases, that is, Intel for Intel SGX/TDX or AMD for AMD SEV.

Secure Multi-Party Computation

In the previous paragraph, we saw how a TEE could be leveraged to perform private computation within a single helper party on data collected by an AdTech company.

Secure multi-party computation (MPC) is another candidate to perform such private computation. In contrast to a TEE, MPC does not rely on a specific piece of hardware but only uses cryptographic protocols to ensure private computation. In addition, it leverages collaboration between several helper parties instead of relying on a single one.

Private computation. Private computation is guaranteed by using a cryptographic technique called secret sharing. To ensure that data sent to helper parties is not revealed to them, this data (i.e. the secret) is first split into random pieces, called secret shares, such that the original secret can be reconstructed by aggregating them (e.g. via a sum). Only one of these random pieces is sent to each helper party, so none of them can reconstruct the initial data alone. Then, helper parties collaboratively compute the function of interest (e.g. a sum) by performing local computations on their random pieces of data and exchanging intermediate results between them.
In the IPA proposal, the AdTech provider is supposed to perform such secret sharing on encrypted reports before sending associated shares to helper parties.
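
As an illustration, here is a minimal sketch of additive secret sharing, one common way to instantiate the idea described above. The modulus and the function names (share, reconstruct) are assumptions made for this example and are not taken from the IPA proposal.

```python
import secrets

# ASSUMPTION: all arithmetic is done modulo 2**32; any agreed modulus works.
MODULUS = 2**32


def share(secret: int, n_parties: int = 3) -> list[int]:
    """Split `secret` into `n_parties` shares that sum to it modulo MODULUS."""
    # The first n-1 shares are uniformly random...
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    # ...and the last one is chosen so that the total equals the secret.
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: list[int]) -> int:
    """Recover the secret by summing all shares modulo MODULUS."""
    return sum(shares) % MODULUS


shares = share(8)
print(shares)               # three random-looking integers
print(reconstruct(shares))  # 8
```

Any proper subset of the shares is uniformly distributed, which is why a single helper party learns nothing about the secret from its own share.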

Correctness. MPC protocols do not meet the aforementioned correctness definition but a relaxed version of it. More precisely, MPC protocols are meant to output the correct result of a given function. However, there is no way to guarantee, by default, that the function evaluated by the helper parties is the prescribed one. To meet correctness, MPC protocols could be run inside a TEE to benefit from the remote attestation feature, but the costs of these two PETs would then add up.

Threat Models & Trust. Delegating computation to several helper parties naturally raises the following questions:

What if helper parties collude to reconstruct my data?
What if a subset of the helper parties deviates from the prescribed protocol?

Regarding the first question, collusion between parties is a potential risk that must be taken into account since it undermines the private computation property. There is no efficient way to cope with this problem other than finding helper parties in which we can place our trust.

The second question, on the other hand, raises the possibility that some helper parties are malicious and try to deviate from the prescribed protocol to recover plaintext data from intermediate computations on secret shares. In the scenario described in the IPA proposal, malicious helper parties are assumed to be in the minority (1 out of 3). Under this assumption, MPC protocols can be enhanced with so-called message authentication codes (MACs) to yield correct results, at the cost of additional multiplications.
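
To give an intuition of how MACs help, here is a simplified sketch in the spirit of information-theoretic MACs used in some MPC protocols: each value x is shared together with alpha * x, where alpha is a secret key. This is an assumption-laden toy and not the mechanism specified in the IPA proposal; in particular, the checker below knows alpha in the clear, whereas real protocols keep alpha secret-shared as well. The prime P and all function names are hypothetical.

```python
import secrets

P = 2**61 - 1  # ASSUMPTION: arithmetic in a prime field of this size


def share(value: int, n: int = 3) -> list[int]:
    parts = [secrets.randbelow(P) for _ in range(n - 1)]
    return parts + [(value - sum(parts)) % P]


def authenticated_share(value: int, alpha: int, n: int = 3):
    """Share the value and its MAC alpha * value (the extra multiplication)."""
    return share(value, n), share(alpha * value % P, n)


def add_local(a: list[int], b: list[int]) -> list[int]:
    """Each helper adds the two shares it holds, without communication."""
    return [(x + y) % P for x, y in zip(a, b)]


def open_and_check(value_shares: list[int], mac_shares: list[int], alpha: int) -> int:
    value = sum(value_shares) % P
    mac = sum(mac_shares) % P
    if mac != alpha * value % P:
        raise ValueError("MAC check failed: at least one share was altered")
    return value


alpha = secrets.randbelow(P - 1) + 1  # non-zero MAC key
x_val, x_mac = authenticated_share(8, alpha)
y_val, y_mac = authenticated_share(15, alpha)
sum_val, sum_mac = add_local(x_val, y_val), add_local(x_mac, y_mac)
print(open_and_check(sum_val, sum_mac, alpha))  # honest run: 23

sum_val[0] = (sum_val[0] + 1) % P  # a malicious helper alters its share
try:
    open_and_check(sum_val, sum_mac, alpha)
except ValueError as err:
    print(err)
```

Because the cheating helper does not know alpha, it cannot adjust the MAC shares consistently, so the altered result is rejected instead of being silently accepted.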

Example. Figure 3 illustrates how to perform a simple private SUM query on two integers, 8 and 15 (kept in plaintext for the sake of simplicity), using a simple instance of secret sharing. For each of these two numbers, the AdTech provider generates three random shares such that their sum equals the initial integer. A subset of these random shares is sent to each helper party, which computes intermediate sums. To obtain the final result, the helper parties simply add up their intermediate sums, which yields 8 + 15 = 23. Note that no helper party has access to the two initial integers.

Figure 3. Illustration of a simple secret sharing scheme between three helper parties to compute a sum of two integers. The dice are used to refer to random numbers.
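
The example above can be sketched in a few lines of Python. We assume here that each helper party receives exactly one share of each integer; the figure may distribute shares differently. The modulus and the function names are choices made for this illustration only.

```python
import secrets

MODULUS = 2**32  # ASSUMPTION: shares live in the integers modulo 2**32


def share(value: int, n: int = 3) -> list[int]:
    parts = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    return parts + [(value - sum(parts)) % MODULUS]


def private_sum(values: list[int], n_helpers: int = 3):
    # AdTech side: one row of shares per input value.
    shared = [share(v, n_helpers) for v in values]
    # Helper i only sees column i, i.e. one random-looking share per value,
    # and adds them up locally.
    partial_sums = [
        sum(row[i] for row in shared) % MODULUS for i in range(n_helpers)
    ]
    # Final step: the intermediate sums are combined into the result.
    return sum(partial_sums) % MODULUS, partial_sums


total, partials = private_sum([8, 15])
print(partials)  # three random-looking intermediate sums
print(total)     # 23
```

The intermediate sums change at every run, while their combination always equals 8 + 15 = 23.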

Benchmark

In this section, we provide a high-level benchmark of MPC and TEEs with respect to key principles that should be satisfied when addressing attribution or reporting use cases: accuracy, flexibility, and scalability.
In current browser vendors’ proposals, campaign optimisation via machine learning (ML) training is meant to be performed after noisy and aggregated data collection via a reporting API. In this benchmark, we also compare MPC and TEEs for performing campaign optimisation directly via helper parties. This use case could be envisioned in future versions of browser vendors’ proposals since it allows the use of off-the-shelf ML algorithms that take non-aggregated features and labels as input.

Accuracy

In Table 1, we compare MPC and TEE from the accuracy lens. MPC protocols are based on two main primitives namely private addition and multiplication.
For simple reporting use cases, such as COUNT queries as proposed in the IPA proposal, we can leverage efficient private addition protocols to address them using MPC. For more advanced use cases such as campaign optimisation via machine learning (ML) training, we have to deal with non-linear functions which, in contrast to the TEE setting, have to be approximated. This approximation might lead to a performance drop for ML models.
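
To illustrate the approximation issue, the sketch below compares the sigmoid function with a degree-3 Taylor polynomial around zero, which only needs additions and multiplications and is therefore the kind of expression MPC can evaluate with its two primitives. The choice of this particular polynomial is ours, for illustration; practical MPC frameworks may use other approximations.

```python
import math


def sigmoid(x: float) -> float:
    """Exact sigmoid, as it could be computed on plaintext inside a TEE."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_poly(x: float) -> float:
    """Degree-3 Taylor expansion of the sigmoid around 0.

    Uses only additions and multiplications by public constants.
    """
    return 0.5 + x / 4 - x**3 / 48


for x in (0.0, 0.5, 1.0, 2.0, 4.0):
    exact, approx = sigmoid(x), sigmoid_poly(x)
    print(f"x={x:>4}: exact={exact:.4f} approx={approx:.4f} "
          f"error={abs(exact - approx):.4f}")
```

The approximation is tight near zero but degrades quickly for larger inputs, which is one way the performance drop mentioned above can arise.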

Flexibility

In Table 2, we compare MPC and TEE from the flexibility lens; low flexibility meaning that associated algorithms may be cumbersome to design or often come with additional costs that must be paid.
Regarding the TEE, the use of an attestation mechanism to audit the code running inside it might lead to low flexibility despite meeting correctness. Indeed, if the browser vendor offers AdTech vendors the possibility to use custom SQL queries and ML training pipelines, then the browser vendor has to audit the associated code and verify that differential privacy is indeed met at the prescribed level. On the other hand, if the browser vendor offers pre-defined templates for SQL queries or ML training to limit the code audit overhead, this leaves AdTech few degrees of freedom to innovate.
For MPC, the lack of flexibility comes from (1) the use of additional protocols such as oblivious sorting to prevent information leakage by obfuscating the sorted location of any report, and (2) the necessity to provide efficient cryptographic primitives (e.g. private addition, private computation of the sigmoid function) to AdTech vendors to encompass a variety of private computations.

Scalability

In Table 3, we compare MPC and TEE from the scalability lens.
Because the parties' computations are in general not independent, MPC might suffer from large communication costs. These costs will be presented in detail in a companion article focusing on MPC in practice for attribution, reporting, and campaign optimisation. In addition, because of the use of secret sharing, more local computations are required within each helper party (e.g. 2 sums instead of 1 sum in Figure 3).
Regarding TEEs, the computational overhead is mainly associated with encryption and memory access costs which vary from one application to another. They also depend on the TEE technology (e.g. AMD SEV or Intel SGX). More details will be provided in a companion article focusing on TEEs.

What’s Next

As pointed out in the Introduction, this article is part of a series of articles focused on the analysis of helper servers enhanced by MPC or TEE, for attribution and reporting use cases. Therefore, other articles will follow to complement this introductory content.