Some takeaways from PEPR’24 (USENIX Conference on Privacy Engineering Practice and Respect 2024)

Sergio Maldonado
Published in PrivacyCloud · 4 min read · Jun 6, 2024

We very much enjoyed this refreshing, highly focused, extremely informative conference. A few notes on specific talks follow below; we had previously shared them in more succinct form across different social media networks.

  • Damien Desfontaines (Tumult Labs): Synthetic data is not the privacy-preserving silver bullet that many believe it to be. Vendors should share empirical privacy metrics, test against reconstruction attacks, and target outliers in those tests. (Damien is a former Masters of Privacy guest.)
  • Yuchao Tao (Snapchat): Unlike other social networks, #Snapchat does not share a member’s social graph. Snap leverages differential privacy to make sure the graph is not compromised while suggesting new connections. It is still not risk-free, though (as made clear during the Q&A). #ABSet
  • Jordan Brandt (Inpher): Combining PETs (rather than picking just one) is the way to go if we want to provide privacy safeguards while achieving high precision. Keeping data encrypted for as long as possible (e.g., in Trusted Execution Environments) and adding noise (Differential Privacy) only at the very end helps maintain such precision.
  • Jian Du and Shikun Airin (TikTok): Applying Secure Multiparty Computation alone provides “input privacy” in ad performance measurement, but does not guarantee output-related safeguards, so #DifferentialPrivacy should be applied on top.
  • Matt Gershoff (Conductrics): We need to be intentional about the data we collect, which means discarding “magical thinking” about the abstract possibilities of collecting data without restraint. Minimizing granularity and “linkability” in A/B testing can deliver k-anonymity, which brings efficient computation and storage while guaranteeing alignment with privacy-by-design and privacy-by-default principles. (A minimal k-anonymity sketch follows this list.)
  • Daniel Simmons-Marengo (Tumult Labs): How can we provide anonymization when it remains such an elusive target? Be skeptical about it, minimize assumptions, do not rely on governance alone, build with composability in mind, and expose your technique to outside scrutiny.
  • Jessica Colnago and Simon Fondrie-Teitler (Federal Trade Commission): Be careful with false or misleading representations regarding the use of PETs in your tech stack. The FTC has already come after Henry Schein (2016), Zoom (2020), and CafePress (2022) for this (i.e., for failing to keep marketing’s boasts in check).
  • Wendy Seltzer: Our many efforts to keep personal data closer to us in a more human-centric way (e.g., pods, personal data stores) have floundered. In hindsight, identity is relational: you cannot be a guest without a host, or a patient without a doctor. Many of our attributes originate with a counterpart, and the parties at each end of the relationship may have different goals. This requires real governance.
  • Arthur Borem (University of Chicago) shared a really interesting study on the user experience of data subject access requests (#DSAR). Users are both unable to explore the files they are given (JSON!) and overwhelmed by their size and depth. They are in fact looking for direct answers to three questions: what is being stored (e.g., payment details); how it is being stored (e.g., retention periods); and how it is being used (e.g., for ads, or sold to others). They also hope for high-level takeaways, but these are never on offer. It is likely that, in an attempt to ensure “machine readability” (in line with the separate right to portability), human accessibility has been left behind.
  • Aziel Epilepsia, Fernando Rubio, Ansuman A. (Airbnb): The Airbnb team was forced to build their own Consent Management Platform after verifying that expensive off-the-shelf tools did not allow them to implement legal requirements consistently. #CMP vendors also failed to properly support different regions or to onboard new features easily, considerably delaying product iterations.
  • Lisa LeVasseur, Bryce S. (Internet Safety Labs): The ISL team has managed to measure and benchmark privacy risks across EdTech mobile apps at scale, creating an SDK Risk Score and providing their own Safety Labels. (Lisa is a former Masters of Privacy guest.)
  • Ryan Guest (Amazon): When it comes to deploying the Global Privacy Control signal (a universal opt-out) across a very large number of digital properties, touchpoints, and devices (whether server-side or client-side), different teams may come up with highly creative, and also highly inadequate, formulas. Orchestration, training, and guidance seem particularly important in an organization of this size. (See the GPC sketch after this list.) #GPC
  • Tamara Bonaci: More work should be done on Machine Unlearning as a means of removing the effect of specific pieces of personal data on a trained model once those records have been deleted (e.g., at the request of data subjects). MU is already workable for smaller models. (A naive baseline sketch follows this list.)
  • JiSu Kim, Alex Lambert, Francesco Logozzo (Meta): The Meta team has worked on Lineage Quality Measurement to better automate internal data-flow mapping efforts for sensitive personal information. Recall and precision metrics have proven particularly effective. (A small precision/recall example follows this list.)
  • Kien Nguyen, Chen-Kuei Lee (Meta): The majority of data analytics at Meta run on Presto (open source). SQL queries were rewritten to incorporate differential privacy via noisy aggregations. (See the noisy-count sketch after this list.)
  • Akshatha Gangadhariah, Bijeeta Pal, Sameera Ghayyur (Snap): The MyAI chatbot team at Snap has built an in-house solution for personal data masking that goes beyond Presidio’s own abilities, while relying on open-source libraries to detect PII. (A baseline Presidio sketch closes the examples below.)
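
To make a few of the above concrete, here are some minimal sketches in Python. First, the k-anonymity idea from Matt Gershoff’s talk: coarsen what you collect, then suppress any group smaller than k. This is our own illustration, not Conductrics’ implementation; the field names and the k=5 threshold are made up.

```python
from collections import Counter

def coarsen(record):
    """Reduce granularity: keep only the experiment variant and a coarse
    age bucket, dropping identifiers and any fine-grained attributes."""
    bucket = f"{(record['age'] // 10) * 10}s"  # e.g. age 34 -> "30s"
    return (record["variant"], bucket)

def k_anonymous_counts(records, k=5):
    """Aggregate coarsened records and suppress any group smaller than k,
    so every released row describes at least k users."""
    counts = Counter(coarsen(r) for r in records)
    return {group: n for group, n in counts.items() if n >= k}

events = [{"variant": "A", "age": 30 + i % 9} for i in range(40)]
print(k_anonymous_counts(events, k=5))
```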
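
Next, the Global Privacy Control signal from Ryan Guest’s talk. Per the GPC specification, the signal arrives server-side as a Sec-GPC: 1 request header (or client-side as navigator.globalPrivacyControl). Below is a sketch of the kind of centralized handler that avoids per-team “creative formulas”; the returned preference keys are our own invention.

```python
def apply_gpc(headers):
    """Treat Sec-GPC: 1 as a universal opt-out of sale/sharing, per the
    Global Privacy Control spec, no matter which team owns the endpoint."""
    opted_out = headers.get("Sec-GPC") == "1"
    return {"sale_of_data_allowed": not opted_out,
            "sharing_allowed": not opted_out}

print(apply_gpc({"Sec-GPC": "1"}))  # both flags False: honor the opt-out
print(apply_gpc({}))                # no signal: both flags True
```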
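
For Machine Unlearning, the simplest correct baseline is exact unlearning: retrain from scratch without the deleted records. Practical MU research aims to approximate this result far more cheaply; the sketch below only shows the baseline, using scikit-learn and synthetic data of our own making.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = (X[:, 0] > 0.5).astype(int)

model = LogisticRegression().fit(X, y)

# A data subject asks for records 10 and 42 to be deleted.
deleted = {10, 42}
keep = [i for i in range(len(X)) if i not in deleted]

# Exact unlearning: retrain on the remaining data only, so the deleted
# rows can no longer influence the model's parameters.
unlearned_model = LogisticRegression().fit(X[keep], y[keep])
```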
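
For Lineage Quality Measurement, recall and precision can be computed by comparing automatically inferred data-flow edges against a hand-verified ground truth. The edge sets below are hypothetical; we do not know the internals of Meta’s pipeline.

```python
def lineage_quality(inferred_edges, true_edges):
    """Score an automated data-flow map against a hand-verified ground
    truth: precision = how much of the map is right, recall = how much
    of the truth the map found."""
    inferred, truth = set(inferred_edges), set(true_edges)
    true_positives = len(inferred & truth)
    precision = true_positives / len(inferred) if inferred else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

truth = {("signup_form", "users_db"), ("users_db", "ads_model")}
inferred = {("signup_form", "users_db"), ("users_db", "bi_dashboard")}
print(lineage_quality(inferred, truth))  # (0.5, 0.5)
```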
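
For differential privacy via noisy aggregations (the Meta Presto talk, and the “output privacy” layer in the TikTok talk), the core move is replacing an exact count with a noisy one. This is a textbook Laplace-mechanism sketch, not Meta’s actual Presto rewrite.

```python
import numpy as np

def noisy_count(true_count, epsilon=1.0, sensitivity=1.0):
    """Laplace mechanism: adding or removing one user changes the count
    by at most `sensitivity`, so noise with scale sensitivity/epsilon
    makes the released value epsilon-differentially private."""
    return true_count + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

exact = 1_204                      # SELECT COUNT(*) ... on the raw data
print(round(noisy_count(exact)))   # the value that actually leaves the system
```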
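
Finally, for PII masking, Presidio itself (named in the Snap talk) provides the open-source baseline that Snap’s in-house solution goes beyond. Assuming presidio-analyzer and presidio-anonymizer are installed (plus a spaCy model such as en_core_web_lg), the out-of-the-box flow looks roughly like this:

```python
# pip install presidio-analyzer presidio-anonymizer
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = "Hi, I'm Jane Doe and you can reach me at jane.doe@example.com"
findings = analyzer.analyze(text=text, language="en")              # detect PII
masked = anonymizer.anonymize(text=text, analyzer_results=findings)
print(masked.text)  # e.g. "Hi, I'm <PERSON> and you can reach me at <EMAIL_ADDRESS>"
```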

(Apologies, we could not easily summarize every single talk.)

We hope to see you all again next year. #Privacy #PrivacyEngineering #PEPR

Sergio Maldonado
PrivacyCloud

Dual-admitted lawyer. LL.M. (IT & Internet Law). Lecturer on ePrivacy and GDPR (IE Business School). Author. Founder: PrivacyCloud, Sweetspot, Divisadero/Merkle.