Finding The Common Customers of Two or More Organizations While Keeping Personal Data Secret (Part 3)
Part 3/3 — Practical Usage
Previously
This is the third of three articles in which I discuss how to find the common customers of two or more companies without revealing any information about those customers. The full whitepaper can be found here. For an introduction to the problem, read the first part; for the private set intersection algorithms, read the second part.
Mimirium PSI Command Line Tool
At Mimirium we develop a software system that solves the private set intersection problem for two or more parties using some of the described techniques. Unlike other systems, which require participants to hand over their data (even if encrypted) so that it can be processed on the provider's servers, our approach allows each party to run its own infrastructure. This helps the parties comply with complex legal or company regulations and protocols.
The system is operated through a command line tool that supports all major platforms: Linux, macOS and Windows.
Features
The tool includes the following features:
- Support for standard data fields such as name, phone, email etc.
- Automatic data unification using machine learning
- Support for different protocols
- Support for direct communication between parties (no 3rd party)
- Support for assisted communication between parties (with 3rd party, where the 3rd party server could be one of the sharing parties or an independent one)
- Data can be supplied as a file in different formats, such as CSV, or through stdin
- Password or token protection
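To make the data-handling features above more concrete, here is a minimal sketch of how customer records might be read from a CSV file or from stdin and brought to a common form before matching. It is not the tool's implementation: the tool uses machine learning for unification, while this sketch swaps in simple hand-written rules. The column names `name`, `phone` and `email`, the helper names, and the convention that `-` means stdin are all assumptions made for illustration.

```python
import csv
import io
import re
import sys


def normalize_name(value: str) -> str:
    # Lowercase and collapse repeated whitespace.
    return " ".join(value.lower().split())


def normalize_phone(value: str) -> str:
    # Keep digits only, dropping "+", spaces, brackets and dashes.
    return re.sub(r"\D", "", value)


def normalize_email(value: str) -> str:
    return value.strip().lower()


# Hypothetical column names; a real export may use different headers.
NORMALIZERS = {
    "name": normalize_name,
    "phone": normalize_phone,
    "email": normalize_email,
}


def unify_row(row: dict) -> dict:
    # Missing columns are treated as empty strings.
    return {field: fn(row.get(field) or "") for field, fn in NORMALIZERS.items()}


def read_customers(stream) -> list:
    # Expects a text stream whose first line is the CSV header.
    return [unify_row(row) for row in csv.DictReader(stream)]


def open_source(path: str):
    # Assumed convention: "-" selects stdin, anything else is a file path.
    if path == "-":
        return sys.stdin
    return open(path, newline="", encoding="utf-8")
```

Normalizing before hashing or encrypting matters because the private set intersection protocols compare values exactly: two spellings of the same email address would otherwise never match.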
Tutorial
Suppose two organizations (insurance companies) each have a list of customers. They have exported their lists to CSV as shown below:
Next, parties A and B need to start the server and share their encrypted customer lists. We’ll demonstrate two scenarios, one with and one without a 3rd party server.
PSI with Direct Connection using RSA protocol
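To illustrate what happens under the hood in this scenario, here is a minimal sketch of an RSA blind-signature PSI in the spirit of De Cristofaro and Tsudik (see References). It is not Mimirium's implementation and says nothing about the tool's actual commands or wire format. Party A holds the RSA key and Party B learns the intersection. The toy key built from two Mersenne primes, the SHA-256 based hashes, and all function names are assumptions chosen to keep the example self-contained; a real deployment would use a properly generated key of at least 2048 bits.

```python
import hashlib
import math
import secrets

# Toy RSA key for Party A. Both factors are Mersenne primes; this key is
# far too small for real use and serves illustration only.
P, Q = 2**89 - 1, 2**107 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, known only to A


def hash_to_group(item: str) -> int:
    # First hash: map a normalized identifier to an integer modulo N.
    data = item.strip().lower().encode("utf-8")
    return int.from_bytes(hashlib.sha256(b"H1|" + data).digest(), "big") % N


def tag(signature: int) -> str:
    # Second hash: hide the RSA signature behind a one-way tag.
    raw = signature.to_bytes((N.bit_length() + 7) // 8, "big")
    return hashlib.sha256(b"H2|" + raw).hexdigest()


def server_tags(items):
    # Party A signs the hashes of its own items and publishes only the tags.
    return {tag(pow(hash_to_group(x), D, N)) for x in items}


def client_blind(items):
    # Party B multiplies each hash by r^E so that A cannot recognize it.
    blinded, factors = [], []
    for x in items:
        while True:
            r = secrets.randbelow(N - 2) + 2
            if math.gcd(r, N) == 1:
                break
        factors.append(r)
        blinded.append(hash_to_group(x) * pow(r, E, N) % N)
    return blinded, factors


def server_sign(blinded):
    # Party A signs the blinded values without learning B's items.
    return [pow(y, D, N) for y in blinded]


def client_intersect(items, factors, signed, tags_from_a):
    # Party B removes the blinding factor and compares tags.
    common = []
    for x, r, s in zip(items, factors, signed):
        unblinded = s * pow(r, -1, N) % N  # equals hash_to_group(x)^D mod N
        if tag(unblinded) in tags_from_a:
            common.append(x)
    return common


def run_psi(a_items, b_items):
    # Simulates the message exchange between A and B in one process.
    tags_a = server_tags(a_items)
    blinded, factors = client_blind(b_items)
    signed = server_sign(blinded)
    return client_intersect(b_items, factors, signed, tags_a)
```

Party A only ever sees blinded values, and Party B only sees tags it cannot reverse, so each side learns nothing beyond what the intersection itself reveals to B (and the size of the other side's list).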
Party B now has the intersection (common customers) in psi.csv like this:
PSI with a Server using FHE protocol
Parties A and B now each have the same intersection (common customers) in their respective CSV files, like this:
Summary
The demonstrated protocols and algorithms solve the problem of finding the common customers among two or more private lists. Many details are, of course, not covered in the paper, but the main requirements, correctness and security, are satisfied.
The tool developed by Mimirium uses these techniques in different combinations to achieve high quality results. For more information about our products, contact us at info@mimirium.io.
References
Hao Chen, Kim Laine, and Peter Rindal, Fast Private Set Intersection from Homomorphic Encryption, https://eprint.iacr.org/2017/299
Benny Pinkas, Thomas Schneider, Christian Weinert, and Udi Wieder, Efficient Circuit-based PSI via Cuckoo Hashing, https://eprint.iacr.org/2018/120.pdf
Emiliano De Cristofaro and Gene Tsudik, Practical Private Set Intersection Protocols with Linear Complexity, http://sprout.ics.uci.edu/pubs/practical_private.pdf
Benny Pinkas, Thomas Schneider, and Michael Zohner, Scalable Private Set Intersection Based on OT Extension, https://eprint.iacr.org/2016/930.pdf
Agnes Kiss, Jian Liu, Thomas Schneider, N. Asokan, and Benny Pinkas, Private Set Intersection for Unequal Set Sizes with Mobile Applications, https://eprint.iacr.org/2017/670.pdf
General Data Protection Regulation, https://ec.europa.eu/info/law/law-topic/data-protection/data-protection-eu_en