Developing the next generation of personal privacy, decentralized search, and digital advertising is an exciting undertaking that bridges the gap between theory and practice. This fall, we announced a research and innovation partnership between BitClave and the College of Engineering’s Information Networking Institute at Carnegie Mellon University. This partnership brings together the engineering team at BitClave with information security graduate students at CMU Silicon Valley to collaborate on the latest ideas shaping blockchain security. The project team is led by CMU information security graduate students Simran Gujral, Saurabh Sharma, and Wei Min Teo in collaboration with BitClave team members including Alex Bessonov, Patrick Tague, Mark Shwartzman, and Emmanuel Owusu. Here’s an update on this collaboration.
The project team has been working on several technologies that extend a distributed public ledger into an anonymous activity ledger, including selective anonymity, selective linkability, and zero-knowledge transaction protocols. Services on the BitClave Active Search Ecosystem (BASE) are implemented as cryptographically protected stateful smart contracts stored on a blockchain network. The team is designing protocol specifications for these security properties and developing and testing them on clean-slate Ethereum test networks. In this post, we will delve into these transformative technologies.
Blockchain is a data store that provides anonymity even as all transactions are verifiable as authorized and correct by the public. Technically, this property is pseudonymity, since all blockchain implementations must use some form of identifier to provide the property of authorization: anyone can create a blockchain address (a pseudonym), but it is prohibitively expensive to prove that an address belongs to an individual without corroborating evidence external to the blockchain. For the remainder of this post, we will use the term anonymity a bit loosely to emphasize that we are most interested in protecting against the disclosure of real-world identities. We will also use the terms users and pseudonyms to refer to blockchain addresses. Finally, we consider side channels (such as inference-based disclosure models) and user-facing software compromises (such as the leaking of a private key that binds a real-world identity to a blockchain address or transaction) as out of scope for this exercise, although we take extensive measures as a company to minimize these sorts of vulnerabilities as well.
Traditionally, there has been no additional benefit to linkability, that is, the ability to prove that a subset of transactions issued by different addresses on the blockchain belong to a single individual. On the contrary, linkability can actually make it easier to uncover real-world identities via side-channel analysis. However, in the case of the activity ledger, linkability is desirable because a real-world user’s ranking and data-use payments will generally be higher when their activities are grouped together. And so we have two competing properties, user anonymity and transaction linkability, that we would like to support throughout the BASE protocol stack and leave to user choice through a set of simple controls.
Anonymity ensures that each BASE user may contribute to the activity ledger without revealing their identity to observers of the blockchain. Selective anonymity enables software endpoints, which may be managed by users themselves or by the businesses users interact with, to anonymize data contributed to the blockchain and flexibly manage pseudonyms. Selective linkability enables users to prove to a business that a set of blockchain transactions are associated without disclosing this association publicly or binding the association to an identity on the network. Taken together, selective anonymity and selective linkability provide a flexible mechanism for users to manage the tradeoff between the potential for a higher data valuation and the potential for observers of the blockchain to profile them or infer their identity.
For illustrative purposes, let’s consider three hypothetical BASE users: Alice, Bob, and Carol. Alice enjoys the offerings available on the BitClave search app and likes that she earns every time she uses it. But for Alice, her favorite feature of the BitClave network is that she can control how her personal information is used. Bob also enjoys the ability to manage access to personal data but is happy to share a more complete personal profile with service providers on the network in return for increased data use payments and more personalized offers.
Alice and Bob express these preferences by prioritizing selective anonymity and selective linkability, respectively. Note that irrespective of linkability and anonymity preferences, both Alice and Bob benefit from the additional layers of data protection that secure the whole network and limit the resolution of information shared to what’s needed to complete authorized tasks (e.g., encrypted fields, confidential transactions, and privacy-preserving collaborative deep learning). Also note that each real world identity typically corresponds to many pseudonyms on the network — with the strongest level of anonymity actualized as a unique pseudonym for each contribution to the ledger (i.e., the initial user configuration matches Alice’s preferences and ensures unlinkability).
On a particular day, Alice searches for the latest home tech gadgets using pseudonym A1, then searches for holiday gift ideas using pseudonym A2, then searches for wireless speaker systems using pseudonym A3. At a later time, she can privately reveal to a selected business that sells both tech products and wireless speaker systems that A1 and A3 belong to the same user, without revealing that the user is Alice and without revealing anything about A2. Similarly, Bob can prove to any party that his anonymized activities all belong to a single user, ensuring linkability.
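To make the A1/A3 example concrete, here is a minimal sketch of one textbook way to link two pseudonyms without revealing the secret behind them: a Chaum-Pedersen proof that two values share the same discrete logarithm. This is not the BASE protocol, which the post does not specify. The pseudonym format (a random base `h` paired with `h^secret`), the function names, and the toy group parameters are all assumptions made for illustration, and the group is far too small to be secure.

```python
import hashlib
import secrets
from typing import Tuple

# Toy Schnorr group, far too small for real use: P = 2*Q + 1, P and Q prime.
P = 2039
Q = 1019

Pseudonym = Tuple[int, int]  # hypothetical format: (base h, value h^secret mod P)


def random_base() -> int:
    """Pick a random element of the order-Q subgroup (a square other than 1)."""
    while True:
        h = pow(secrets.randbelow(P - 2) + 2, 2, P)
        if h != 1:
            return h


def new_pseudonym(secret: int) -> Pseudonym:
    """Create a fresh pseudonym from the user's long-term secret."""
    h = random_base()
    return h, pow(h, secret, P)


def challenge(*values: object) -> int:
    """Fiat-Shamir challenge: hash the transcript down to an exponent mod Q."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove_link(secret: int, a: Pseudonym, b: Pseudonym, context: str):
    """Prove that pseudonyms a and b share one secret, without revealing it."""
    r = secrets.randbelow(Q - 1) + 1
    commit_a, commit_b = pow(a[0], r, P), pow(b[0], r, P)
    c = challenge(a, b, commit_a, commit_b, context)
    s = (r + c * secret) % Q
    return commit_a, commit_b, s


def verify_link(a: Pseudonym, b: Pseudonym, proof, context: str) -> bool:
    """Check h^s == commit * y^c for both pseudonyms under the same challenge."""
    commit_a, commit_b, s = proof
    c = challenge(a, b, commit_a, commit_b, context)
    ok_a = pow(a[0], s, P) == commit_a * pow(a[1], c, P) % P
    ok_b = pow(b[0], s, P) == commit_b * pow(b[1], c, P) % P
    return ok_a and ok_b


# Alice creates three pseudonyms and links only A1 and A3 for one business.
alice_secret = secrets.randbelow(Q - 1) + 1
a1, a2, a3 = (new_pseudonym(alice_secret) for _ in range(3))
link_proof = prove_link(alice_secret, a1, a3, "nonce-from-business")
print(verify_link(a1, a3, link_proof, "nonce-from-business"))
```

In this sketch the proof says nothing about A2, and the `context` string stands in for a nonce chosen by the business so that a proof cannot simply be replayed. Because the proof is non-interactive, the business could forward it to others; keeping the link private to one verifier would need a designated-verifier variant, which is beyond this sketch.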
Finally, Carol would like to enjoy the benefits of personalized offers without having a complete personal profile tied to a single pseudonym, which is where selective linkability comes in. Carol can specify this preference through time-based linking (snapshots, e.g., Carol can link only the last week’s or the last six months’ worth of data) and activity-based linking (personas, e.g., Carol can link all activities related to travel as one persona and all activities related to a new car search as another).
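As a rough illustration of Carol’s two controls, the sketch below selects which pseudonyms would be offered for linking under a snapshot or a persona. The `Activity` record, its field names, and the sample data are hypothetical and are not taken from BASE; the selection happens on Carol’s side before any linking proof is produced.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Activity:
    pseudonym: str       # one fresh pseudonym per contribution
    category: str        # hypothetical persona label, e.g. "travel"
    timestamp: datetime


def snapshot(activities: List[Activity], now: datetime, window: timedelta) -> List[str]:
    """Time-based linking: pseudonyms of activities within `window` before `now`."""
    return [a.pseudonym for a in activities if now - window <= a.timestamp <= now]


def persona(activities: List[Activity], category: str) -> List[str]:
    """Activity-based linking: pseudonyms of activities in one category."""
    return [a.pseudonym for a in activities if a.category == category]


# Made-up sample data for Carol.
now = datetime(2018, 1, 15)
carol = [
    Activity("C1", "travel", datetime(2018, 1, 12)),
    Activity("C2", "car-search", datetime(2018, 1, 10)),
    Activity("C3", "travel", datetime(2017, 9, 1)),
]

print(snapshot(carol, now, timedelta(weeks=1)))  # ['C1', 'C2']
print(persona(carol, "travel"))                  # ['C1', 'C3']
```

Pseudonyms outside the chosen snapshot or persona are never mentioned to the business, so they stay unlinked.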
The introduction of Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge (zk-SNARKs) efficiently extends zero-knowledge proofs to blockchain technology. Zero-knowledge proofs allow a verifier to check that a computation was carried out correctly without learning the private data it was executed on. In the context of a cryptocurrency blockchain, this enables miners (the verifiers) to answer “Does Alice have enough units in her balance to complete this transfer to Bob?” without Alice (the prover) having to disclose her balance or how much was transferred. In the context of the BASE activity ledger, this enables users and businesses to complete actions such as issuing and accepting offers without the miners or any observer of the network learning any details of the transaction other than the fact that it was done correctly.
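A full zk-SNARK is well beyond a short example, but the idea of checking a transfer without seeing its amounts can be sketched with Pedersen commitments, a simpler building block used in confidential-transaction designs. This is a stand-in technique, not a zk-SNARK and not the BASE implementation. The group parameters and function names are assumptions, the group is far too small to hide anything in practice, and the sketch omits the range proof a real system needs to show that no amount is negative.

```python
import secrets
from typing import Iterable

# Toy group, far too small for real use: P = 2*Q + 1, with G and H of order Q.
P, Q = 2039, 1019
G, H = 4, 9  # assumption: nobody knows log_G(H); untrue at this toy size


def commit(value: int, blinding: int) -> int:
    """Pedersen commitment G^value * H^blinding mod P."""
    return pow(G, value % Q, P) * pow(H, blinding % Q, P) % P


def product(commitments: Iterable[int]) -> int:
    """Multiply commitments together mod P (adds the hidden values)."""
    result = 1
    for c in commitments:
        result = result * c % P
    return result


def balanced(inputs: Iterable[int], outputs: Iterable[int]) -> bool:
    """Verifier check: hidden input amounts sum to hidden output amounts."""
    return product(inputs) == product(outputs)


# Alice holds 50 units, pays Bob 20, and keeps 30 as change.
r_in = secrets.randbelow(Q)
r_pay = secrets.randbelow(Q)
r_change = (r_in - r_pay) % Q  # blinding factors must also add up

balance = commit(50, r_in)
payment = commit(20, r_pay)
change = commit(30, r_change)

# The verifier sees only the three commitments, never 50, 20, or 30.
print(balanced([balance], [payment, change]))
```

The check passes only when the committed amounts and blinding factors on both sides add up, which is the sense in which the verifier confirms correctness without learning the values.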
That is, the activity ledger can enforce honesty while providing privacy. Further, these properties are provided on-chain, enabling the BASE protocol stack to support confidential transactions without a central authority, which is a win in terms of reconciling privacy and utility. The biggest technical challenge of extending zero-knowledge proofs from a cryptocurrency ledger to the activity ledger is the added complexity in the types of computations that must be supported. In general, activity ledger transactions involve more off-chain computations (supporting the end-to-end product use case), and these need to be specified such that what is provable on-chain propagates the security properties to the entire product use case.