Zero Knowledge Architectures for Mobile Applications

This is a small handout doc following my talk “Zero Knowledge Architecture for mobile applications”, which I gave at several conferences during the autumn of 2017.

Sensitive data narrative

If you want to hear the story about sensitive data we have and how we use it, please read through the slides or watch the video.

We have sensitive data and we can’t avoid sharing it.

We cannot trust data providers by default and we don’t want to think about data security all the time. So we choose to retain control over sensitive data while we store and share it. How? Read below :)

slides
video from MobiConf ’17

Zero Knowledge Architectures

ZKA is a design principle.

In simple terms, everything you do on a Zero Knowledge system is encrypted before it is sent to the server, and the encryption key is never revealed to the vendor.

The first important principle of ZKA is end-to-end encrypted clients that perform crypto computations. The server knows nothing about the nature of the data. By the way, sometimes ZKA is referred to as No-Knowledge architecture.

Second, all operations are on encrypted data. It means that if you’re going to add a new record to a database, you should add it in encrypted form. If you want to share a piece of data, you should share it in encrypted form. You even perform a search inside the encrypted data.
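One common way to search without decrypting anything on the server is a blind index: the client derives a keyed token for each searchable term, and the server only ever matches opaque tokens. The sketch below assumes that technique; the talk does not say which search scheme it has in mind, and `index_key`, `blind_token`, and the in-memory `server_index` are hypothetical names. It handles exact-match lookups only.

```python
import hashlib
import hmac
import secrets

# Hypothetical client-side secret; it never leaves the device.
index_key = secrets.token_bytes(32)

def blind_token(key, term):
    """Keyed, deterministic token for a search term (exact match only)."""
    normalized = term.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()

# What the server stores: opaque tokens mapped to record ids, no plaintext terms.
server_index = {}

def index_record(record_id, terms):
    # Runs on the client; only the resulting tokens are uploaded.
    for term in terms:
        token = blind_token(index_key, term)
        server_index.setdefault(token, set()).add(record_id)

def search(term):
    # The client sends only the token; the server does a plain dictionary lookup.
    return server_index.get(blind_token(index_key, term), set())

index_record("rec-1", ["mortgage", "Berlin"])
index_record("rec-2", ["car loan", "berlin"])
matches = search("Berlin")
```

Because the tokens are deterministic, the server can still see which records share a term, so a scheme like this trades some leakage for searchability.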

These principles don’t add security in the common sense of the term; rather, they guarantee that client-side encryption is used properly. Zero knowledge algorithms and protocols ensure that no keys, passwords, files, or any other sensitive material ever gets transferred in an unencrypted or reversible form. There is no point in time when encryption keys or unencrypted files are visible to the servers or service administrators.

Why mobile?

ZKA relies on cryptography and requires trust in the device that runs the crypto code. Mobile has a fairly trustworthy runtime environment compared to the browser or most desktops. Still, it is not entirely without risks.

Where is ZKA used?

In messaging, we can use end-to-end encryption. Apart from the clients, nobody can read your secret communication. Clients have the means to verify trust in each other and in the server, to ensure that cryptographic protection is working properly at that moment. End-to-end trust without leaking anything to the storage or transmission layer is the basis of ZKA. You already know many examples of E2EE chats.

In authentication, we can use interactive crypto protocols known as Zero Knowledge Proof protocols. ZKP enables two parties to compare a secret without exposing it, effectively avoiding leakage of secrets during transmission.
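The talk does not name a specific protocol, so the following is only an illustration from the same family: one round of Schnorr identification, where a prover shows it knows a secret `x` behind a public value `y` without ever sending `x`. This is a swapped-in, simpler protocol (proof of knowledge rather than secret comparison), and the group parameters are toy values chosen so the arithmetic is easy to follow; they are far too small for real use.

```python
import secrets

# Toy parameters (assumption, insecure): g = 2 has prime order q = 11 modulo p = 23.
P, Q, G = 23, 11, 2

def public_key(x):
    """Public value y = g^x mod p; the secret x stays with the prover."""
    return pow(G, x, P)

def commit():
    """Prover: pick a one-time nonce r and send the commitment t = g^r mod p."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def challenge():
    """Verifier: reply with a random challenge c."""
    return secrets.randbelow(Q)

def respond(x, r, c):
    """Prover: answer s = r + c*x mod q; r masks x, so s alone reveals nothing about x."""
    return (r + c * x) % Q

def verify(y, t, c, s):
    """Verifier: accept if g^s == t * y^c mod p."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P

secret = 7
y = public_key(secret)
r, t = commit()
c = challenge()
s = respond(secret, r, c)
accepted = verify(y, t, c, s)
```

A real deployment would use a vetted library and a standard large group; the point here is only that the verifier checks an equation and never sees the secret.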

What about sharing data?

We know how to collaborate securely when a document is one blob of data, but a modern document is actually a large tree-like structure, so will everyone be able to see everything?

Let’s say you want to share sensitive data with some users. The naive approach is to encrypt the data several times: once for each user, using their keys.

Encrypting shared data for each user is a good but naive approach

So if you are sharing data with five users, you need to encrypt the data five times with different keys and transfer these keys to each user. This approach leads to data duplication: you now need to store five times as much data, and if you need to update the data or change the access policy, you need to decrypt and re-encrypt some or all of the records.

Does this sound like ZKA? Well, yes. Does it sound easy? Not really.

But there are better approaches to data sharing: you can give particular users access to specific blocks of encrypted data. You access, encrypt, or re-encrypt only what you need, and when you need to change the access policy (who has access to what), you have to re-encrypt just one small access block.
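The storage difference between the two approaches can be put into rough numbers. The sketch below assumes that in the block-based approach the data is encrypted once and each user gets a small access block; the 64-byte access block size is an illustrative assumption, and ciphertext overhead is ignored.

```python
def naive_storage(data_bytes, users):
    """One full encrypted copy of the data per user."""
    return data_bytes * users

def access_block_storage(data_bytes, users, access_block_bytes=64):
    """One encrypted copy of the data plus one small access block per user."""
    return data_bytes + users * access_block_bytes

document = 1_000_000  # a 1 MB document shared with five users

naive = naive_storage(document, 5)           # 5_000_000 bytes
shared = access_block_storage(document, 5)   # 1_000_320 bytes
```

Revoking one user in the naive scheme means dropping a full copy and re-encrypting for the rest; in the second scheme the change touches access blocks rather than the data itself.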

As an implementation of such an approach, we have an open source library called Hermes. It has a C core, and it is available for many platforms.

A better way of collaborating on sensitive data

Credit history example

Credit history is a beautiful example of sensitive data that is shared among multiple entities. Credit bureaus store your credit history and send it to your banks at their request. A credit history has lots of sensitive details about your life and your interests, so it’s preferable to minimize its sharing.

Usually, credit history is stored in encrypted form, as a single blob of data, so when it is leaked, you lose everything. Splitting the file into separate encrypted blocks and allowing banks to manipulate only specific blocks minimizes the risks.

Please see the slides or the video to grasp the whole example with credit history and a cat.

How to implement a ZKA kind of collaboration on shared data

Let’s describe the implementation of secure data collaboration.

Things ZKA relies on

Key wrapping

Split a single blob of data into smaller encrypted blocks. Decouple storage keys from user keys: encrypt the data with storage keys, then encrypt (wrap) each storage key with the keys of the users who need access.
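A minimal sketch of this idea follows. Everything in it is an assumption made for illustration: `seal` and `unseal` are a toy encrypt-then-MAC construction built from HMAC-SHA256 so the example runs with the standard library only (a real system would use a vetted AEAD such as AES-GCM), user keys are symmetric, and the block names are made up.

```python
import hashlib
import hmac
import secrets

def _subkey(key, label):
    return hmac.new(key, label, hashlib.sha256).digest()

def _keystream(key, nonce, length):
    blocks = [hmac.new(key, nonce + i.to_bytes(8, "big"), hashlib.sha256).digest()
              for i in range((length + 31) // 32)]
    return b"".join(blocks)[:length]

def seal(key, plaintext):
    """Toy stand-in for an AEAD cipher (assumption, not for production)."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def unseal(key, sealed):
    nonce, ciphertext, tag = sealed[:16], sealed[16:-32], sealed[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))

# User keys live on the users' devices; the server never sees them.
user_keys = {"alice": secrets.token_bytes(32), "bob": secrets.token_bytes(32)}

# One blob split into blocks, each encrypted under its own storage key.
blocks = {"loans": b"mortgage, 2015", "address": b"12 Example Street"}
storage_keys = {name: secrets.token_bytes(32) for name in blocks}
encrypted_blocks = {name: seal(storage_keys[name], data) for name, data in blocks.items()}

# Wrapped keys: a storage key encrypted under a user key. Alice gets both
# blocks, Bob only the address.
wrapped_keys = {
    ("alice", "loans"): seal(user_keys["alice"], storage_keys["loans"]),
    ("alice", "address"): seal(user_keys["alice"], storage_keys["address"]),
    ("bob", "address"): seal(user_keys["bob"], storage_keys["address"]),
}

def read_block(user, name):
    """Unwrap the storage key with the user key, then decrypt the block."""
    storage_key = unseal(user_keys[user], wrapped_keys[(user, name)])
    return unseal(storage_key, encrypted_blocks[name])
```

Granting access adds one small wrapped key; the encrypted block itself is not touched. Revoking access for good also requires rotating the block's storage key, since a user may have kept a copy of the old one.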

Manage privileges

Store storage keys in containers. Cryptographically control which users can change the stored data: a user proves the right to perform an operation by presenting an ownership tag (basically, a MAC) that matches the one currently stored.
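The tag comparison can be sketched as follows. This is a simplified illustration, not the Hermes protocol: `BlockStore`, `update_tag`, and the write key are hypothetical names, the byte strings stand in for real ciphertext, and the sketch assumes the store never hands stored tags back to readers.

```python
import hashlib
import hmac
import secrets

def update_tag(write_key, ciphertext):
    """Ownership tag: a MAC over the block, computable only with the write key."""
    return hmac.new(write_key, ciphertext, hashlib.sha256).digest()

class BlockStore:
    """Hypothetical server side: holds ciphertext and tags, never keys."""

    def __init__(self):
        self._blocks = {}  # block_id -> (ciphertext, tag)

    def create(self, block_id, ciphertext, tag):
        if block_id in self._blocks:
            raise KeyError("block already exists")
        self._blocks[block_id] = (ciphertext, tag)

    def read(self, block_id):
        return self._blocks[block_id][0]

    def update(self, block_id, current_tag, new_ciphertext, new_tag):
        """Replace the block only if the caller presents the current tag."""
        _, stored_tag = self._blocks[block_id]
        if not hmac.compare_digest(stored_tag, current_tag):
            return False
        self._blocks[block_id] = (new_ciphertext, new_tag)
        return True

store = BlockStore()
write_key = secrets.token_bytes(32)   # held only by users with write access
v1, v2 = b"ciphertext v1", b"ciphertext v2"
store.create("loans", v1, update_tag(write_key, v1))

# A writer recomputes the current tag from what it reads, then swaps in v2.
allowed = store.update("loans", update_tag(write_key, store.read("loans")),
                       v2, update_tag(write_key, v2))

# A reader without the write key cannot produce a matching tag.
other_key = secrets.token_bytes(32)
denied = store.update("loans", update_tag(other_key, store.read("loans")),
                      b"forged", update_tag(other_key, b"forged"))
```

The server compares tags without learning any key, which keeps the permission check inside the cryptography rather than in a trusted access-control list.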

Control requests

Since the data is available only through the crypto layer, you should make sure that your architecture translates every operation on your dataset into crypto API commands, and that no one writes plaintext data that bypasses your crypto core.

Mitigate remaining attacks

Crypto is not a magic wand; it just narrows the attack surface. You still need backups, ZKP to protect against replay attacks, trusted code, a secure key lifecycle, and traditional measures such as intrusion detection, monitoring, and traffic inspection.

Other use cases for ZKA

You can use the ZKA approach wherever you want to securely provide access to small blocks of data shared among different clients.

  • complex documents with comments or detailed spreadsheets (e.g. Google Docs, Dropbox Paper, etc.). In many cases, users shouldn’t have access to the whole document.
  • file systems are a perfect example of small blobs of data, structured and shared with different access control rights.
  • document store protection. If every blob in a database is protected, and the access rights are protected too, then you get an end-to-end protected document store, where rights to every document or field can be granted to anybody. Imagine MongoDB with custom queries on secure data for the untrusted web apps and trusted queries from your mobile apps.

How difficult is it to implement ZKA in your products?

Well, it depends. Of course, custom implementation will take a lot of time, but there are plenty of existing solutions:

Are there other implementations? Yes!

Check out LAFS. It is a secure storage system that stores your files encrypted and split into chunks across different servers.

Or check out ZeroKit, which has an integration with CareKit.

Recap

Zero Knowledge Architecture is a design approach that solves the problem of trusting the server and the transmission environment. The problem is fairly easy to solve when interactions are simple, and it gets complicated when we’re talking about databases or collaboration. But it’s not impossible now: I’ve shown you some examples of solving everyday problems using ZKA. I believe that we will collaborate on data more and more every year, and it’s wise to prepare our products and to protect our users.

More links to follow

Looking for something else?

Handy list of my other talks