A brief history of modern encryption
For many people, cryptography is associated with encryption. An encryption algorithm (or a tool, or a method) allows to scramble messages. The output of this algorithm is guaranteed to reveal (essentially) no information about the original message. Only with the knowledge of the secret key, the original message can be recovered. Cryptographers added sophisticated features to these algorithms to protect against very powerful adversaries (e.g., forward secrecy, key-leakage, etc.). These algorithms serve as fundamental building blocks in protecting our digital lives. For example, we use encryption in transit when we perform banking transactions to protect data in motion. We can also use encryption at rest to protect data stored on mobile and laptop devices.
Encryption is a fundamental building block
Building secure system is hard. One must consider many attack vectors, originating from multiple devices, networks and systems. Encryption is a building block that can help protect data at rest and in transit. Why is it so useful? Because it allows to simplify the study of security. Instead of studying all possible attack vectors on insecure communication channels (or storage devices), one may only focus on study of the encryption algorithm used, and protection of the key (I will not go into the challenges associated with either one of these tasks. They are far from trivial.).
In the remainder of the post, indented for general technical audience, I would like to summarize the evolution encryption is currently undergoing. Savvy readers can skip to the concluding remarks.
Traditional encryption is unfortunately “not useful” — we cannot “do things” with it. For example, suppose we encrypt our emails in Gmail using standard AES algorithm. Then, to search the emails we must decrypt them first. This, in turn, means that we need to download the encrypted emails on our personal devices from Gmail. But downloading Gigabytes of emails is infeasible and impractical. Can we do better?
Fortunately, cryptographers thought about this problem. Over the last decade, we worked on algorithms that help us do useful things with encrypted data (a.k.a. encryption in use). Both theoretical and applied cryptographers are actively working on a variety of such algorithms. Let me list a few of them.
Some of the earliest (efficient) schemes that allowed searching over encrypted data were presented by Song, Wagner, Perrig (2000), and Curtmola, Garay, Kamara, and Ostrovsky (2006). These schemes followed by many works from the community that improved security, performance and extended query complexity. We can now perform searches based on keywords or boolean expressions. These schemes are practical for many applications and getting deployed in small start-ups and large tech companies. I refer to a great post by Seny Kamara on searchable encryption.
Homomorphic encryption allows to compute arbitrary programs over encrypted data. For instance, suppose we encrypt a dataset D of financial transactions, store the ciphertext C = Enc(D) on a remote server, and keep the secret key private. Then, the remote server can run a machine learning algorithm P over the ciphertext C to obtain a ciphertext C’ = Enc(P(D)). We can now download C’ from the server, decrypt and learn the output of the machine learning algorithm. Note that we never have to share the secret key with the remote server. See Matthew Green’s high level post on homomorphic encryption.
Unfortunately, homomorphic encryption is only practical for few types of programs as of today. The limitations come from both technical and theoretical barriers of the schemes. Technically, we do not have a candidate scheme that we can call practical. Theoretically, security provided by homomorphic encryption is too strong for some settings: it guarantees that nothing, not even the run-time, can be learned about the data by a remote server. This essentially means that one must execute the program in its worst-case (that is, for any input the server must run the program for as long as the program would run on the worst possible input).
Similarly to homomorphic encryption, functional encryption allows to compute arbitrary programs over encrypted data. However, it also allows to restrict the server to which programs it can compute, and control what the server can learn about the input. This is achieved by issuing a special program specific secret key to the remote server. In its standard definition, a secret key for a program P allows the server to learn P(D) in clear. One might also issue a secret key for a program P’ that allows the server to learn, for example, an encryption of P(D).
Functional encryption addresses two problems of homomorphic encryption: (1) it allows to restrict and control which functions the server can compute, and (2) it allows to selectively let the server learn some information about the data. The later feature may be a key to improving performance of computing over encrypted data, as one may now allow the server to learn the run-time of P(D). (Hence, the server would not be required to run the program P in its worst-case).
I’m not aware of any practical implementations of general-purpose functional encryptions. These schemes, however, could be practical in the future for some types of computations because we can build them from well-studied (and somewhat efficient) cryptographic multi-party computation protocols.
Finally, let me mention some theoretical research developments that allow to “encrypt programs” while preserving their functionalities.
Indistinguishability program obfuscation
This is a new beast in cryptography. On the high level it allows to take a program P and scramble it to obtain a program P’ such that:
- P’ can be executed on any input D to get the output P(D), just as the original program.
- P’ can hide some secrets of P (e.g., secret keys) within its description.
This beast is so powerful that, in theory, it can be used to construct almost any other cryptographic algorithm. However, the beast is not ready is to be released in the wild: we do not yet have good candidate schemes that are practical. Moreover, indistinguishability obfuscation has some theoretical limitations. It does not hide everything about the original program (this is, in fact, impossible). Suppose we were given the best practical indistinguishability obfuscator. If we applied it on some real-world program P, then we would not be able to say a lot about what the obfuscated program P’ hides (or leaks) about the original code (without deep analysis, at least).
- We can do useful things with encrypted data. Some of these things are practical and getting deployed in the real world. Other things (such as computation of arbitrary programs) need more theoretical research.
- Doing things over encrypted data is great: it can improve performance and enable new functionalities. But there is a price for everything. In this case, the price can be measured by information that the remote servers can learn about the data. For example, if we enable searching over encrypted data, then we might leak some statistical properties of keywords within stored documents. In some cases, leakage might be harmful. One needs to be very careful and understand implications when enabling new functionalities over encrypted data. But we should not be scared of this. The goal of encryption is to help us reason about security of systems. Encryption in use serves the same purpose.
- Many encryption in use algorithms can be instantiated from standard encryptions (e.g., AES). No new math. They consist of very clever techniques manipulating data structures and exploit specific models of computations.
Remarks for the experts
- Cryptography is associated with encryption. Encryption is associated with perfect secrecy. As we shift towards encryption algorithms that might enable servers to learn some information about the data, we must be very clear and open about it. It is our moral obligation to explain, and not to confuse. Perhaps we need better names for some of these primitives.
- Encryption in use will be deployed in the real-world. But not without a fight. The ball is in the hands of applied cryptographers to deliver it. I believe theoretical cryptographers should embrace and better appreciate efforts of the applied cryptographers, and vise versa. The two, largely distinct, communities can greatly benefit from collaboration and serve some good to the society.