Hashed Interview: Jung-hee Cheon, SNU/Crypto Lab, “Homomorphic Encryption & Blockchain”

Published in Hashed Team Blog · 13 min read · Mar 22, 2019

“Homomorphic encryption’s role in this is to acquire this data in encrypted form and analyze it without breaching personal privacy. That’s why you could work with medical or financial data without invasion of privacy and obtain useful results, and this is the advantage of the HEAAN library, where approximate calculation becomes useful.”

Hashed CTO Ethan Kim talked with Jung-hee Cheon, a professor in the Department of Mathematical Sciences at Seoul National University and head of the Industrial Mathematics Center, about why homomorphic encryption is essential in the blockchain industry.

Hashed Interview: Prof. Jung-hee Cheon, Homomorphic Encryption & Blockchain

First off, welcome to Hashed, and thank you for meeting with us.

Thank you for the invite!

Before we move on to other questions, could you introduce yourself and talk a little about your area of research?

I am Jung-hee Cheon, a professor in the Department of Mathematical Sciences at Seoul National University and head of the Industrial Mathematics Center. My area of research is number theory and its applications to cryptology.

Thank you. Now, there aren’t many who are focused on cryptology in academia — could you tell us how you got into the field?

It was in 1997, after I had gotten my Ph.D. Before then I’d studied number theory and elliptic curve theory. Elliptic curve theory was getting a lot of attention because it was being used in cryptology, and they needed a specialist because it was unfamiliar to most cryptologists. The recruitment notice caught my attention and I sort of stumbled into it, and I got attached because I liked the idea of my theories having real-life applications, unlike in pure mathematics. So I did research in both mathematics and cryptology, but more so in the latter.

Seoul National University Logo

There are a lot of industries, and a lot of ways in which cryptology could be used. What kinds of potential does cryptology hold in various industries?

In the past, cryptology functioned as a security tool or module. It was integral to security technology. Lately, I think we’re moving into an age where data itself becomes an asset. We talk a lot about the fourth industrial revolution and big data, and a big part of that is data becoming an asset. And cryptology is best for protecting data assets. If cryptology played a side role in the past, lately it’s been playing a more active role. And with big data analysis, homomorphic encryption has been called the big data cryptology, and cryptology has been contributing more to the world.

You got a lot of attention for being the first to break cryptographic multilinear maps. What does this mean to you?

For cryptologists, the ideal would be to create a function or encryption that could generate all of the other encryptions, a cryptological equivalent of stem cells. And cryptological obfuscation does just that. Obfuscation has been used widely in industry as well. For example, program obfuscation is used to protect copyright, because certain parts of a program enforce the license. The Windows program requires a Windows license number to operate, and there’s a certain routine in the program that checks for this number. If anyone could access this routine, it would be easy for them to manipulate the program to bypass it. Then copyrights would be very difficult to protect. Obfuscation is also used for purposes other than copyright protection. But aside from this, it was theoretically interesting: if we could realize program obfuscation, we could also create all the things we want in cryptology, such as symmetric-key algorithms, signatures, verification, even homomorphic encryption. But it was thought to be extremely difficult or even impossible to create. A decade ago, there were attempts to prove that it was actually impossible, and some succeeded. Then in 2013, the first cryptological obfuscation technology with some potential was discovered, and academia got very excited. Within a few months, there were hundreds of research papers using this technology to create various encryptions. I assigned one of my own students, Changmin Lee, to mathematically analyze one of these problems, and after some difficulty, he managed to get a lead, and eventually managed to prove his analysis. This was presented at Eurocrypt and won Best Paper, so I think of it fondly.

Prof. Cheon Interview at Hashed Lounge

That’s great. You mentioned homomorphic encryption before. There are of course a lot of different areas in cryptology, and I heard that your main focus is on homomorphic encryption. This is an unfamiliar term to the layperson. Could you give us a brief explanation?

The concept of homomorphic encryption first came out in 1978, so about 40 years ago. But the first secure homomorphic scheme came out in 2009, exactly ten years ago. We have made a lot of progress since then. If we think about how homomorphic encryption is used nowadays, the most appropriate application would be big data. Traditional encryption is like a locked, rigid safe that holds data: only those with the key can take out and see the data in the safe. Homomorphic encryption doesn’t allow anyone to take out the data, but you can handle the data from the outside because the safe holding it is malleable. Anyone could work on the data from the outside and modify it, and only the modified result is taken out. With the original rigid safe, we have to take all the data out of the safe in order to analyze it. Homomorphic encryption lets you do the analysis inside the safe and access only the result, so it’s gotten a lot of attention as an encryption that solves the problem of information privacy in data analysis, which comes up a lot in big data processing.
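The “work inside the safe” idea can be sketched with a toy example. The code below is a minimal sketch using textbook Paillier encryption, which is only additively homomorphic; it is not the 2009 fully homomorphic scheme and not HEAAN. The function names and the tiny primes are assumptions chosen for illustration, and keys this small offer no security.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key pair. The tiny default primes are insecure, for illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+); valid for generator g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """Lock the message m (0 <= m < n) in the 'safe'."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add_encrypted(n, c1, c2):
    """Add two hidden values without the key: multiply the ciphertexts."""
    return (c1 * c2) % (n * n)

def decrypt(n, priv, c):
    """Open the safe: only the holder of (lam, mu) can do this."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n

n, priv = keygen()
value_a = encrypt(n, 15)
value_b = encrypt(n, 27)
total = add_encrypted(n, value_a, value_b)  # computed on ciphertexts only
result = decrypt(n, priv, total)            # 15 + 27, recovered by the key holder
```

Whoever runs `add_encrypted` sees only ciphertexts; the sum becomes readable only after `decrypt` is called with the private key.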

Crypto Lab, which you founded, has developed HEAAN, a groundbreaking improvement on earlier homomorphic encryption algorithms. Could you explain how HEAAN differs from other homomorphic encryptions, and why it has garnered such international attention?

I talked about the safe example before. You could manipulate the data within the malleable safe with, say, a kind of glove. The original homomorphic encryption gives this glove two functions: adding and multiplying. And theory tells us that if you can add and multiply on bits, you can carry out any calculation that a computer is capable of. This is known as Turing completeness. So we’ve proved that homomorphic encryption can do anything, and so we tried to multiply and add 32-bit data, which works. The problem is that it’s too slow. This is because as you multiply the data, the number gets bigger and bigger. It seems obvious. We looked at how the real world deals with this, because otherwise the number would get two, four, eight times bigger and become unmanageable. We realized that in the real world, we round off. After calculating, we discard the digits that aren’t significant figures and use only the short numbers to calculate. Of course, the computer can do this; it’s just that it’s too slow. So we made a homomorphic encryption that can round off. We had two gloves, one for adding and one for multiplying, and we put in a pair of scissors for rounding off. With these three, you can calculate more efficiently. And there have been a lot of developments in machine learning and artificial intelligence, which typically use approximate calculation. When you use approximate calculation, you need to round off after every calculation, and we do that, so in areas that need approximate calculation, we believe that our homomorphic encryption can do better than any other.
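The effect of the “scissors” can be seen even without encryption. The following is a plaintext analogy only, assuming a hypothetical fixed-point format with 20 fractional bits; it does not implement HEAAN’s rounding on ciphertexts. It squares a number five times, once keeping every bit and once rounding off after each multiplication.

```python
SCALE_BITS = 20  # hypothetical precision: values are stored as integers scaled by 2**20

def encode(x):
    """Represent a real number as a scaled integer (an approximation from the start)."""
    return round(x * (1 << SCALE_BITS))

def decode(v, scale_bits):
    """Convert a scaled integer back to a float."""
    return v / (1 << scale_bits)

def repeated_square(x, times, round_off):
    """Square x `times` times on scaled integers; return (integer, scale_bits)."""
    v = encode(x)
    scale_bits = SCALE_BITS
    for _ in range(times):
        v = v * v            # the scale squares too: 2**s * 2**s = 2**(2s)
        scale_bits *= 2
        if round_off:        # the "scissors": cut off the extra low-order bits
            drop = scale_bits - SCALE_BITS
            v = (v + (1 << (drop - 1))) >> drop  # round to nearest
            scale_bits = SCALE_BITS
    return v, scale_bits

exact_v, exact_bits = repeated_square(1.1, 5, round_off=False)
short_v, short_bits = repeated_square(1.1, 5, round_off=True)

# Without rounding the integer needs hundreds of bits; with rounding it stays short,
# while the decoded value still agrees with 1.1**32 to several significant digits.
exact_length = exact_v.bit_length()
short_length = short_v.bit_length()
approx_value = decode(short_v, short_bits)
```

The unrounded integer doubles in length with every squaring, which is the growth the answer above describes; rounding after each step keeps the working size constant at the cost of a small, bounded error.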

Cryptology Image

The scissors you mentioned are used for rounding off. The layperson who doesn’t know much about computers would think that for computation to work properly, you would need to get every number exactly right. You said that approximate calculation works just as well — in what industries would approximate values be enough for proper operation?

I said before that I majored in number theory. Number theory, in mathematics, belongs to algebra. Analysis, on the other hand, deals with approximate calculations. When you’re studying algebra, you don’t deal with approximations or errors. Even the tiniest difference makes you think that the answer is wrong, and that’s why I used to think that most calculations in the world were exact. But any data in the real world, as opposed to inside a computer’s calculations, is really an approximate value. Numbers in the real world are real numbers, not whole numbers. To enter real numbers into computers, we enter a finite number of digits, so we use an approximate value. So most of the calculations made in the real world are approximate from the start. I overlooked this because I’m rather partial to algebra, but working in this area, I realized that a lot of calculations are approximate. A prominent example might be the calculations used in machine learning or data analysis. These approximate calculations are good enough because the numbers in our world tend to converge. The systems in the world are inclined to converge stably, so even if the original value changes a little, it tends to come back to the original number. A lot of systems are like this. Some aren’t, but in applications those systems are accurate enough.
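The claim that stable systems tolerate approximate values can be illustrated with a very small feedback loop. This is a minimal sketch under assumed numbers (a proportional controller with gain 0.5 and a sensor rounded to two decimals); the function name `settle` and all parameters are hypothetical and not taken from any real control system.

```python
def settle(start, target, steps=50, gain=0.5, sensor_decimals=2):
    """Drive x toward target using only rounded (approximate) sensor readings."""
    x = start
    for _ in range(steps):
        reading = round(x, sensor_decimals)  # approximate measurement of the state
        x += gain * (target - reading)       # correction based on the approximate value
    return x

final_position = settle(start=0.0, target=10.0)
final_from_above = settle(start=25.0, target=10.0)
```

Each step halves the remaining error, so the rounding error in a reading is damped rather than amplified, and the state ends up within about the sensor’s resolution of the target regardless of where it started.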

Which industries that can tolerate approximate calculation are actually moving in that direction?

The first thing that comes to mind is drones. Their control system operates by reading real-life data on computers and issuing the next control command. These systems are the kind of stable systems I was talking about before. If these systems homomorphically encrypt their data, you could operate them while the data stays encrypted. For example, you control the data on your drone to fly it. With this encryption, you could prevent someone else from hacking and stealing your drone. But that’s a simple example. Cyber-physical systems, or physical systems operated by computers, have the inclination to converge that I talked about before, so approximate calculation works well in this case. And there are many other areas that require further security, and data that requires analysis is mostly inclined to converge. In other words, if results changed drastically when we changed the data just a little bit, it would be difficult to analyze this data and it would be hard for the results to be meaningful. As this is not the case with actual data, such as financial or medical data, approximate calculation works just as well. To speak on this a bit further, the big data industry needs two things: a way to acquire data and a way to analyze this data. A lot of people focus solely on data analysis technology, but the most difficult thing in big data is acquiring data. If you can acquire data, you can get a lot out of it with the simplest analyses. Homomorphic encryption’s role in this is to acquire this data in encrypted form and analyze it without breaching personal privacy. That’s why you could work with medical or financial data without invasion of privacy and obtain useful results, and this is where approximate calculation becomes useful.

We can’t not ask you about blockchain. Cryptology is a hyped-up term even in the blockchain industry. What could we expect from applying homomorphic encryption to the blockchain, and where in blockchain could we apply it?

Blockchain started with Bitcoin, where it actually had a pretty simple function, namely transactions. What blockchain has moved on to since then is serving as a data-sharing platform, which I think is an excellent vision. The only problem with that is how we’re going to deal with the issue of privacy. There’s data in the world that doesn’t require much privacy, but the more private and valuable some data is, the more it’s related to people, and there are certain breaches that happen. Blockchain, as you know, is a very transparent platform. It shows you all its data, and you can’t manipulate or delete it, and it will probably stay long into the future. This means that even the slightest details, like where I was at a certain time, could be an issue. This is why ways to protect privacy while sharing data have become such a hot topic in the blockchain industry, and homomorphic encryption has emerged as the most promising solution to protect and analyze data.

Blockchain Image

The issues you mentioned have more than one solution, and homomorphic encryption is one of them. Could you tell us about the characteristics of homomorphic encryption and the benefits or drawbacks it might have as opposed to other methods you’ve mentioned, and how it differs in real life applications?

I gave you four examples of technologies that could be used in analyzing and using data. To get more into this, de-identification allows you to use data without knowing who owns the data, but as you accumulate more and more data, the chance of re-identification gets higher. Blockchain wants to accumulate more data, so this is risky. Differential privacy randomizes data while preserving its statistical characteristics, by giving the data some noise so that individuals can’t be identified. This means you have to add further noise for new data. This is efficient if you want to get just statistical characteristics from certain data, but if you want to add further data or get more sophisticated results, like with machine learning, there are limitations. The closest you get to homomorphic encryption is multi-party computation. Secure multi-party computation is like twenty questions. Two participants exchange the information they have without actually giving the answer. Multi-party computation, like homomorphic encryption, guarantees full security. But this requires a great deal of communication, like twenty questions. And this is a protocol operated by computers, so it’s not just twenty steps, more like two hundred thousand, or two billion questions. This is what holds you back. And with blockchain, you have a lot of users. Because there are a lot more people, multi-party computation is difficult, and we don’t really have an idea about how to go about this problem. Homomorphic encryption, on the other hand, requires a lot of computation, but once data is encrypted and uploaded, anyone can work with the encrypted data, and you decrypt only after you’ve gotten the answer, so we think it’s the best way to share and use data on blockchain, for now.
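The noise-adding idea behind differential privacy mentioned above can be sketched with the standard Laplace mechanism for a counting query. This is a minimal sketch, not a production implementation; the data, the `epsilon` value, and the function names are assumptions made for illustration.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    """Noisy count: a counting query has sensitivity 1, so the noise scale is 1 / epsilon."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(2019)  # fixed seed so the sketch is repeatable
ages = [23, 35, 41, 29, 62, 57, 38, 44, 51, 30]  # made-up records

def over_40(age):
    return age > 40

# One noisy answer hides the exact count, but averaging many answers to the
# same query washes the noise out again.
answers = [private_count(ages, over_40, epsilon=0.5, rng=rng) for _ in range(2000)]
average_answer = sum(answers) / len(answers)
```

The average of repeated answers drifts back to the true count, which is one way to see why further queries or newly added data call for further noise, and why the approach is strained by richer analyses such as machine learning.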

There are of course a lot of obstacles to overcome for homomorphic encryption to be popularized in everyday life. What kind of issues would you need to overcome for homomorphic encryption to be widely used?

Homomorphic encryption has strong security, but it’s also quite inefficient. Calculations are much slower than if you used data without encryption. If we had better computers, the computation would become faster, but more than that we need algorithmic development. When the first computers came out, computations such as sorting and factorization were too slow, partly because the machines were too simple, but more because good algorithms for them were lacking. Historically, algorithms made a lot of progress in the 70s and 80s. Homomorphic encryption is a kind of new computer, and so for it to compute quickly, it needs a new kind of algorithm. Only a few people are doing this kind of research so far, so this algorithmic development is slow at the moment. But there has been a lot of attention on homomorphic encryption not only in blockchain but in other data analysis as well, so I think that this increased demand will lead to more research in this area, and that will help homomorphic encryption speed up. To give you an example, the slowest operation in homomorphic encryption is bootstrapping, which is a kind of reboot. This was first created in 2011, and it took half an hour to reboot one bit. After 6 years of research, this dropped to a few milliseconds, and we haven’t released our paper yet, but our lab got it down to half a millisecond, so it’s become about a million times faster. A million times faster in 8 years means it’s increased eightfold per year, and about 500 times faster in 3 years. This isn’t a hardware development but an algorithmic one, and if we keep going like this, homomorphic encryption will keep getting faster. Even now, as of 2019, homomorphic encryption has become fast enough for practical application.
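The compounding arithmetic behind these speed-up figures can be checked in a few lines. This is a back-of-the-envelope sketch that takes the quoted numbers (half an hour, half a millisecond, 8 years) literally; the variable names are hypothetical, and the quoted figures are themselves rounded.

```python
def per_year_factor(total_speedup, years):
    """Constant yearly speed-up that compounds to total_speedup over the given years."""
    return total_speedup ** (1.0 / years)

first_bootstrap_s = 30 * 60    # "half an hour" per bit, in seconds
latest_bootstrap_s = 0.5e-3    # "half a millisecond", in seconds
literal_speedup = first_bootstrap_s / latest_bootstrap_s  # about 3.6 million

rate_for_million = per_year_factor(1e6, 8)              # about 5.6x per year
rate_for_literal = per_year_factor(literal_speedup, 8)  # about 6.6x per year
eightfold_3_years = 8 ** 3   # 512, matching "about 500 times faster in 3 years"
eightfold_8_years = 8 ** 8   # 16,777,216
```

Read this way, “about a million times” over 8 years corresponds to roughly a sixfold gain per year, while a strict eightfold yearly gain matches the 3-year figure and would compound to well over a million in 8 years. Either reading supports the point that the gain comes from algorithms rather than hardware.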

You said that functionality relies on algorithmic development. Functionality could become an issue depending on what kind of problem you want to solve — for example, you spoke about evaluating credit, and even if this takes some time, if you solve the problem and if you supply enough data, you could say that you’ve solved the original problem. Other than functionality, are there regulation issues or privacy protection issues?

Technically speaking, if you upload encrypted data to the cloud, do your analysis there, then download and decrypt only the result, and the homomorphic encryption is secure, there’s no risk of personal information leaking. But I think it takes time for the law to get in line with technology. It takes time for technology to be fully embraced by people and lawmakers. I do believe that if it’s a sound and safe technology, that will come soon enough.

In closing, do you have a message for blockchain and cryptology researchers or followers of the community?

I think blockchain is a very interesting area. Many different people with different backgrounds are coming together for all kinds of purposes, creating new visions and results. Because it’s such a mix of different expertise, some people have difficulty keeping up and spend a lot of time just trying to follow it. Having leverage in this field doesn’t just mean being on top of existing technology; it also means having something new of your own. I think it would be great if people in the blockchain industry focused not only on learning existing technology but also on creating new technology, different research, visions, and ideas.

[Hashed Community]

Hashed Website: hashed.com

Medium: medium.com/hashed-official

Twitter: twitter.com/hashed_official

Telegram: t.me/hashedchannel
