[2018.3.16] Bluzelle Telegram Live Summary ft. Michael Egorov

Discussion on security design of Bluzelle DB

By Bluzelle. Published in The Blueprint by Bluzelle, Mar 16, 2018.

This week we are honoured to have Michael Egorov in the group.

About Michael Egorov

Michael is an advisor to Bluzelle as well as CTO of NuCypher. He advises Bluzelle on the security design of our protocol.

NuCypher is an encryption layer for Big Data (Hadoop, Kafka, Spark). It uses proxy re-encryption so that there is no single point of security failure in your Big Data infrastructure.

Prior to founding NuCypher in 2016, Michael worked on infrastructure tools at LinkedIn. He holds a PhD in atomic physics from Swinburne University of Technology, Australia, and graduated from the Moscow Institute of Physics and Technology.

Discussion Summary

Q: How does Bluzelle secure data at rest and in transit, and how does your experience at NuCypher help Bluzelle with that?

A: Basically, there are symmetric and public-key encryption. But on their own they don’t give you the ability to search or share encrypted data.

What we’re working on at NuCypher is data sharing. Search is closer to what we worked on when we were called “ZeroDB”. By search over encrypted data I mean searching without decrypting, and by sharing I mean giving another keypair the ability to read the data without anyone else decrypting it.

Due to the decentralized nature of Bluzelle, even a simple approach to search like what MIT did in CryptDB is suitable and secure (when multiple swarms are used). For sharing, the NuCypher network and proxy re-encryption are suitable, especially for large binary objects.
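To make "searching without decrypting" concrete, here is a minimal Python sketch of an equality search over deterministic tags. It is only in the spirit of CryptDB's deterministic layer: a keyed hash (HMAC) is swapped in for deterministic encryption, and the names `search_token` and `UntrustedStore` are hypothetical, not part of Bluzelle, CryptDB or NuCypher. The stored blobs are assumed to be encrypted by the client beforehand and are represented here as opaque bytes.

```python
import hashlib
import hmac
import secrets


def search_token(index_key: bytes, value: str) -> str:
    """Deterministic tag: equal values give equal tags under the same key."""
    return hmac.new(index_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class UntrustedStore:
    """Stands in for a storage node: it only ever sees tags and opaque blobs."""

    def __init__(self):
        self.rows = []  # list of (tag, ciphertext) pairs

    def put(self, tag: str, ciphertext: bytes) -> None:
        self.rows.append((tag, ciphertext))

    def find(self, tag: str) -> list:
        # Equality match on tags; no plaintext and no key is needed here.
        return [blob for stored, blob in self.rows if hmac.compare_digest(stored, tag)]


# The index key never leaves the client.
index_key = secrets.token_bytes(32)
store = UntrustedStore()

# Placeholder blobs: assumed to be ciphertexts produced by the client.
store.put(search_token(index_key, "alice@example.com"), b"<encrypted record 1>")
store.put(search_token(index_key, "bob@example.com"), b"<encrypted record 2>")
store.put(search_token(index_key, "alice@example.com"), b"<encrypted record 3>")

matches = store.find(search_token(index_key, "alice@example.com"))
```

Because equal values always map to equal tags, the store learns which rows share a value even though it cannot read them. That leakage is the price of this simple approach.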

Q: Apart from an advisory role, is there a long-term relationship between Bluzelle and NuCypher and what does that relationship entail?

A: Basically, two things. One is that sharing large encrypted values stored in Bluzelle is especially convenient with NuCypher’s network, and the other is that the NuCypher network may need a good public key-value store. So the relationship is mutually beneficial.

Q: Could you give a summary of what NuCypher does?

A: We use proxy re-encryption to share data while it stays encrypted. Our network decentralizes this, so we don’t trust any particular node to do it but rather several of them. This is exposed as a decentralized [encryption] key management service.
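For readers who have not met proxy re-encryption before, the idea can be sketched with the classic ElGamal-style scheme of Blaze, Bleumer and Strauss. This is an illustration only: it is not the scheme NuCypher uses, the group parameters are toy-sized and insecure, and the function names are hypothetical. Note also that in this particular scheme the re-encryption key is derived from both secret keys, which a production design would avoid.

```python
import secrets
from typing import Tuple

# Toy parameters, far too small for real use: P = 2*Q + 1, both prime.
P = 2039
Q = 1019
G = 4  # generator of the subgroup of order Q


def keygen() -> Tuple[int, int]:
    secret = secrets.randbelow(Q - 1) + 1  # uniform in [1, Q-1]
    return secret, pow(G, secret, P)


def encrypt(public_key: int, message: int) -> Tuple[int, int]:
    # message is an integer in [1, P-1]
    k = secrets.randbelow(Q - 1) + 1
    return (message * pow(G, k, P)) % P, pow(public_key, k, P)


def rekey(secret_from: int, secret_to: int) -> int:
    # Re-encryption key b/a mod Q; reveals neither secret on its own.
    return (secret_to * pow(secret_from, -1, Q)) % Q


def reencrypt(rk: int, ciphertext: Tuple[int, int]) -> Tuple[int, int]:
    # Run by the proxy: turns g^(a*k) into g^(b*k) without seeing the message.
    c1, c2 = ciphertext
    return c1, pow(c2, rk, P)


def decrypt(secret: int, ciphertext: Tuple[int, int]) -> int:
    c1, c2 = ciphertext
    mask = pow(c2, pow(secret, -1, Q), P)  # recovers g^k
    return (c1 * pow(mask, -1, P)) % P


alice_secret, alice_public = keygen()
bob_secret, bob_public = keygen()

message = 1234
for_alice = encrypt(alice_public, message)

# The proxy transforms Alice's ciphertext into one that Bob can open.
for_bob = reencrypt(rekey(alice_secret, bob_secret), for_alice)
```

The proxy handles only ciphertexts and the re-encryption key, so it never sees the message. Spreading that role over several nodes is what removes the need to trust any single one.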

Q: Regarding data protection, are there any risks to user data?

A: Well, of course it’s important to know what the attack vectors are and what an attacker would have to do to game the system. Usually, the possible attacks should be pointed out when the implementation is planned.

Q: Are there any trade-offs between security and speed in Bluzelle?

A: Yes, security comes at a price, especially when you work with multiple users. The overhead of searchable encryption, if done as in CryptDB, should probably be small, but shareable encryption is trickier (it could cost almost a millisecond per object). There are ways to work around that at the expense of how granular the sharing can be.

Q: How well do you think Bluzelle has managed to balance this trade-off between security and speed?

A: Well, right now Bluzelle works with plaintext data, and encryption comes as the next step. I would say signing should be done before encryption (and that’s extremely important!). It should also be carefully designed to be performant (e.g. you don’t necessarily want to sign every single row individually). In fact, signing and shareable encryption are about the same speed.

What does signing mean? Basically, it is producing proof that you were the one who changed the data. For example, every time you send Ether or Bitcoin, you sign a transaction. Without it, it’s more or less “everyone can overwrite”, which limits the use cases very much. Hence, signing is a huge improvement!
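As a rough illustration of signing, and of covering a whole batch of rows with one signature instead of signing each row, here is a sketch using a Lamport one-time signature built only from SHA-256. This is a teaching scheme chosen because it fits in the Python standard library. It is not what Bluzelle, Ethereum or Bitcoin use (the latter two use ECDSA), each key pair here may safely sign only one message, and the row layout is made up for the example.

```python
import hashlib
import json
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes) -> list:
    # The 256 bits of SHA-256(message), most significant bit first.
    return [(byte >> (7 - i)) & 1 for byte in _h(message) for i in range(8)]


def keygen():
    # Two random secrets per digest bit; the public key is their hashes.
    secret = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(s0), _h(s1)) for s0, s1 in secret]
    return secret, public


def sign(secret, message: bytes) -> list:
    # Reveal one secret per bit, selected by the bit's value.
    return [secret[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    if len(signature) != 256:
        return False
    return all(
        _h(part) == public[i][bit]
        for i, (part, bit) in enumerate(zip(signature, _bits(message)))
    )


secret_key, public_key = keygen()

# One signature over a canonical encoding of several rows (hypothetical data).
rows = [{"key": "user:1", "value": "alice"}, {"key": "user:2", "value": "bob"}]
batch = json.dumps(rows, sort_keys=True).encode("utf-8")
signature = sign(secret_key, batch)

tampered = json.dumps(rows + [{"key": "user:3", "value": "eve"}], sort_keys=True).encode("utf-8")
```

Anyone holding `public_key` can check that the batch came from the key owner, and any change to a row invalidates the signature. Signing the batch once rather than each row is one way to keep the per-row cost down.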

Q: So how do you control the index of searchable cryptographic hashes and maintain the security of the data indexed?

A: Making an encrypted index is tricky. The simplest approach is this: https://css.csail.mit.edu/cryptdb/, but it is vulnerable to statistical attacks.

The good news is that if you split the index across many swarms, it becomes less and less vulnerable to those. That’s probably the most performant way. With a single server, searchable encryption is really slow (although it improves every year!).
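The statistical attack mentioned here can be sketched as simple frequency analysis: if equal values always produce equal index tokens, an observer who knows roughly how common each value is can match tokens to values by rank. The second half shows one possible reading of "splitting the index into many swarms", under the assumption (not stated in the discussion) that rows are spread across swarms and each swarm uses its own index key. All names, the data, and the fixed keys are hypothetical and chosen only to keep the example reproducible.

```python
import hashlib
import hmac
from collections import Counter


def token(key: bytes, value: str) -> str:
    """Deterministic index token, as in a simple encrypted index."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def frequency_attack(observed_tokens, known_ranking):
    """Guess token -> plaintext by matching frequency ranks."""
    ranked = [tok for tok, _ in Counter(observed_tokens).most_common()]
    return dict(zip(ranked, known_ranking))


# A skewed column; the attacker is assumed to know the ranking US > DE > SG.
column = ["US"] * 6 + ["DE"] * 3 + ["SG"] * 1

# Case 1: one index under one key, fully visible to a single server.
single_key = b"k" * 32  # fixed key for reproducibility only
single_index = [token(single_key, value) for value in column]
guesses = frequency_attack(single_index, ["US", "DE", "SG"])
recovered = sum(guesses[token(single_key, value)] == value for value in column)

# Case 2 (assumption): rows spread round-robin over swarms with separate keys.
swarm_keys = [bytes([i]) * 32 for i in range(3)]
swarm_views = [[] for _ in swarm_keys]
for row, value in enumerate(column):
    swarm = row % len(swarm_keys)
    swarm_views[swarm].append(token(swarm_keys[swarm], value))

largest_view = max(len(view) for view in swarm_views)
us_tokens = {token(key, "US") for key in swarm_keys}
```

In the single-index case the attacker recovers every row. In the split case each swarm sees only a fraction of the rows, and tokens for the same value differ between swarms, so histograms cannot be merged without collusion. This reduces the leakage rather than eliminating it: a swarm can still run the same analysis on its own smaller sample.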

Q: My concern is how do I, as a Bluzelle token holder, control the access and indexing of my data contained in there?

A: You own your encryption and signing keys, so I think you will be in control of what’s shared with whom, what’s indexed, and so on.

Q: A concern about the General Data Protection Regulation (GDPR) and managing user data on decentralized databases: would the encryption mentioned above suffice to meet these regulations, especially since storage is now decentralized? What are your views on it?

A: I think yes, or at least it would be a very big step towards compliance with GDPR. It covers GDPR and in many cases even data sovereignty laws (although the latter not for all countries). I have info that Microsoft is planning to spend several billion dollars a year on fines because they cannot currently comply with GDPR. So yes, it’s a big problem, and I think using encrypted services is the best way to solve it.

Q: How would the Bluzelle distributed storage deal with unlawful contents? Will they be able to delete them?

A: Well, if the content is end-to-end encrypted, there is no way to even figure out whether it is unlawful, and decentralization suggests there is no way to take it down either. It is possible that you could be required to pre-share data with someone (a regulator?) while keeping the data end-to-end encrypted. But in that case you have to explicitly share it with the regulator in advance. No one can see the data if you don’t allow it in advance. Possibly the same goes for takedowns. At the end of the day, encryption and decentralization are just tools, and whatever app developers build with them is up to them. I can certainly imagine some “compliant apps for the financial industry” which still don’t have backdoors but just have good access controls (which are agreed upon by financial institutions). But certainly, it is no place for covert surveillance and stuff like that.

Q: So suppose there is some centralization, for instance a company that is being investigated for possible unlawful content. Would it be possible for authorities to actually request removal of the data? E.g. they have information from seized laptops that data is being stored in a decentralized database.

A: I think, again, it’s up to app developers to provide tools for that, and the user of the app should agree that some guy over there can remove his content. It is perhaps not appropriate for low-level tools to facilitate that. If there were two decentralized DBs, one which can take down my data and one which cannot, I would certainly use the second.

Q: What made you see the potential in Bluzelle and agree to be an advisor?

A: We started as an end-to-end encrypted database ourselves, so I know it’s an important piece, and the first thing I thought of back then was actually a decentralized database. It’s pretty clear that such a piece of infrastructure is required for the future decentralized internet: we’re in the early days of the new internet. I actually believe that it’s way more than that. Our civilization is moving towards a new, freer society, where freedom of data and freedom of monetary transactions are inevitable.

Q: Do you see any legal issues preventing big companies from adoption of a decentralized database due to the structure of it? Or blockchain in general?

A: Some companies (banks, for example) need to be able to inspect the physical servers where the data is stored. There it would probably run into issues. The same goes for blockchain. That’s why they’re so much into “private blockchains”.

Q: What is the key difference between IPFS and SwarmDB?

A: I think the main difference here is speed. Also, IIRC SwarmDB should be way heavier on the client side. IPFS is not intended to be a DB, though, but rather file storage. I guess it’s like a distributed hash table; it’s just slow-ish, and high speed wasn’t the purpose. Furthermore, it was never intended to support queries.
