A new internet, with no one in control
“The original idea of the web was that it should be a collaborative space where you can communicate through sharing information.” Tim Berners-Lee
Last Wednesday, as happens every now and then, I was having a cup of coffee with a friend of mine and the conversation turned to privacy. “You can’t expect not to be tracked,” Dave said (FYI: Dave is a made-up name).
“I don’t like it, though,” I confessed.
“Neither do I. But you know what? Who cares? I got nothing to hide.”
“I don’t know, Dave, it doesn’t sound quite right. I got nothing to hide either, but would you put cameras in your bedroom?”
“What?”
“My point is,” I told him, sipping my coffee as I tried to organize my thoughts. “Would you like to have cameras in your bedroom streaming to other people?”
“Of course I wouldn’t. That is so messed up.”
“Then I guess you wouldn’t put those kinds of cameras in your living room either, would you?”
“Hell no. But I don’t follow you. Who on earth would?”
“The thing is, if you find it disgusting, how come you are ok with giving up your data so easily?”
“You mean Google, Facebook, and stuff?”
“Yeah, exactly.”
“That’s not the same thing, actually,” replied Dave. “On those platforms I am in control of my data.”
“Are you really in control, Dave?”
“It’s nothing like having cameras in my living room.”
“Oh, they might not have the footage of your living room, but they know everything about you, even more than you know about yourself. Your geographical position at any given moment. Your hobbies. Your circle of friends. Your political beliefs. Your taste in music, food, partners. The books you read, the movies you watch. What you chat about, word by word. They know the minute you go to sleep and when you wake up. Where you live. The identity of all the people in your photos, even of those who didn’t give permission to be profiled. You’re right, Dave, it’s nothing like cameras in your living room. It’s worse. Way worse.”
“But that’s the way the world goes, isn’t it? Look: they are free services. I like it when it’s free.”
“Free as long as you pay with your data.”
“True. I pay with my data,” Dave repeated, as if to emphasize what followed. “After all,” Dave smiled, “I wouldn’t pay real money for them, you know what I mean?”
“What if you didn’t have to?”
“I beg your pardon?”
“I mean: what if there were another way for you to pay with something other than your data — something that respects your privacy.”
“They respect your privacy, they say. They talk the talk but when it comes to walking the walk, nothing, zip, nada. The thing is, the world is a bad place and when it comes to this kind of stuff, there’s no one you can trust.”
“What if I told you there’s a way for you to keep your data safe without you having to trust anyone?”
“Like a hard disk?”
“Not like a hard disk, I mean online. How do you access your stuff when your hard disk is at home? Do you always have it with you?”
“So, like a NAS,” Dave said, not listening to my questions. “Paula told me about it the other day. It’s like a hard disk connected to the internet.”
“A NAS is a thing. Things get broken. Plus, if the power goes out the NAS goes offline and you’re back at square one.”
“So what? You’re talking about software, aren’t you?”
“Indeed.”
“Then I would tell you such a thing does not exist.”
There it was, the bait I had been waiting for, served to me on a silver platter. “It does exist, Dave,” I rejoiced. “And I know it for a fact because I work for the company that’s building it.”
“Oh, if I know you well enough it must be some complicated mess.”
“On the contrary, it is quite simple. Let me explain it to you. Are you familiar with the concept of encryption?”
“See? I told you it’s a complicated mess.”
“Ok, cryptography 101: how do you make sure the message is delivered without anyone overhearing it?”
“I don’t know, but something tells me I’m about to learn it.”
“You remember when we used to share secrets through a cypher in elementary school? Say, A becomes B, B becomes C, so to write ABBA you write BCCB.”
“Yeah, kind of.”
“That’s called a Caesar cypher. The thing is, if you encrypt your message with a Caesar cypher, an eavesdropper can work out the message with little effort by guessing the encryption key. Also, both you and the person you are writing to must know the encryption key beforehand. That’s the whole point: the encryption key has to stay secret, because it is the decryption key as well.”
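(An aside for readers who like to see things run: here is a minimal Python sketch of the schoolyard cypher I was describing to Dave. The function names are my own, made up for illustration. It shows both weaknesses at once: the same key encrypts and decrypts, and an eavesdropper only has 25 shifts to try.)

```python
import string

ALPHABET = string.ascii_uppercase  # the 26 letters A-Z


def caesar_encrypt(message: str, key: int) -> str:
    """Shift every letter forward by `key` positions, wrapping around after Z."""
    shifted = []
    for ch in message.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            shifted.append(ch)  # leave spaces and punctuation untouched
    return "".join(shifted)


def caesar_decrypt(ciphertext: str, key: int) -> str:
    """Decrypting is just shifting back by the same secret key."""
    return caesar_encrypt(ciphertext, -key)


def brute_force(ciphertext: str) -> list[str]:
    """An eavesdropper simply tries all 25 non-trivial shifts."""
    return [caesar_decrypt(ciphertext, key) for key in range(1, 26)]


secret = caesar_encrypt("ABBA", 1)
print(secret)                         # BCCB
print("ABBA" in brute_force(secret))  # True: the plaintext is among 25 guesses
```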
“You don’t make it sound great.”
“It is not! Today we use something called Advanced Encryption Standard.”
“That is?”
“It’s sort of a super complicated Caesar cypher, a crypto sudoku you might say. One so hard to figure out that it’s uncrackable for today’s computers.”
“Cool,” said Dave while drinking his coffee, head in the clouds. “But I still don’t get what it has to do with this company you work for. By the way, what is it about?”
“Cloud storage.”
“You mean… like Dropbox?”
“Exactly like Dropbox. But here’s the catch: most other cloud providers don’t encrypt your data at all. Some do but keep the keys on their servers.”
“Why?”
“It’s easier that way. Not to mention that by doing so they can mine the data of millions of people to sell aggregated information to ad companies. And data is pretty much another word for money nowadays.”
“So how do you do it?”
“Do what?”
“I mean, how do you solve this problem? I get to hold the key while your data center stores my files without knowing what is inside them, correct?”
“Pretty much. With a significant difference: we don’t have a data center.”
“Then where are my data?”
“Oh, on other people’s servers. The point is these servers are not in some remote server farm, but in the users’ hands.”
“So, strangers will be able to see my private photos. What are you, nuts?”
“Easy, easy,” I smiled, “no one can see your files. Let me explain,” I said, grabbing a tissue and a pencil from my pocket. “So, these servers I’m talking about, they’re called Cubbit Cells. They are the nodes of the network. On top of this network runs an AI.”
“Like I, Robot?”
“Nothing like I, Robot.”
“Oh.”
“It’s just a bunch of machine learning algorithms that, much like an orchestra conductor, optimize the network. Anyway, when you upload a file to your Cubbit Cell, it gets encrypted and split into chunks. Then, each chunk is sent to a different Cubbit Cell via end-to-end encrypted channels.”
“Wait wait wait… where do you store the encryption key?”
“On the AI coordinator, but it is encrypted with a master key, which is derived from your password.”
“So, if I lose my device I am not screwed. Right?”
“Right.”
“And what about the master key? Where is it stored?”
“Nowhere. You see, the master key is generated on the fly by your computer, through a non-reversible function, each time you enter your password. In other words, it’s ephemeral.”
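(Another aside: I did not tell Dave which non-reversible function is involved, and I won't pretend to here. As a rough sketch of the idea only, the snippet below assumes PBKDF2 with SHA-256 from Python's standard library; the salt, the iteration count and the name derive_master_key are all assumptions of mine. What it illustrates is that the same password always yields the same key, so the key never needs to be stored.)

```python
import hashlib


def derive_master_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256.

    Assumption: PBKDF2 stands in for whatever one-way function is really used.
    The key is recomputed on every login and never written anywhere.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = b"example-per-user-salt"  # hypothetical value, for illustration only

key_a = derive_master_key("correct horse battery staple", salt)
key_b = derive_master_key("correct horse battery staple", salt)
key_c = derive_master_key("wrong password", salt)

print(len(key_a))      # 32 bytes, i.e. 256 bits
print(key_a == key_b)  # True: same password, same key, nothing stored
print(key_a == key_c)  # False: a different password gives a different key
```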
“And what if I shared my password with someone else? Would that put me in trouble?”
“A lot of trouble! The whole point here is not to share your password with anyone else. One of the peculiarities of Cubbit is that we don’t know anyone’s password: as long as you don’t share your password, you’re watertight! This is what we call zero-knowledge cryptography: we don’t know, we don’t forget.”
“Forget?”
“We don’t forget anyone’s data.”
“Oh, nice. So, no one else can decrypt my data, right? What about those big supercomputers we heard about on TV? Can’t they crack my key?”
“It would take longer than the age of the universe.” His face was the portrait of wonder. “You don’t have to trust me on this, Dave: it’s math. The AES encryption we use is the same standard the US government uses to protect top secret data.”
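(For the skeptics, the arithmetic behind that claim can be sketched in a few lines. The numbers are assumptions, not measurements: I assume a 256-bit key and a hypothetical attacker testing a billion billion keys per second, far beyond any machine I know of. On average, an attacker finds the key after trying half of the possibilities.)

```python
KEY_BITS = 256                       # assumption: a 256-bit AES key
GUESSES_PER_SECOND = 10**18          # assumption: a wildly generous attacker
SECONDS_PER_YEAR = 365 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9       # roughly 13.8 billion years


def years_to_try_half_the_keys(key_bits: int, guesses_per_second: int) -> float:
    """Expected brute-force time: half of the 2**key_bits possible keys."""
    keys_to_try = 2 ** (key_bits - 1)
    return keys_to_try / guesses_per_second / SECONDS_PER_YEAR


years = years_to_try_half_the_keys(KEY_BITS, GUESSES_PER_SECOND)
print(f"{years:.1e} years")                   # on the order of 10**51 years
print(years / AGE_OF_UNIVERSE_YEARS > 1e40)   # True
```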
“Go on.”
“So, when you remote-access your file on Cubbit, what you are actually doing is asking the AI coordinator where your chunks are.”
“Ah-ah! I got you, that’s the catch! The AI coordinator knows it all!”
“Well, actually no,” I reassured him. “The AI coordinator only knows where each chunk is located in the network, but doesn’t know what that chunk is all about. Remember: only you have the encryption key and no one else.”
“Right,” Dave nodded. “It seems you guys really thought it out.”
“Then,” I replied, “as soon as you get to know where your chunks are hosted, you connect to the corresponding Cells through peer-to-peer end-to-end encrypted channels. That’s it. You download the chunks, reassemble the file, and decrypt it with the key that is retrievable to you and you alone.”
“So, if I am not mistaken, as soon as, what did you call it? Cubbit Cell? As soon as one Cubbit Cell goes offline, my files go offline as well.”
“Yeah, that would be right if the system worked as I told you,” I blushed. “The thing is, I forgot to tell you about the redundancy.”
“I knew it! Something was out of whack. So, what’s this redundancy thing?”
“Basically, Cubbit implements a redundancy procedure based on Reed-Solomon error-correcting codes.”
“It’s a made-up name, right?”
“Not at all. Listen: as I told you before, each encrypted file is split into chunks. Then additional redundancy shards are computed from those chunks. For example, let’s say we use 24 chunks and 12 additional shards. Out of the resulting 36 shards, only 24 are needed to retrieve the original encrypted file.”
“What do you mean by 24?” Dave frowned. “Do I need the original 24 chunks?”
At this point I was as thrilled as a novelist finally coming to the twist of his book, when you discover that nothing is as it seems. “Here comes the magic, Dave, hear me out: any 24 of the 36 shards will do the job.”
“How is that even possible?”
“It’s based on Galois fields, but don’t you worry, I’ll use a metaphor,” I said, grabbing the pencil. “Imagine your file as a straight line. How many points do you need to describe its inclination?”
“Two.”
“Yet, the line is made of an infinite number of points. Weird, huh? The concept is pretty much the same. You just need a certain number of shards to describe a file and reassemble it. In other words, no single Cell is necessary to recompose the file. In the previous example, for instance, you just need 24 random Cells out of the set of 36 to be online.”
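(The straight-line metaphor can be turned into a toy program. Two points fix a line; in the same way, 24 points fix a polynomial of degree 23, and every extra point on that curve is a spare shard. The sketch below is a simplification under my own assumptions, not Cubbit's code: it works modulo the prime 257 rather than in the binary Galois field real Reed-Solomon implementations typically use, each shard carries a single symbol, and the function names are invented.)

```python
import random

P = 257  # smallest prime above 255, so every byte value is a field element


def interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) that passes
    through the given (xi, yi) pairs (Lagrange interpolation modulo P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, n: int):
    """Turn k = len(data) symbols into n shards (x, y).
    The first k shards are the data itself; the rest are redundancy."""
    k = len(data)
    points = list(enumerate(data))
    return points + [(x, interpolate_at(points, x)) for x in range(k, n)]


def decode(shards, k: int) -> bytes:
    """Rebuild the k original symbols from any k distinct shards."""
    points = shards[:k]
    return bytes(interpolate_at(points, x) for x in range(k))


K, N = 24, 36                            # the proportion from the conversation
message = b"24 symbols of my secret!"    # exactly 24 bytes
assert len(message) == K

shards = encode(message, N)
survivors = random.sample(shards, K)     # pretend only 24 random Cells are online
print(decode(survivors, K) == message)   # True, whichever 24 survive
```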
“Ok, but, frankly, to know I depend on random people is not reassuring. What is the probability of downtime?”
“One in a million, even being pessimistic. Moreover, the AI monitors the uptime status of each Cell and triggers a recovery procedure when the total number of online shards drops to a certain threshold.”
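(How a figure like that might be estimated: if we assume, and it is only an assumption, that each Cell is online independently with the same probability, the chance that fewer than 24 of 36 are reachable is a plain binomial sum. The uptime values below are illustrative guesses, and the sketch ignores the recovery procedure entirely, so it is not meant to reproduce the number I gave Dave.)

```python
from math import comb


def probability_unavailable(n: int, k: int, uptime: float) -> float:
    """Probability that fewer than k of n Cells are online, assuming each Cell
    is online independently with probability `uptime`."""
    return sum(
        comb(n, i) * uptime**i * (1 - uptime) ** (n - i)
        for i in range(k)
    )


# Illustrative per-Cell uptimes; none of these values comes from real data.
for uptime in (0.80, 0.90, 0.95):
    risk = probability_unavailable(36, 24, uptime)
    print(f"uptime {uptime:.0%}: file unreachable with probability {risk:.2e}")
```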
“All in all, it looks cool, but I still don’t grasp why a NAS is not enough.”
“It is enough. For you. For a limited time window. But what this company is trying to build is way greater. At the end of the day, what this is really about is a new internet.”
Dave’s forehead puckered. I could almost see the questions buzzing through his head.
“At this stage, Cubbit is a cloud storage platform,” I explained, “but this is just now, don’t focus on the now. When the network is big enough you will be able to run applications on top of Cubbit. Unstoppable applications, as the AI coordinator will be distributed as well. And you won’t need a Cubbit Cell: Cubbit will run on your computer, mobile phone, whatever. In other words…”
“Yeah,” I said, finishing my coffee. “A new internet, with no one in control.”