How the Storm4 smart contract works

Storm4 is a new crypto cloud storage app. It makes it easy to send & receive files securely.

And when I say securely, I mean real security. As in, when Alice sends a file to Bob, there’s nobody else in the entire world who can read the file except Bob. Not even the engineers who maintain the backend servers. Not even the development team who writes the code. Nobody. Nada.

It achieves this using public key cryptography. Every Storm4 user has a public key & private key. The public key is, well, public. And the private key is known only by the user. If data is encrypted using the public key, then ONLY the private key can be used to decrypt it.

That’s all well and good. And it’s the same technology you rely on to send credit card information over the Internet. The big question is: How does Alice get Bob’s public key in the first place?

Storm4 makes it easy to search for users within the system, and makes it easy to download their public key. But WAIT! There’s a well-known problem in computer science & cryptography called the man-in-the-middle attack.

Here’s how the evil man-in-the-middle (MITM) might try to eavesdrop on that document from Alice to Bob:

  • The MITM hacks our server (or beats one of our engineers with a wrench…)
  • He then replaces Bob’s public key with his own public key
  • Alice downloads the fake public key thinking it’s Bob’s
  • She uses this to perform the encryption & then uploads the encrypted file to the cloud
  • The MITM can now decrypt and read the document
  • The MITM gets bonus points if he re-encrypts the document with the real public key before Bob accesses it. Because then Alice and Bob don’t realize they’ve been tricked!

And this is why Storm4 uses a smart contract — to thwart the man-in-the-middle attack.

Ethereum Blockchain

Ethereum is a decentralized platform that runs “smart contracts”. And the term “smart contract” is just a fancy way of saying “small computer program”. However, there is a reason they use the term “contract” (beyond brilliant marketing). And that’s because an application deployed to the Ethereum network can never be modified in any way.

That is to say, you cannot ever change the code of a deployed app. You can call functions in the app that might change the data that’s stored within the app. But the code itself is immutable.

And it’s this immutability which makes it the perfect tool for thwarting that evil man-in-the-middle attack.

Problem -> Solution

Alice downloads Bob’s public key from the Storm4 servers. But she needs a way to verify the authenticity of the key. This should be done using an independent trusted 3rd party.

With Storm4, the trusted 3rd party is the Ethereum blockchain. Here’s the 10,000-foot overview:

  • There’s a smart contract deployed to the Ethereum blockchain
  • This contract allows a user’s public key information to be set once (and only once)
  • Bob’s public key information has been set within the smart contract, and is verifiable by anyone
  • Alice can query the smart contract to get all the information she needs to verify the authenticity of Bob’s public key
  • The Storm4 app does this automatically for every user Alice interacts with

Technical Details

(The rest of the article is primarily for technical readers.)

The contract code is short (a couple dozen lines), and fairly easy to read. You can read it here.

There were 2 requirements for it:

  1. Ensure the code allows a user’s public key information to be set once, and only once.
  2. Ensure the public key information stored on the blockchain can somehow be used to cryptographically verify the authenticity of a user’s public key.

The first requirement was a no-brainer. In pseudo-code:

if (users[userID] == null) {
    users[userID] = publicKeyInfoGoesHere
}

The second requirement was a little trickier. Not because of the crypto, but because of the unique restrictions of smart contracts.

Our first thought was to do this:

if (users[userID] == null) {
    users[userID] = <The entire friggin public key>
}

It turns out this is a pretty dumb idea. Public keys in our system are 105 bytes, and each EVM (Ethereum Virtual Machine) storage slot holds 32 bytes, so storing one key takes 4 SSTORE ops. At 20,000 gas per SSTORE op, that’s so expensive on a per-user basis that it’s just wasteful.
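Here’s the arithmetic behind that claim, as a quick sketch. The 32-byte slot size is how EVM storage works and the 20,000 gas figure is the one quoted above; the function name is just for illustration, and any bookkeeping overhead is ignored.

```javascript
// Rough cost of writing raw bytes into contract storage.
// Assumes one SSTORE per 32-byte slot, at the 20,000 gas quoted above.
const SLOT_BYTES = 32;
const GAS_PER_SSTORE = 20000;

function storageGas(byteLength) {
  const slots = Math.ceil(byteLength / SLOT_BYTES);
  return { slots, gas: slots * GAS_PER_SSTORE };
}

console.log(storageGas(105)); // full public key: 4 slots, 80000 gas
console.log(storageGas(32));  // a single 32-byte hash: 1 slot, 20000 gas
```

A 32-byte hash fits in a single slot, which is why hashing the key first looks so much more attractive.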

So our second thought was to do this:

if (users[userID] == null) {
    users[userID] = <a hash of the public key>
}

This was much better. But, from a usability perspective, we wanted something more. We also wanted to store the blockNumber of when Bob’s public key was set. This would allow Alice to get a timestamp of when Bob’s public key was made auditable on the blockchain.

Fast-forward a few versions while we struggled to minimize smart contract costs, and we ended up with the following architecture:

  • We batch new users into a pool for publication onto the blockchain.
  • We take all the users & their public keys, and create a merkle tree with all the data. (details below)
  • We publish the merkle tree root to the blockchain for all the users in the pool.

Let’s break this down by walking through the process of verifying a real user’s public key.

Step 1 — Query the smart contract

Using Etherscan you can interact with the smart contract from their webpage! Just go here, and you’ll see a list of functions such as getBlockNumber, getUserInfo, etc.

Find the function named getMerkleTreeRoot. You'll see that it takes a single parameter named userID which is of type bytes20.

All userIDs in Storm4 are 160 bits (randomly generated). They are always displayed in zBase32, which encodes 5 bits per character, and are thus rendered as 32 characters. For our example, we’ll be verifying the public key for userID: dpb6rdqdmiw5q9fawycrokrwrqfiq5kp

As you probably noticed, 160 bits == 20 bytes. But Ethereum wants all values in hexadecimal. So in order to invoke the getMerkleTreeRoot function, we need to convert from zBase32 to hexadecimal. Here's one way to do so using Node.js:

$ npm install zbase32
$ node
> const zbase32 = require('zbase32')
> Buffer.from(zbase32.decode('dpb6rdqdmiw5q9fawycrokrwrqfiq5kp')).toString('hex')
'1b43e20dc35d69b77cb8a018482894238b576d4d'
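If you’d rather not install a package, the conversion is small enough to sketch by hand. This is a minimal decoder using the standard zBase32 alphabet; the function name is made up for this example, and it does no validation beyond rejecting unknown characters.

```javascript
// Minimal zBase32 -> hex decoder (no npm dependency).
const ZBASE32_ALPHABET = 'ybndrfg8ejkmcpqxot1uwisza345h769';

function zbase32ToHex(text) {
  let bits = '';
  for (const ch of text) {
    const value = ZBASE32_ALPHABET.indexOf(ch);
    if (value < 0) throw new Error(`Invalid zBase32 character: ${ch}`);
    bits += value.toString(2).padStart(5, '0'); // each character carries 5 bits
  }
  // Keep whole bytes only; any leftover bits are padding.
  const byteCount = Math.floor(bits.length / 8);
  let hex = '';
  for (let i = 0; i < byteCount; i++) {
    const byte = parseInt(bits.slice(i * 8, i * 8 + 8), 2);
    hex += byte.toString(16).padStart(2, '0');
  }
  return hex;
}

console.log(zbase32ToHex('dpb6rdqdmiw5q9fawycrokrwrqfiq5kp'));
// 1b43e20dc35d69b77cb8a018482894238b576d4d
```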

Now copy-n-paste the value 1b43e20dc35d69b77cb8a018482894238b576d4d into the userID parameter field for the function getMerkleTreeRoot, and click the 'Query' button. You should see the following output:

[ getMerkleTreeRoot method Response ]
bytes32 : 0xcd59b7bda6dc1dd82cb173d0cdfa408db30e9a747d4366eb5b60597899eb69c1

This value (cd59b7…9c1) represents the merkle tree root value.

You can also get this value programmatically using HTTPS:

curl -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_call","id":1,"params":[{"to":"0x997715D0eb47A50D7521ed0D2D023624a4333F9A","data":"0xee94c7971b43e20dc35d69b77cb8a018482894238b576d4d000000000000000000000000"},"latest"]}' https://mainnet.infura.io/94cbbe9f44574c19af2335390473a778

The ‘data’ section structure is:

  • first 4 bytes (8 hex chars) : function selector (identifies which function to call)
  • next 20 bytes (40 hex chars) : userID (in hex, not zBase32)
  • next 12 bytes (24 hex chars) : zero padding (because Ethereum expects 32 bytes per parameter)

So you can swap in any userID just by changing that section of the data.
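Swapping in a userID by hand gets old quickly, so here’s a sketch that builds the request body for you. The contract address and the 4-byte function identifier are copied straight from the curl example above; the helper names are made up for this example, and sending the request (to whichever Ethereum node you use) is left out.

```javascript
// Build the eth_call request body for getMerkleTreeRoot(userID).
// Address and 4-byte function identifier are copied from the curl example above.
const CONTRACT_ADDRESS = '0x997715D0eb47A50D7521ed0D2D023624a4333F9A';
const GET_MERKLE_TREE_ROOT_ID = 'ee94c797';

function buildCallData(userIdHex) {
  if (!/^[0-9a-fA-F]{40}$/.test(userIdHex)) {
    throw new Error('userID must be 20 bytes (40 hex chars)');
  }
  // The 20-byte userID is followed by zeros to fill a 32-byte parameter.
  return '0x' + GET_MERKLE_TREE_ROOT_ID + userIdHex.toLowerCase().padEnd(64, '0');
}

function buildRequest(userIdHex) {
  return {
    jsonrpc: '2.0',
    method: 'eth_call',
    id: 1,
    params: [{ to: CONTRACT_ADDRESS, data: buildCallData(userIdHex) }, 'latest'],
  };
}

console.log(JSON.stringify(buildRequest('1b43e20dc35d69b77cb8a018482894238b576d4d')));
```

The printed JSON is the same payload the curl command passes with -d.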

Step 2 — The merkle tree file

Once you have the merkle tree root value from the blockchain, you can fetch the full merkle tree file from our servers using the format:

https://blockchain.storm4.cloud/<merkleTreeRootValue>.json

Here’s the download: cd59b7bda6dc1dd82cb173d0cdfa408db30e9a747d4366eb5b60597899eb69c1.json

This JSON file is an object with 3 top-level keys:

  • “merkle” : The merkle tree (in a flattened form, details below).
  • “values” : The raw values used to make the merkle tree.
  • “lookup” : Maps from userID to associated public key info within the values array.

To verify our user we start by verifying the public key information:

{
    "merkle": {...},
    "values": [
        "{\"userID\":\"dpb6rdqdmiw5q9fawycrokrwrqfiq5kp\",\"pubKey\":\"BBOWJpL+t9ya8AVIV6mpymv8pXSvy2JC9aWutYPPrDoo7+YtF+LpKyYCAQb13DsfeGQ6aVodlAiZ4XZPHlSoFiuzjcBcT23sNEh4vsTfjLu2Si1qGnsY+2qhlJH5ffakm380tvKKBsgA\",\"keyID\":\"loKQlyqSK8rQq7RYhvuh1Q==\"}",
        ...
    ],
    "lookup": {
        "dpb6rdqdmiw5q9fawycrokrwrqfiq5kp": 0,
        ...
    }
}

So the user’s public key info can be looked up via:

index = json.lookup["dpb6rdqdmiw5q9fawycrokrwrqfiq5kp"]
value = json.values[index]

Programmers will note that this is a string, whose value is a serialized JSON object (shown here with its quotes still escaped, as they appear in the file):

{\"userID\":\"dpb6rdqdmiw5q9fawycrokrwrqfiq5kp\",\"pubKey\":\"BBOWJpL+t9ya8AVIV6mpymv8pXSvy2JC9aWutYPPrDoo7+YtF+LpKyYCAQb13DsfeGQ6aVodlAiZ4XZPHlSoFiuzjcBcT23sNEh4vsTfjLu2Si1qGnsY+2qhlJH5ffakm380tvKKBsgA\",\"keyID\":\"loKQlyqSK8rQq7RYhvuh1Q==\"}

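This contains the full public key of the user, so verification is straightforward: compare the pubKey field against the public key you downloaded from the Storm4 servers. If everything checks out, our next step is to verify the merkle tree itself.

That is, we’re going to verify that the MITM didn’t hack both Bob’s public key & the merkle tree file. This is easy because merkle tree files are self-verifying: every node is a hash of the nodes below it, so changing any value changes the root.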
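The lookup and comparison can be sketched like this. It assumes you’ve already parsed the merkle tree file into an object, and that the key you downloaded from the Storm4 servers is in the same encoding as the pubKey field; the function names and the demo values are made up for this example.

```javascript
// Look up a user's public key info inside a parsed merkle tree file.
function getPublicKeyInfo(treeFile, userID) {
  if (!Object.prototype.hasOwnProperty.call(treeFile.lookup, userID)) {
    return null; // user is not in this file
  }
  const rawValue = treeFile.values[treeFile.lookup[userID]]; // the string that gets hashed later
  const info = JSON.parse(rawValue); // { userID, pubKey, keyID }
  if (info.userID !== userID) return null; // lookup table points at the wrong entry
  return { rawValue, info };
}

// ASSUMPTION: downloadedPubKey uses the same encoding as the "pubKey" field.
function matchesDownloadedKey(treeFile, userID, downloadedPubKey) {
  const entry = getPublicKeyInfo(treeFile, userID);
  return entry !== null && entry.info.pubKey === downloadedPubKey;
}

// Tiny stand-in file with placeholder values (not real Storm4 data).
const demoFile = {
  values: [JSON.stringify({ userID: 'alice-id', pubKey: 'QUJD', keyID: 'k1' })],
  lookup: { 'alice-id': 0 },
};
console.log(matchesDownloadedKey(demoFile, 'alice-id', 'QUJD')); // true
console.log(matchesDownloadedKey(demoFile, 'alice-id', 'WFla')); // false
```

Hold on to rawValue: it’s the exact string that gets hashed when checking the tree.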

Step 3 — Merkle tree file verification

A merkle tree is created as follows:

  • first hash all the input values
  • use the hashes as the leaves of the tree
  • recursively hash the leaves together in groups of 2
  • continue until there is only 1 value left => the “root”

In this example JSON file there are 3 values. So to get the 3 leaf nodes of the tree, we need to hash the 3 values. We’ll call these leaves A, B & C.

The next step in the merkle tree is:

  • Hash(A, B) = D
  • Hash(C, C) = E (C has no sibling, so it gets paired with itself)

And finally:

  • Hash(D, E) = Root

         (Root)
         /    \
      (D)      (E)
     /   \      |
   (A)   (B)   (C)
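The construction above can be sketched in a few lines of Node.js. One big caveat: this article doesn’t spell out exactly how two hashes are combined inside Hash(X, Y). The sketch assumes the raw bytes of the two hashes are concatenated and run through SHA-256, so treat it as an illustration of the shape of the algorithm rather than a drop-in verifier for the real file.

```javascript
const crypto = require('crypto');

// SHA-256, hex encoded. Strings are hashed as UTF-8 bytes.
const sha256Hex = (data) => crypto.createHash('sha256').update(data).digest('hex');

// ASSUMPTION: Hash(left, right) = SHA-256 over the raw bytes of left followed by right.
// The real merkle file may combine children differently.
function hashPair(leftHex, rightHex) {
  return sha256Hex(Buffer.concat([Buffer.from(leftHex, 'hex'), Buffer.from(rightHex, 'hex')]));
}

function merkleRoot(values) {
  if (values.length === 0) throw new Error('need at least one value');
  let level = values.map((v) => sha256Hex(Buffer.from(v, 'utf8'))); // the leaves
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = i + 1 < level.length ? level[i + 1] : left; // odd one out pairs with itself
      next.push(hashPair(left, right));
    }
    level = next;
  }
  return level[0];
}

// Three placeholder values give the same A, B, C -> D, E -> Root shape as the diagram.
console.log(merkleRoot(['value A', 'value B', 'value C']));
```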

Let’s walk through this one step at a time.

We start by hashing (SHA-256) all the values. That is, we want to hash each string in the values array. Here's an example of how to hash the first value on the command line:

$ echo -n "{\"userID\":\"dpb6rdqdmiw5q9fawycrokrwrqfiq5kp\",\"pubKey\":\"BBOWJpL+t9ya8AVIV6mpymv8pXSvy2JC9aWutYPPrDoo7+YtF+LpKyYCAQb13DsfeGQ6aVodlAiZ4XZPHlSoFiuzjcBcT23sNEh4vsTfjLu2Si1qGnsY+2qhlJH5ffakm380tvKKBsgA\",\"keyID\":\"loKQlyqSK8rQq7RYhvuh1Q==\"}" | openssl dgst -sha256
4a6ceaf3f814800451dd3b907bc1a0a27503552615be3ed5b5f040df7f4e0c98

So this value (4a6ceaf…0c98) is one of the leaf nodes in the merkle tree. Specifically, it’s the value we were calling A.

You can follow this process in the JSON. It’s pretty straightforward. We can start by finding the leaf we calculated above:

"merkle": {
    ...
    "4a6ceaf3f814800451dd3b907bc1a0a27503552615be3ed5b5f040df7f4e0c98": {
        "type": "leaf",
        "level": 0,
        "left": "data",
        "right": "data",
        "parent": "174d1a20dd791e36cba6e4c5ce3933e4bfeeb894c0d77673c2dab6405332b468"
    },
    ...
}
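To do the same check for every value at once, here’s a small sketch. It hashes each string in the values array exactly as the openssl command does (UTF-8 bytes, no trailing newline) and looks for a matching "leaf" entry in the merkle object. The function names are made up for this example, and the demo object is a tiny stand-in rather than real Storm4 data.

```javascript
const { createHash } = require('crypto');

// SHA-256 of the value's UTF-8 bytes, hex encoded.
function leafHash(value) {
  return createHash('sha256').update(value, 'utf8').digest('hex');
}

// Returns the values whose hash is NOT present as a leaf in treeFile.merkle.
function findMissingLeaves(treeFile) {
  return treeFile.values.filter((value) => {
    const node = treeFile.merkle[leafHash(value)];
    return !node || node.type !== 'leaf';
  });
}

// Stand-in file: only "abc" has a leaf entry.
const demoTree = {
  values: ['abc', 'xyz'],
  merkle: { [leafHash('abc')]: { type: 'leaf', level: 0 } },
};
console.log(findMissingLeaves(demoTree)); // [ 'xyz' ]
```

An empty result means every value in the file is accounted for as a leaf.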

And Hash(A, B) = D is here:

"174d1a20dd791e36cba6e4c5ce3933e4bfeeb894c0d77673c2dab6405332b468": {
    "type": "node",
    "level": 1,
    "left": "4a6ceaf3f814800451dd3b907bc1a0a27503552615be3ed5b5f040df7f4e0c98",
    "right": "724b0761a18362ead9b48ae1da67a5f6e1580db546871f6e6527d5294adb1d91",
    "parent": "cd59b7bda6dc1dd82cb173d0cdfa408db30e9a747d4366eb5b60597899eb69c1"
}

Eventually we get to the merkle tree root value: cd59b7bda6dc1dd82cb173d0cdfa408db30e9a747d4366eb5b60597899eb69c1

Does this match the blockchain value? If so, you’ve just verified the user’s public key. If not, there’s an evil man-in-the-middle attack happening. Abort!!!
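The climb from a leaf up to the root can be sketched as a walk along the "parent" fields. Two assumptions to flag: the article doesn’t show what the root node’s entry looks like, so this sketch treats a node with no usable parent as the top of the tree; and it only checks that the links are consistent, so a full check would also recompute each parent hash from its children.

```javascript
// Follow "parent" pointers from a leaf hash to the top of the tree.
// Returns the hash of the topmost node, or null if the links are inconsistent.
// ASSUMPTION: the root has no "parent" that exists as a key in the merkle object
// (or points at itself); the real file format may mark the root differently.
function climbToRoot(merkle, leafHashHex) {
  let current = leafHashHex;
  const seen = new Set();
  while (true) {
    const node = merkle[current];
    if (!node || seen.has(current)) return null; // missing node or a cycle
    seen.add(current);
    const parent = node.parent === current ? undefined : merkle[node.parent];
    if (!parent) return current; // nothing above this node: treat it as the root
    // The parent must list the current node as one of its children.
    if (parent.left !== current && parent.right !== current) return null;
    current = node.parent;
  }
}

// Tiny stand-in tree using short labels instead of real hashes.
const demoMerkle = {
  a: { type: 'leaf', parent: 'd' },
  b: { type: 'leaf', parent: 'd' },
  c: { type: 'leaf', parent: 'e' },
  d: { type: 'node', left: 'a', right: 'b', parent: 'r' },
  e: { type: 'node', left: 'c', right: 'c', parent: 'r' },
  r: { type: 'node', left: 'd', right: 'e' },
};
console.log(climbToRoot(demoMerkle, 'a')); // r
```

Whatever comes back is the value to compare with the getMerkleTreeRoot result (minus its 0x prefix).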