How We Designed File Request Links

Published in Tresorit Engineering, Apr 27, 2020

File Request Links is a new feature that lets users receive files from one-time collaborators. Building it was also an interesting engineering challenge that I’d like to explore in this article, particularly because it showcases many of the basic building blocks of a secure sharing process. First I’ll discuss what the feature is, and then I’ll walk through the cryptographic design process behind it in small, iterative steps.

The Feature

The basic idea builds on something we already offer: letting users send files to anyone. Using our share links, users can create links to content they have uploaded to Tresorit, optionally protect them with a password, and send them to the people they want to grant access to.

File Request Links work similarly: you create a link and send it to the people you want to receive files from. You can also require email verification, which is useful when you expect uploads from multiple people and want to connect each file to its uploader (or at least to their email address).

Example use cases:

Law firms requesting confidential data
Inbox for bids on open contracts
“Dead drop” for journalists receiving data
HR receiving CVs for open job offers

The Requirements

When engineering a feature like this, it’s very important to have your requirements laid out before you actually start planning any details, to ensure it will fit in with the overall product and that it will give users the same guarantees your software does everywhere else.

Our servers should not be able to decrypt the data.
The uploader should be sure that the data is only accessible to the link sender.

The receiver should be able to verify that the uploader’s email address or the data wasn’t changed by the server.
The server should not be able to “fake” uploads.

The receiver should be able to move uploads into their cloud without downloading and re-encrypting it.
The server should be able to verify the email address if the receiver requires it.

Some of the above might seem strange, but there is an important guarantee we want to provide to our users: even if someone were to try and force us to give up encryption keys, we are not able to — we don’t have access to them. This also makes things harder of course, but it’s what sets us apart, as most other solutions don’t take this into account.

The Implementation

The above requirements make this an interesting challenge; since we don’t want to trust the server with controlling access, we need to ensure these things cryptographically. I’ll try and explain how we came up with a solution that solves all of the above and give a brief explanation about the crypto behind each step.

Data format

Being able to move uploads into a regular folder without re-encryption means that we have to use the same data format as we do for regular files. This means that the encrypted file content and its metadata are uploaded separately: the file content and hashes are uploaded first, each encrypted with a unique key, and then all the filenames and keys follow in a separate encrypted structure we call the ‘directory’. From here on we will only talk about encrypting and uploading the directory, and we’ll treat even that as an arbitrary blob of data because its format is predetermined.

Iteration 1 — Symmetric key in the URL

So we have our blob of data and we need to encrypt it somehow. Let’s start simple: add a key to the URL and encrypt the data using symmetric crypto. We can actually add a key to the URL without it being sent over the network: anything after the # (the URL fragment, often called the hash of the URL) is never sent to the servers by the browser, so it won’t appear in firewall logs either.
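To make the fragment behaviour concrete, here is a minimal Python sketch. The host, path, and link id are made-up placeholders rather than the real link format, and the base64url encoding of the key is likewise an assumption made for illustration.

```python
import base64
import secrets
from urllib.parse import urlsplit

# Hypothetical link layout; example.com and "abc123" are placeholders.
secret = secrets.token_bytes(32)  # 256 bits, the size of an AES-256 key
fragment = base64.urlsafe_b64encode(secret).decode("ascii").rstrip("=")
link = f"https://example.com/request/abc123#{fragment}"

parts = urlsplit(link)

# An HTTP client only puts the path (and query, if any) on the request line.
# The fragment stays on the client side.
request_target = parts.path + (f"?{parts.query}" if parts.query else "")

# The uploader's client recovers the key locally from the fragment.
padded = parts.fragment + "=" * (-len(parts.fragment) % 4)
recovered = base64.urlsafe_b64decode(padded)

print(request_target)       # the part the server sees
print(recovered == secret)  # the key never left the client
```

The point of the sketch is only that the key material lives in `parts.fragment`, which is not part of what gets requested from the server.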

Symmetric encryption means that you use the same key for encryption as you use for decryption (hence the symmetry). An example of this is AES-256, which is what most ads refer to as “military-grade encryption”. This means that the keys you use here are like normal keys: anyone who has the key can unlock anything that was locked with it. Your lock can be super secure (as AES-256 is), but if you hand your key over to someone, they can easily unlock it. And that is the problem here: every uploader needs the key in order to encrypt, which means every uploader can also decrypt data uploaded by someone else.

Iteration 2 — Asymmetric crypto

Since symmetric encryption has this problem, we could try asymmetric encryption instead. It basically means that you have a pair of keys: anything locked (or encrypted) with the public key can only be unlocked (or decrypted) with the private key. This seems like an exact fit for our purposes, as the receiver can keep the private key and we can give the uploaders the public one. Public keys are pretty long though, so we only add an id to the URL fragment (the part after the #), and the uploader fetches the public key from our servers.

Asymmetric encryption is slower, but there is an even bigger problem: it has a size limit. 2048-bit RSA keys can only encrypt a meager 126 bytes (with OAEP-SHA-512) in one go. This means that we have to generate a symmetric key and use it just like the previous iteration, then encrypt it using the public key. This is actually just a layer on top of Iteration 1.
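The 126-byte figure follows from the RSAES-OAEP length bound in PKCS #1, where the longest message is k - 2·hLen - 2 bytes for a modulus of k bytes and a hash output of hLen bytes. A small sketch of that arithmetic (the function name is made up for this example):

```python
import hashlib


def oaep_max_message_bytes(modulus_bits: int, hash_name: str) -> int:
    """Largest plaintext RSAES-OAEP can encrypt in one operation."""
    k = (modulus_bits + 7) // 8                      # modulus length in bytes
    h_len = hashlib.new(hash_name).digest_size       # hash output length in bytes
    return k - 2 * h_len - 2


print(oaep_max_message_bytes(2048, "sha512"))  # 256 - 2*64 - 2 = 126
print(oaep_max_message_bytes(2048, "sha256"))  # 256 - 2*32 - 2 = 190
```

Either way, there is plenty of room for a 32-byte symmetric key, but nowhere near enough for a directory of filenames and keys, which is why the hybrid approach is needed.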

This seems secure if you only consider the uploaders and outside threats, but remember: we still want the data to be secure even if our servers are taken over. If they are, whoever has control could switch out the public key and steal any future uploads. There are multiple ways to get around this, but we’re still missing a few other things, namely, the server can’t check the uploader’s identity and it could fake uploads.

Iteration 3 — Key derivation

We could combine the two methods for a bunch of added benefits. This time we could add a relatively short (compared to public keys, anyways) random string to the URL and derive multiple things from it. The link creator can tell the server about parts of this information and use the other part to send information to the uploader without the server seeing it.

This is called key derivation: we take a bunch of data and turn it into another bunch of data in a way that can’t be reversed (or at least is very hard to reverse), for example by hashing it. It’s like a fire pole: you can slide down multiple times and get to the same place (same data in this case), but you can’t go back up.

From the random data in the URL, we derive two pieces of information: an id and a key. We can use the id to get both the public key and an encrypted shared secret from the server. We can decrypt this shared secret using the key derived from the URL, and then use the shared secret to encrypt the data produced in Iteration 2. This makes it another layer on top of the previous one.
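As a rough illustration of deriving an id and a key from one URL secret, the sketch below uses HKDF with SHA-256 (RFC 5869), implemented with Python's standard library. The article does not say which derivation function is actually used, so HKDF, the label strings, the output lengths, and the function names are all assumptions.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a pseudorandom key, then expand it."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_link_material(url_secret: bytes) -> Tuple[str, bytes]:
    """Derive a server-visible id and a client-only key from the URL secret."""
    # Different labels give independent outputs: knowing the id does not
    # reveal the key, and neither reveals the URL secret.
    link_id = hkdf_sha256(url_secret, b"example link id", 16).hex()
    wrapping_key = hkdf_sha256(url_secret, b"example wrapping key", 32)
    return link_id, wrapping_key


url_secret = secrets.token_bytes(16)   # the short random string from the URL
link_id, wrapping_key = derive_link_material(url_secret)
print(link_id)                         # safe to send to the server
print(len(wrapping_key))               # 32 bytes, stays on the client
```

In this sketch the client would send only `link_id` to the server and keep `wrapping_key` local to decrypt the shared secret it gets back.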

After all this, it starts to feel like an onion, but it’s worth it: we have a short URL, yet the uploaded data is protected from other uploaders (because of the asymmetric part) and the server can verify that the uploader has the URL but can’t change or steal uploads. There is one last thing still missing though: there is no information about the uploader the receiver could check. If we just upload the metadata to the server, it could still change the uploader’s email address (even if the data is the same) and if it’s encrypted there is no way to send an email to verify the uploader.

Iteration 4 — Authenticated Data

We need to add metadata to the uploads that is verifiable by both the receiver and the server. Luckily, we are using AES-GCM, which allows us to attach some associated data to what’s encrypted. It’s like a glass compartment on a locked box: its contents are visible from the outside, but no one can tamper with them without being detected. We can upload the email address and the link id in this “glass compartment”, so the server can check both and send a verification email, while the receiver can verify that the server didn’t change them.
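The “glass compartment” property can be sketched without a full AES-GCM implementation. Python's standard library has no AES-GCM, so the example below swaps in an HMAC-SHA-256 tag as a stand-in for the GCM authentication tag and treats the ciphertext as an opaque, already-encrypted blob. A real implementation would instead pass the same bytes as associated data to AES-GCM. The field names, key handling, and helper functions are hypothetical.

```python
import hashlib
import hmac
import json
import secrets


def seal(key: bytes, ciphertext: bytes, associated: dict) -> dict:
    """Bind readable associated data to a ciphertext with one tag."""
    aad = json.dumps(associated, sort_keys=True).encode("utf-8")
    # Length-prefix the associated data so its boundary with the ciphertext
    # is unambiguous.
    message = len(aad).to_bytes(8, "big") + aad + ciphertext
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return {"aad": aad, "ciphertext": ciphertext, "tag": tag}


def verify(key: bytes, upload: dict) -> bool:
    """Receiver-side check that neither part was modified."""
    aad = upload["aad"]
    message = len(aad).to_bytes(8, "big") + aad + upload["ciphertext"]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, upload["tag"])


key = secrets.token_bytes(32)  # known to uploader and receiver, not the server
upload = seal(key, b"<already-encrypted directory blob>",
              {"email": "uploader@example.com", "link_id": "abc123"})

# The server can read the associated data in the clear...
visible = json.loads(upload["aad"])
print(visible["email"])

# ...but if it rewrites the email address, the receiver's check fails.
forged = dict(upload, aad=upload["aad"].replace(b"uploader@", b"attacker@"))
print(verify(key, upload), verify(key, forged))
```

The server needs no key to read `visible`, yet it cannot produce a valid tag for a modified version, which is the same guarantee the associated data of AES-GCM gives.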

Verifying the requirements

Now that we have a plan, we need to verify that we do actually meet all our requirements.

Our servers should not be able to decrypt the data.
They would need access to both the private key of the link and the shared secret, but they have access to neither.

The uploader should be sure that the data is only accessible to the link sender.
The uploader is assured of this because the key to the uploads is encrypted using both a key derived from the URL and the public key of the destination. This means that it’s only decryptable by someone who has access to both the URL and the private key of the recipient, and only the receiver holds both.

The receiver should be able to verify that the uploader’s email address or the data wasn’t changed by the server.
The receiver can verify that nobody tampered with the authenticated data part of the upload, which contains the information about the uploader.

The server should not be able to “fake” uploads.
The server can’t “fake” uploads, because it doesn’t have access to the shared secret: it’s encrypted by a key derived from the URL, but that isn’t sent to the server.

The receiver should be able to move uploads into their cloud without downloading and re-encrypting it.
We have treated the directory as an arbitrary blob of data, so we can use our normal file formats. The file content was encrypted with its own unique keys, so it can simply be reused as it is.

The server should be able to verify the email address if the receiver requires it.
The server can check the email address of the uploader because it’s uploaded as authenticated data: something the receiver can verify but the server can see.

Everything checks out, meaning you can get data uploaded (almost) straight into your cloud storage safe from everyone, including anyone hacking into our servers (if that ever happens).

Conclusion

With that, our design is done, and it does what we originally intended it to do: allow users to request files from non-registered collaborators in a secure manner. The implementation of this design was also a challenge: the limitations and the finer details took multiple people months to iron out.

All in all, this was a fun and challenging project to work on and I hope I can use it as an example to show the basic building blocks of crypto design and our design process.