How Internet Computer Responses Are Certified as Authentic

DFINITY
The Internet Computer Review
18 min readDec 29, 2021

A technical explanation of how responses from canister smart contracts running on the Internet Computer can be certified as true.

By Joachim Breitner

As a user interacting with the Internet Computer blockchain, how can you trust the responses that you get? In other words, how can you confirm that you are really talking with the Internet Computer?

Let’s start with how it would look like in an idealized world. Imagine that you are interacting with a service that is running completely in a trusted environment (e.g., your own computer or your company’s network). For the purpose of this blog post, the service we’re looking at might be a booking agency for show tickets.

So you interact with that service and book a ticket for an upcoming show. As long as the service and network are trusted, that’s all you need to feel secure and confident that you have really booked seat 5A for the show. In reality, you are likely talking to a service over the Internet, which is a big, dangerous place full of malicious parties. Now there is a problem: What if someone between you and the service you’re talking to wants to manipulate your request and the corresponding response? They could give an incorrect response, such as the wrong seat, or inform you that you didn’t actually book a ticket when you did. Conversely, they could tell you that you purchased a ticket when you actually did not.

That is exactly the kind of problem being addressed here. On the internet, the solution is known, and it’s based on public key cryptography. The service has a secret key. This is the blue key on the right in the above figure. And when it responds to your request, it uses the secret key to sign that response. The signature is represented by the blue ribbon in the above figure.

As a user, you know the corresponding public key of the service. Using that public key, you can verify the signature on the response. You can check that the response indeed comes from that service, and know that you really have booked the ticket. This is the conventional experience when you use a service over the internet.

When it comes to the Internet Computer, things are similar but slightly different. The service you are interacting with isn’t a full-stack service server on its own with its own secret keys, but rather just code running in canister smart contracts (i.e., an application that runs on the Internet Computer), and it doesn’t do cryptography on its own. Instead it’s being executed by what is called a subnet, which is a collection of nodes that jointly run the canister, which, in this example, is the booking agency.

The subnet as a whole can create a signature, like the one we’ve seen earlier. Even though there are multiple nodes that make up the subnet and some of them may be malicious, it can only create a signature if a supermajority of the nodes agree on what to sign. This means that even if a few of the nodes are malicious, they can’t create false signatures. Therefore, when receiving a signed response from the Internet Computer with a valid signature from the subnet, the user knows that this is the agreed response, even if some of the nodes are malicious.

These signatures are called threshold signatures. What’s interesting here is that these signatures are somewhat expensive to create. The nodes need to collaborate and communicate in order to create one of these threshold signatures. Therefore, it wouldn’t scale to sign each individual response when there could be several thousands of requests per second.

To solve that problem, responses are not signed individually. What the network does is record all of the responses in what will be called, for the purpose of this blog post, a “document” of the subnet that lists all the responses.

As you can see here, somewhere among these responses, it is noted that when Kim, our user in this example, asks the service to book a ticket on a show, Kim’s booking was successful and the seat is 5A. There are also responses to Larry and other users. What could theoretically be done is this whole “document,” which has a signature from this threshold signature key by the subnet, can be taken and shipped to the user. The user can verify that their response is in that document. This would not be ideal because all of the responses are recorded in this document, so it is going to be large. Moreover, Kim doesn’t really have any business seeing the responses to requests made by Larry to a completely unrelated service.

What can be done is the document can be redacted, removing parts that Kim shouldn’t see, and only the portions that are relevant to Kim can be included. The signature will still remain valid. This means that Kim can now look at this document and verify that the response to her request was that she got a ticket for seat 5A and that the signature is correct. The redaction hides information from Kim that she shouldn’t see, making the document smaller. The effect is similar in principle to document compression or image compression. So this way Kim gets a reasonably sized response that only conveys to Kim what Kim needs to know.

But there is a wrinkle that needs to be considered. This subnet that is being talked to is called Subnet 5, which means that there is more than one subnet. Indeed, the Internet Computer is an infinitely scalable blockchain with a constantly growing number of subnets. In the above image, Kim knows the public key corresponding to Subnet 5. With so many subnets, it wouldn’t be ideal if all of the users needed all of the public keys that correspond to all the subnets. The idea is that Kim really only has one public key, and once Kim has trust in this one public key, which is a small amount of data, that should be enough to verify all the data that is coming from the Internet Computer.

The way this is done is that this one public key of the Internet Computer, represented in orange in the above slide, happens to be the public key of one particular subnet, which is called the root subnet. The root subnet is a bit like any other subnet: it has nodes, some of which might be malicious, most of which are honest. What distinguishes the root subnet is that the user knows its public key.

This root subnet also has, in its document, a listing of all the subnets of the Internet Computer. And as you can see, it lists that there is Subnet 5 on the Internet Computer, and it has the blue key.

Along with the response to the user, the subnet can include a redacted copy of that document. This is called the delegation, because it is delegating trust from the root subnet to Subnet 5. Now the user can look in this document and find out the public key of the subnet that the user is talking to (Subnet 5), and the user will find the blue key there. That’s how the problem of the user not needing to know every subnet’s public key, but rather just one single subnet’s public key, is solved.

That’s how a subnet certifies responses on behalf of the Internet Computer to what is called update calls. Update calls are a request from a user that goes to the Internet Computer, goes through consensus, gets included in blocks, etc. They can change the state of the service of the canister, which is of course important for booking a show, because you want the booking canister to record that the show has been booked. Eventually you get the response and the response is certified, meaning that the user can verify the content in the manner that I just outlined.

This whole process of certification happens completely transparently to the application code in the canister, and also the application code on the client side. Update calls are slow, however, because they have to go through consensus and so on, and this takes around two seconds or so.

Latency minimization is why query calls are also offered. Query calls are calls to methods that don’t change the state of the canister. Because they don’t change the state, there’s no need to put them in a deterministic order or to execute them on all the nodes that would then need the new state. A query call can be responded to by one single node, rather than the whole subnet.

This means that query calls can be very fast — a matter of milliseconds instead of seconds. But you may have noticed that there is now a problem. I’ve said that the subnet as a whole is trustworthy and that only the subnet as a whole can do this blue signature, but a single node on the subnet could be malicious. When that possibly malicious single node responds to your query call, how do you know that the response is correct? The single node can’t sign the response using the blue key because single nodes can’t sign alone. So how can you do a query call and still get an authenticated response from the Internet Computer?

This is possible using a feature that is called certified variables. An important element is that this requires collaboration from the canister. To certify update calls, canister code and application developers don’t have to do anything special; it just works out of the box. To secure these query calls and use certified variables to secure them now requires the canister to assist.

Here is what that looks like. When the user does the booking — note that we are now looking at the request where the user wants to book the show — the canister is recording in a special area of the system state that Kim has booked seat 5A. It is basically telling the system, “Please remember for me that Kim has booked seat 5A,” and this is now stored separately from the canister’s normal state. In fact, it’s written into the same document that records all of the responses that were seen earlier. Remember that the document is signed by the subnet as a whole using this blue ribbon.

This helps when the query call happens. Let’s say that a day later, Kim has forgotten her seat number and wants to use a query call to ask the canister what her seat is. A query call is a very nice use case for this because just checking which seat you had doesn’t require an update call. When processing the query, the canister can ask the system for a redacted copy of the document of the subnet with everything redacted except the information that the canister itself asked the system to remember (i.e., Kim has seat 5A). Now the canister can include what is called a certificate (i.e., the redacted document) in the response to Kim.

And now Kim can actually check that the response from the canister (“You have seat 5A”) is correct. Of course, this means checking the signature, but there is existing code for that. It is the same mechanism as checking for the validity of a response to an update call. Kim also needs to check that the info is not too old because maybe an attacker has given Kim a valid response, but it was only valid a week ago and things have changed. So there are a few things that the user (specifically, the client code on the user side) will have to check, but when they are being checked, it is seen that the response is correct.

A small complication arises due to the constraints of keeping the system simple. When designing a system like the Internet Computer, there is always a trade off. Is the system made a bit more complicated and its use (i.e., from a canister point of view) more convenient, or does it provide just a bare minimum of features and place some of the burden on the canisters so that the system can be kept simple, and therefore also secure and manageable?

In this case, only 32 bytes of storage are allowed for canisters to use the feature that I just outlined. Now 32 bytes might be enough to say that Kim has booked seat 5A, but once more than a few people are booking seats, then 32 bytes is no longer enough, but this is not a fundamental problem. The trick that can be applied to overcome this problem is pretty much the same that was applied just a few slides earlier when we were looking at how the subnet can use one signature for many responses. Again, one of these redactable documents is being used. So what can be done here is that the canister, when Kim books the show, stores the information that Kim has seat 5A in its own document. It also has other information about other people booking their seats in the show. It then calculates a cryptographic hash of the document, which uniquely identifies the content of the document, and it asks the subnet to store and to certify the cryptographic hash.

When Kim now asks the service for which seat she has, the service can respond with the document from the subnet, which includes information such as the cryptographic hash, and then respond with a redacted copy of the canister data, which verifies that the information that you have seat 5A is really correct. As before, this little delegation is needed, which connects the blue signature to the orange key that Kim has.

With all of these things in place, it is possible to trace the chain of trust from the response that the user gets to the key that the user already has. The user can check that this response is included in the redacted copy of the document from the canister and the lower left corner. It can recalculate the root hash or the cryptographic hash of that document and check that it is included in the redacted document from the subnet. It can check the blue signature on that document and make sure that this is really signed by the corresponding key for that subnet in a delegation, i.e., the redacted document from the roots up net in the lower right corner. That document is signed using the orange key. The user can check the signature against the public key that the user has. That is how the system can manage to certify responses that are delivered, and calculate it even from possibly malicious single nodes, deliver them to the user, and let the user still verify that the response is correct, using just one piece of information: the root of trust that the user needs to have in advance.

Let’s look a little bit closer at these documents with this interesting property that information can be redacted and the signatures can still be validated. The mechanism that is being used here is a well-known cryptographic tool called Merkle trees. For the sake of avoiding confusion, note that the term “redactable” in this blog post is an analogy. It does not refer to a redactable signature in the technical sense.

These are really just Merkle data structures, and this is nothing new. I’d like to explain at a slightly lower level how these Merkle trees work and how they can be used here. The analogy I initially used was a document, but another analogy works better here: a file system of your hard drive, where you have folders and lists of names. Inside these folders, there are more folders, as well as files that contain data. This is the conceptual model that I’d like to keep in mind when I talk about these trees. They are now tree-shaped, and the subnet maintains one of these trees. That’s why it is called the state tree, with all kinds of information. It has the responses to requests from users.

It also has the certified data that is looked at when query calls are certified, and then a bunch of other information, such as the current time and lots of internal data that are necessary to keep the entire Internet Computer running, such as cross-net messaging. So this tree is now tree-shaped. And then in this picture, I’ve drawn it growing down because in computer science trees grow downwards for some odd reason, and there are these corner things for data in the tree. So that’s where the certified data that the canisters store is; in the top left corner, there is the time when this tree was created. Now we want to define a cryptographic hash that uniquely identifies the whole tree, including all of the data, all of the tables on the nodes, and its shape.

As a little technical step towards that, the tree needs to be made a binary tree. As you can see, the tree is branching multiple times, and it’s nicer if the tree is actually binary. So every node has one or two children, and you can easily take the branching and replace it with multiple binary branching, which are little diamonds on the slides now to get that property. The next thing that needs to be done is to define how the cryptographic hash of that tree is calculated, and it can be done in a bottom-up approach. For the leaves that have data, the hash of the data, such as SHA-256, can just be used. Then there are these labels — for example, certified data, in the bottom left. And there, the hash of that sub-tree is a hash of the label and the hash of everything below.

For the binary nodes, something similar is done. The hash of one of these binary nodes is the hash of the left and right sub-tree. In this way, a rule has been defined to define a hash all the way from the bottom to the top, and there is a hash at the root. This way, a smallish 32 bytes number is now enough to identify the whole tree. If that hash is signed, that signature can be used to validate everything inside the tree. Next, redaction needs to take place. Let’s say if you want to expose to Kim that the certified data of Canister abcde-fgh happens to be CAFFEE. And maybe we also want to show that the current time is the time that we have them.

All of the sub-trees can then be removed that are not relevant for these two pieces of information. But the hash of that sub-tree that was removed, represented by the little orange annotations above, also has to be remembered. Why does that have to be done? Well, if you now give the pruned tree to the user, including all of the hashes for all of the places where something was pruned, the user can just as before from bottom to top recalculate all of the hashes for the individual nodes, and therefore also for the root node.

Then the blue signature of that root hash, the hash of the root node, can be taken and the user can validate the signature. And that’s how the user can validate a signature and sign the whole tree, even though the user doesn’t have the whole tree. So this is what is used in the subnets to have this document where information can be redacted, but it is used in more places — for example, the root subnet that is also using a state tree, but it has the additional information about which subnets exist and what the public keys are. And then in the example of the booking agency, where the query calls were certified, the canister itself might have one of these Merkle structures to include all the information about the seats that were booked behind one little small hash that it can then plug into the system as certified data.

That’s a summary of how Merkle trees work. It’s a very versatile tool, and it can be used to good effect in the Internet Computer.

Lastly, I want to walk you through the steps that you have to take as a canister developer, or rather an application developer, on the Internet Computer if you want to get the benefit of certified data to secure your query calls. This needs some help from the canister, and there is some amount of development you will have to do. First you have to think about the kind of queries you need to secure, and think about what Merkleized data structure is right for that. I’ve shown you the tree and the tree is really useful for everything that is a key value lookup, or that resembles a file system. If you have something more sophisticated, where some data needs to be aggregated or selected, maybe you need something different than that. It really depends on your application. In a simple case, a tree like that would work. Then you have to maintain that tree as part of your normal canister state. So this is running in the main memory of your program running on the Internet Computer, and whenever an update call comes in that changes a piece of state that needs to be reflected there, you have to update that Merkle structure. You have to recalculate the root hash that identifies all the data in that structure and write that out as the certified data of your canister to the system.

When handling a query method, you have to calculate the response to the query method, then you have to look into your in-canister Merkle structure and redact everything that you don’t need. Basically, you calculate what’s sometimes called a “witness” that proves that the response that you are actually in the process of giving is contained in the hash that you stored earlier. You also have to ask the system to give you that certificate, the redacted document, which shows that the hash you wrote earlier during the last update call is the one that you expected. And then you ship to the user the response, the redacted part of your in-canister Merkle structure, and proof that the root hash is the root hash of your data structure.

With regards to the client code, you would be expected to write a dedicated client for your service, because there are some server specific things you have to do. First you check that the system certificate is correct, otherwise a malicious node could give you an old result and make you believe that something is true now that maybe was true a week ago, but is no longer true. Then you have to check that this is really the data for the canister with which you are communicating. Otherwise an attacker could install another canister on that subnet that has a similar state, like a fake booking agency that has the information that you’ve got seat 7D instead of seat 5A, and construct a response based on that.

So you have to check if this is the canister that you are actually talking to. Then you look at the application-specific data structure from the redacted Merkle tree you received from the canister. You recalculate its root hash and you compare that with the hash that’s in the system data. Finally, you have to check that what’s in the application-specific data matches your query parameters and response, and both are important. You have to keep in mind the parameters of the query and also the response. When you have done all these checks as part of the client code, you can then trust the response, even though it came from a possibly untrustworthy node.

For update calls, as noted earlier, you don’t have to worry about any of this. So if this is too complicated, or if maybe your queries don’t fit the format that is supported by Merkle structures, you still have the option of either using query calls and living with reduced security guarantees, or you can use update codes and live with slightly reduced performance. We are working on ways of making query calls more secure out of the box so that even in cases where you don’t do anything special as a developer, you get some more guarantees and some protection against malicious nodes. This is currently in the works, so stay tuned for more information.

Until then, if you want to know more about the technical details of what I have presented, you will find them all in the Internet Computer Interface Specification under the “The System State Tree,” “Certification,” and “Certified Data” sections.

If you have any further questions, we’re happy to help you on the Developer Forum to support you building innovative applications on the Internet Computer.
____

Start building at smartcontracts.org and join our developer community at forum.dfinity.org.

--

--

DFINITY
The Internet Computer Review

The Internet Computer is a revolutionary blockchain that hosts unlimited data and computation on-chain. Build scalable Web3 dapps, DeFi, games, and more.