Corda’s new network map infrastructure
Exploring design decisions
- Why we don’t ship a network map server anymore.
- The new network bootstrapping tool and planned upgrades.
- Information about an upcoming refresh of the testnet.
- Background information on the design thinking that went into the change.
In Corda 3 we changed how the network map infrastructure works. This was the last step before we felt we could commit to the network protocol for the long term, as the prior design had always been temporary. The changes we made along with the reasons for them are described in this article by Joel Dudley.
The aspect of the redesign that has raised the most questions is the approach taken to creating a new compatibility zone (what people think of informally as a ‘Corda network’). In Corda 1 and 2 the zone concept was incomplete, so creating a new zone was straightforward: just start a network map node and connect the other nodes to it. In Corda 3 this changed significantly. In the new design the network map service is no longer a regular Corda node. Instead it is a collection of signed files served over HTTP or loaded from the filesystem. A simple REST-like protocol is used to fetch network map entries from a network map service, and a second HTTP-based service, the doorman, receives certificate signing requests (the doorman is what makes a Corda zone a permissioned system). Corda 3 also introduces the notion of network parameters.
However, Corda 3 does not provide implementations of either server. As a consequence it may initially appear that you cannot create a new zone out of the box. This isn’t the case, as we do provide a ‘bootstrapper’ tool via a Gradle plugin that generates network map and certificate files that can be copied into each node directory. By using the bootstrapper you can configure a new zone and distribute the resulting files via whatever means you have available. Future versions of Corda will improve this tool significantly and disconnect it from Gradle.
But why is it done in this roundabout way? Why don’t we just provide implementations of the two servers? Given the description above it’s reasonable to assume that Corda would come with an implementation of the network map server and doorman, yet it doesn’t. Are we being deliberately awkward?
Well, no. The reasons for this design are somewhat subtle and may not be immediately apparent unless you’ve built peer to peer networks before. Time to explain.
We do it this way for two reasons:
- Different server implementations would have little shared code.
- It would cause people to do testing wrong.
The doorman is just a front to a certificate authority. It accepts POSTs of PKCS#10 certificate requests, and returns a string that can be used to poll the server until a zip file of certificates is ready (the certificate chain). That’s it — from the node’s perspective that’s all it does. Now behind the scenes a doorman may be quite sophisticated. It may be linked to ID verification infrastructure, ticketing systems, setup of billing processes, “CA in a box” solutions, hardware security modules, signing of legal contracts and so on. But all of that business logic is specific to the operator of a zone. Each zone will be different and there’s essentially nothing that can be shared between implementations, beyond a tiny bit of REST scaffolding that any developer can knock out in five minutes.
In essence the doorman is a policy plugin. By implementing the protocol you implement the membership policies of a zone.
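To make the “policy plugin” idea concrete, here is a minimal in-memory sketch of the submit-then-poll shape described above. It is not Corda’s doorman and leaves out HTTP entirely: the class name `ToyDoorman`, the `Policy` callable and the placeholder zip content are all assumptions made for illustration, and a real doorman would return signed certificates.

```python
import io
import uuid
import zipfile
from typing import Callable, Dict, Optional

# Hypothetical policy hook: returns True once the zone operator approves a request.
Policy = Callable[[bytes], bool]


class ToyDoorman:
    """In-memory stand-in for a doorman: submit a request, then poll for the result."""

    def __init__(self, policy: Policy) -> None:
        self._policy = policy
        self._pending: Dict[str, bytes] = {}

    def submit(self, csr: bytes) -> str:
        """Accept a certificate signing request and return an opaque polling token."""
        request_id = uuid.uuid4().hex
        self._pending[request_id] = csr
        return request_id

    def poll(self, request_id: str) -> Optional[bytes]:
        """Return a zip archive once the policy approves the request, else None."""
        csr = self._pending[request_id]
        if not self._policy(csr):
            return None  # still waiting on the zone's own checks
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w") as archive:
            # Placeholder only: a real doorman would put the certificate chain here.
            archive.writestr("placeholder-chain.txt", b"not a real certificate")
        return buffer.getvalue()
```

Everything zone-specific lives in the `policy` callable, which is the point: swapping the policy changes the membership rules without touching the tiny protocol wrapper around it.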
The network map is a bit more sophisticated, but not by much. It collects signed NodeInfo structures, saves them to disk, and serves them again. It also serves a “network parameters” file. Every so often nodes POST their NodeInfos to refresh them, and if a node doesn’t do that for a while then their entry is dropped. The exact protocol is explained in the docs. Implementing this is mostly a question of wiring it up to your preferred CDN and again, is a straightforward programming task for anyone who can write basic web servers. The bulk of the code is pure business logic related to removal of pre-approved members and control of network parameter changes.
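The collect, serve and expire behaviour can be sketched in a few lines. This is a toy model under stated assumptions, not the real protocol: `ToyNetworkMap`, the `expiry_seconds` parameter and the use of plain byte strings in place of signed NodeInfo structures are all invented for illustration, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    node_info: bytes     # stands in for a signed NodeInfo blob
    last_refresh: float  # time of the most recent publish, in seconds


class ToyNetworkMap:
    """Stores published entries and drops those that are not refreshed in time."""

    def __init__(self, expiry_seconds: float) -> None:
        self._expiry = expiry_seconds
        self._entries: Dict[str, Entry] = {}

    def publish(self, name: str, node_info: bytes, now: float) -> None:
        """Record (or refresh) the entry for a node."""
        self._entries[name] = Entry(node_info, now)

    def names(self, now: float) -> List[str]:
        """Return the names still in the map, after dropping stale entries."""
        self._entries = {
            name: entry
            for name, entry in self._entries.items()
            if now - entry.last_refresh <= self._expiry
        }
        return sorted(self._entries)
```

A node that keeps republishing stays listed, while one that goes quiet for longer than the expiry window silently disappears from the next listing.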
OK, so the servers don’t have much logic in them and can easily be implemented. But we could still provide example servers. We deliberately don’t, which brings us to the second reason for this design.
Directory vs presence
The Corda network map service is designed to superficially resemble DNS, with the following differences:
- It uses X.500 names instead of domain names. This format is useful for representing the names of legal entities, which are not globally unique, use non-Latin scripts and require upper/lowercase preservation. You don’t have to identify Corda nodes using legally registered organisation names in your own zones, but we want to identify them this way on the main Corda network.
- It can easily ship extra metadata like certificates.
- Entries are signed.
We could have tried to piggyback on DNSSEC but DNS is a very old protocol with various problems, and there’s really no need to do that.
That’s how it differs, but what about how it’s the same?
The network map is a directory service, not a presence service.
A presence service is what instant messaging networks provide: the servers know if the users are online or offline. They maintain and broadcast this information to other users. The dataset is always fresh, but the cost of this is that presence servers must be highly available. An outage disrupts everything.
A directory service is more like a phone book. You can look up entries in it, but that tells you nothing about whether the person is actually reachable at that moment. You might call and go to voicemail. The number may not even exist anymore. A directory is highly replicated and highly available — there is no such thing as an outage of the phone book. But you pay for it with freshness: it takes a long time for dead entries to disappear.
Sharp readers may have noticed that there isn’t a bright line between the two things: even phone books delete closed businesses from time to time, so they offer some kind of presence information, and IM presence servers don’t always react immediately when someone goes offline. The two are really opposite ends of a spectrum.
Corda is designed to be a decentralised system so it prefers to “phone home” to any given server as little as possible. Therefore our network map service is more of a directory server than a presence server. When a node starts up for the first time it generates signing keys and applies for a certificate from your selected zone’s doorman. The doorman checks the name is unique and valid, and depending on the zone rules may do other checks e.g. ID verification.
Once a node has got a signed certificate it can push its NodeInfo metadata to the network map, but there’s no always-on link to a high-availability server so — like with DNS — data changes will propagate to nodes at different times, depending on their polling interval. The actual data can be served by HTTP caching proxies like major content delivery networks, making it robust to DoS attacks and highly scalable at very low cost.
Tests and synchronisation barriers
The distinction between directory and presence servers may appear academic at first: OK, some kinds of system react fast and others don’t, so what? But there’s one very common situation where it makes a huge difference: automated testing.
Corda distinguished itself early on in life by providing test frameworks, in a market where many blockchain platforms did not. Of course you get to start with tooling like JUnit and the IDE integrations that make it convenient. But then on top you can use the Corda ledger DSL, MockNetwork and integration test driver to test your software at different levels of granularity. It’s an area we’ll continue to invest in.
Testing distributed systems can be tricky and peer to peer systems can be especially hard. Imagine if you implemented the protocols above in order to run tests in continuous integration. Here is an example of a problem you’d encounter:
- Nodes would start up in parallel, and therefore register with the doorman and network map service in parallel.
- Therefore some nodes would poll the network map before all nodes had finished registering.
- Therefore nodes would start with half complete maps depending on the vagaries of timing. They would eventually re-poll and all get in sync, but it would happen asynchronously. This is a race condition that would cause exceptions in your code when you looked up identities that were expected to be there but weren’t.
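The race described in the list above can be reproduced deterministically with a small simulation. This is only a sketch: the function names and the idea of a single “tick” per registration are assumptions for illustration, and no Corda API is involved.

```python
from typing import Dict, Set


def startup_snapshots(register_times: Dict[str, int]) -> Dict[str, Set[str]]:
    """Each node registers at its tick and immediately polls the map once."""
    registered: Set[str] = set()
    snapshots: Dict[str, Set[str]] = {}
    # Process nodes in the order they happen to register.
    for name in sorted(register_times, key=lambda n: register_times[n]):
        registered.add(name)
        snapshots[name] = set(registered)  # copy of what is visible right now
    return snapshots


def lookup(snapshot: Set[str], name: str) -> str:
    """Mimic a by-name lookup that throws instead of blocking."""
    if name not in snapshot:
        raise LookupError(f"{name} is not in this node's view of the network map")
    return name


snapshots = startup_snapshots({"A": 1, "B": 2, "C": 3})
# Node A started first, so its view holds only itself: lookup(snapshots["A"], "C")
# raises, while the same lookup from node C's view succeeds.
```

Which node ends up with the incomplete view depends purely on start order, which is exactly what makes the failure intermittent in continuous integration.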
The obvious first attempt at a fix is to cut the polling interval on your nodes right down, so they are constantly querying the network map to find out whether nodes are online yet. But the map isn’t a presence service, so the Corda APIs aren’t designed to block if an entry requested by name isn’t found. Instead they throw exceptions. This is reasonable in the intended design, where the map service is highly available and entries are controlled by different companies: there, a missing entry probably indicates a typo rather than “my counterparty node is still starting up”. But it will cause you problems in simulation environments. And no matter how low you set the polling interval, you’ll still have race conditions, and arbitrary sleeps scattered all over your code trying to fix them.
The real problem here is lack of a synchronisation barrier. What you want is for all the nodes to generate their certificates, then register, then wait for all the other test nodes to finish registering, and then you move onto the next part of the test. Implementing these sorts of barriers isn’t too hard, and can be done using Corda RPC if you have an external control process that you wrote.
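Here is a minimal sketch of the register, wait, then proceed pattern using threads in a single process. It swaps in Python’s `threading.Barrier` for the external control process and Corda RPC mentioned above, purely to show the shape of the idea; the function name and the in-memory set standing in for the map are assumptions.

```python
import threading
from typing import Dict, List, Set


def start_nodes_with_barrier(names: List[str]) -> Dict[str, Set[str]]:
    """Start one thread per node; no node reads the map until all have registered."""
    registered: Set[str] = set()
    snapshots: Dict[str, Set[str]] = {}
    lock = threading.Lock()
    barrier = threading.Barrier(len(names))

    def node(name: str) -> None:
        with lock:
            registered.add(name)           # step 1: register
        barrier.wait(timeout=10)           # step 2: wait for every other node
        with lock:
            snapshots[name] = set(registered)  # step 3: only now read the map

    threads = [threading.Thread(target=node, args=(n,)) for n in names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return snapshots
```

Because every read happens after the barrier releases, each node sees the complete set regardless of how the threads were scheduled, and no sleeps are needed.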
So imagine you do that. It takes a bit of work but no big deal. At some point you might be tempted to create a really big network to see how well it scales. You’ll then encounter another problem:
- All the nodes start in parallel.
- So they all hit your test doorman and map server simultaneously.
- Your little doorman probably isn’t tuned for scalability, so it will then get slow and nodes will time out during startup.
- So some nodes won’t start successfully and flows started on other nodes will pause or abort because they can’t find their counterparties. Failed again.
Once more, the problem is that your testing environment is quite different from a real production environment and moreover, that difference is fundamental. It can’t easily be tweaked out of existence.
There’s another problem.
Misleading test environments
It’s easy to accidentally confuse developers by creating a test environment that differs from production. Imagine that one member of your team has written a simple stub network map which makes nodes poll every 5 seconds, because that’s convenient for automated tests. Then you provide that test environment to developers to experiment with. Some of them will notice that network map lookups act a lot like an IM presence server and start to use it that way. For instance they may incorporate online/offline indicators into their UI, or they may implement their own code to store tasks to disk and retry starting flows over and over again if the target node is offline.
This won’t just waste developer time on an unneeded feature the platform already provides. It will confuse developers further when they discover behaviour in production isn’t the same as behaviour in test. In a production environment you can start flows with a node that’s currently being restarted or is offline temporarily, and the flow framework will take care of storing what’s happening to disk and retrying later. But for that to work the target node must be in the network map the entire time. This is fine if it’s a directory but not if it’s a presence server … yet the most obvious path to a test environment is to try and turn the zone infrastructure into a presence service.
These were some of the issues on my mind when I redesigned the network map for Corda 3. I wanted to make test environments easy for developers to create. I also wanted to remove hidden race conditions, scaling problems and the potential for developer confusion over what the network map really is. And because we plan for the Corda production zone to be accessible to everyone, we didn’t want to make people think they had to write their own servers to use Corda.
The solution chosen works like this:
Create a shared network drive between all the virtual machines in your test network. Pre-create all the NodeInfos and certificates they need based on a simple template, and copy them to the shared drive. Wait for the copy to complete (this is a sync barrier). Now start up your nodes. They read network map data from this directory, so a complete network map is available to all nodes regardless of actual start ordering.
Scalability of startup depends on scalability of the shared drive solution, but there are many out-of-the-box file sharing systems that scale well, especially in the cloud.
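The file-based approach can be sketched as three small steps: pre-create, copy, then load. This is an illustration only: the `.nodeinfo` extension, the function names and the placeholder file contents are assumptions, not the bootstrapper’s actual file layout.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def pre_create_node_infos(names: List[str], staging: Path) -> None:
    """Write one placeholder file per node into a local staging directory."""
    for name in names:
        (staging / f"{name}.nodeinfo").write_bytes(f"placeholder for {name}".encode())


def publish_to_shared_drive(staging: Path, shared: Path) -> None:
    """Copy everything to the shared drive. Returning from here is the sync barrier."""
    for source in staging.iterdir():
        shutil.copy2(source, shared / source.name)


def load_network_map(shared: Path) -> List[str]:
    """What each node does at startup: read the full map from the shared directory."""
    return sorted(path.stem for path in shared.glob("*.nodeinfo"))


with tempfile.TemporaryDirectory() as staging_dir, tempfile.TemporaryDirectory() as shared_dir:
    staging, shared = Path(staging_dir), Path(shared_dir)
    pre_create_node_infos(["bank-a", "bank-b", "notary"], staging)
    publish_to_shared_drive(staging, shared)
    # Nodes start only after the copy completes, so every node sees all three entries.
    views = [load_network_map(shared) for _ in range(3)]
```

No node ever talks to a server during startup in this arrangement, so there is nothing to overload and nothing to race against.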
In Corda 3.1 the Gradle plugin automates some of this process, but it leaves it up to you to do the rest. We are working to improve the bootstrapper in the following ways (here’s the pull request):
- Standalone tool, not only accessible via Gradle.
- Support for creating and uploading Docker images to Azure, configuring the shared drive, and creating or altering a running test network for you in the cloud, all from a single command line invocation.
- Usability improvements and bug fixes to make it easier to use.
- A generic Linux backend for users who don’t want to use Azure.
- Eventually, exporting an API to make it easy to script the creation and deletion of full test networks with many different nodes and service types.
Some improvements are underway now, and others will ship over time in future open source releases.
The public test network
Since even before Corda 1.0 was available we have provided various Corda test networks (or ‘testnets’ for short). These are live networks you can join and interact with other nodes on. They’re essentially the same as production networks but with slightly different rules (like less ID verification), and no guarantees they’ll stick around for the long term. And because we didn’t stabilise the wire protocol until Corda 3.0 we have in fact torn down and rebuilt the Corda testnet several times.
We’re getting ready to launch the latest revamp, and hopefully it’ll be the last time we reset the testnet. The new testnet is significantly easier to use and comes with a simple UI for creating nodes on it. Once the new testnet is available you will be able to do inter-firm testing in an environment almost exactly like the real Corda network, but without any consequences if anything goes wrong. This should provide the last link in the testing chain. We hope it will be a satisfactory alternative to building custom zones.
Corda 3 changed the network map infrastructure in major ways, introducing many new features and redesigns intended to lay the groundwork for robust, scalable and governable zones. The dust from this change is still settling and we have some work to do on automating and simplifying the creation of Corda zones, especially in test environments. But I hope you now have a better understanding of our motivations and where we’re going with this.