Gavin Wood recently presented “A Tale of Two Technologies” at Web3 Summit in Berlin. In an effort to make the ideas presented more digestible, we’ve transcribed the video, included the slides, and edited down the text to its core ideas.
After presenting Substrate and Polkadot, Gavin builds a custom blockchain in minutes using Substrate on a brand new MacBook Pro.
- Watch the full video
- View the slides
- Watch the Dotcon-0 workshop
- Substrate Developer Hub & Documentation
- Join the Substrate chat
What is Substrate and how does it relate to Polkadot?
I think there’s some confusion over what Substrate is and how it relates to Polkadot. This is meant to clear up some of that confusion. So how do they relate to each other?
Basically, Polkadot is, in many respects, the biggest bet in this ecosystem against chain maximalism. I don’t like chain maximalism in general. I consider it the blockchain equivalent of nationalism. If I can do something helpful for this ecosystem, it would be to convince people that it’s really not such a good plan to back one winner above all others. Even if there were one perfect chain, I don’t think it would stay perfect for very long, and I don’t think it would be helpful in general for people to try to create one. I consider this kind of maximalism to be basically just a barrier to entry. Barriers to entry generally reduce innovation, and they reduce the fun for technologists.
So, how do Substrate and Polkadot relate to each other? Well, these are separate technologies. They’re designed to work together, but really, it’s more of an optimization that they work together. It’s not something fundamental about their design.
If you want to think about it in analogies, you can think of Polkadot as being more like the Ethernet protocol, and Substrate as being more like the general IBM PC or Mac or whatever machine uses the Ethernet protocol to communicate with others. Polkadot is a specific protocol. There’s a token associated with it that lets you pay for using that protocol. It has several teams building it, at least a couple at the moment and hopefully many more, and multiple implementations. (Interested in building an implementation? Apply for a grant.)
Substrate, on the other hand, is a technology stack. It was created originally by Parity Technologies, although since it is open source, I hope that we can attract other people to contribute. We already have a few external community contributors, and I hope we can get many more over the coming years. The idea with Substrate is that it’s not used for any single network. It’s used for many different chains with many different tokens and, in principle, could also be re-implemented in other languages.
Looking at the Venn diagram, on the one hand you have Polkadot and the parachains of Polkadot. They don’t need to be written on a Substrate stack, but they can be. We’ll make it easy for you to do that, but hopefully other technologies will come along that allow you to make Polkadot chains without using Substrate. Similarly, the chains that you build with Substrate don’t need to be used with Polkadot. We make it easy for them to be used with Polkadot, but in principle they can exist perfectly well on their own.
This is a somewhat complex schematic. I don’t expect you to understand all of it, but the basic point is that in both the Polkadot protocol and the Substrate model you have this notion of an embedded runtime, or block validation function in the case of Polkadot. These are relatively analogous pieces of code that do more or less the same thing, and because of this, we can start to see the analogous components between Polkadot and Substrate.
So, to give an example, both the Polkadot relay chain and the parachains need an RPC, they both need a database, they both need synchronization algorithms, and they both have a WebAssembly execution engine. These are common components, so why should we build them twice, once for parachains and once for the relay chain? It doesn’t make any sense.
If we go to the model of Polkadot version one (and this is the thing that I basically laid out in the Polkadot paper about two years ago), the basic idea is that you have the consensus algorithm at the very bottom. That’s whatever it is that’s making sure that the chains don’t fork away from each other. You’ve got the Polkadot runtime environment, which is the thing that looks after execution of the block and knows how to run the transactions. Then you’ve got the relay chain’s runtime, which is the thing that actually describes to the runtime environment what Polkadot is and how it works. Then you’ve got the things that run within the runtime, like the parachain consensus, making sure that the parachains are all operating correctly, that they’re valid and so forth. That’s on the relay chain side of things. On the parachain side of things you’ve got the authoring and synchronization mechanisms that actually bring forth the new blocks so that the parachains keep growing, and that run the parachain runtime environment. Then the actual parachain runtime is the thing that decides how the parachain works: how it executes transactions, what the transactions mean, and how it executes blocks.
If we go to the version two model of Polkadot, then we see that basically we’re going to merge the parachain runtime and the Polkadot relay chain runtime into the same bit of code. When they become the same bit of code, or more importantly when all of the components around it can interpret them in the same way, then we can put a feedback loop between the parachain runtime and the relay chain runtime. That feedback loop is super important. It has a very particular thing that it achieves. It relates specifically to scalability. Version one of Polkadot can probably handle dozens of parachains, perhaps 100.
In version two of Polkadot the relay chains can be pluralistic: each of the parachains can itself be a relay chain, so we can introduce multiple levels of Polkadot. We end up with something approaching this. We might have the main relay chain at the top there. But on the first level there could be other relay chains, copies of that relay chain in terms of its operation. Those relay chains can then host their own 100 or whatever parachains. Some of those parachains might be relay chains as well. Others might be interesting state transition function chains, like maybe a Plasma parachain. Maybe an Ethereum 2.0 parachain. Maybe a zk-SNARKs parachain. Who knows.
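To get a feel for how nesting changes capacity, here is a small back-of-the-envelope sketch. The fan-out of 100 chains per relay chain is the rough figure mentioned in the talk; the number of slots given over to nested relay chains (`nested`) and the function name are purely illustrative assumptions, not Polkadot parameters.

```python
def application_chains(fan_out: int, nested: int, levels: int) -> int:
    """Count non-relay parachains in a hypothetical nested relay chain tree.

    Every relay chain hosts `fan_out` chains. On all but the bottom level,
    `nested` of those slots hold further relay chains; the rest hold
    ordinary application parachains.
    """
    if not 0 <= nested <= fan_out or levels < 1:
        raise ValueError("need 0 <= nested <= fan_out and levels >= 1")
    relays = 1   # relay chains on the current level (one root relay chain)
    total = 0    # application parachains counted so far
    for level in range(levels):
        bottom = level == levels - 1
        child_relays = 0 if bottom else nested
        total += relays * (fan_out - child_relays)
        relays *= child_relays
    return total


for depth in (1, 2, 3):
    print(depth, application_chains(fan_out=100, nested=10, levels=depth))
# prints:
# 1 100
# 2 1090
# 3 10990
```

With these made-up numbers, one extra level takes the system from about a hundred chains to about a thousand, which is the scalability point the feedback loop is aiming at.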
Maybe you get to the bottom and you have a relay chain that’s just full of parachains that are all for one specific consortium. So maybe something like an EWF set of chains but all basically run in a zone together.
Now in building Polkadot and in building Substrate, we realized that we will fail in our mission of building a platform that is inclusive if we accidentally come into competition with other next-generation chains. We don’t want that to happen.
I believe that the difference between competition and cooperation is pretty much based in the technology and, specifically, in the neutrality and freedom that you get from working on a platform. If a platform gives you the freedom to do what you would do otherwise, plus some additional advantages, perhaps speedier development, then you’re more likely to use it.
Substrate is general
We’re designing Substrate to give you the maximum level of technical freedom, with security and connectivity built in for free, and with a minimum of effort, constraints and, for that matter, the opinionation that can creep in if you’re not careful when you design these platforms. I want to try and convince you that Substrate really is a general platform for building your next blockchain on.
Why do I think this is a really general platform? Well, firstly, minor things like the block format. We have an abstract block format. As long as your block format sort of encodes the two or three or four specific values, things like parent hash, then you’re good. You can put whatever else in it that you want. It’s extensible.
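As a rough illustration of what an abstract, extensible block format can look like, the sketch below fixes a handful of required header fields and leaves everything else to the chain. Only the parent hash is named in the talk; the other field names, the `extra` dictionary and the hashing scheme are assumptions made for this example and are not Substrate's actual types.

```python
from dataclasses import dataclass, field
import hashlib


@dataclass(frozen=True)
class Header:
    """Hypothetical minimal header: the few values every chain must provide."""
    parent_hash: bytes
    number: int
    state_root: bytes
    # Chain-specific extras (seals, timestamps, ...); generic code ignores them.
    extra: dict = field(default_factory=dict)

    def block_hash(self) -> bytes:
        # SHA-256 over a simple deterministic encoding of all fields.
        encoded = repr((self.parent_hash, self.number, self.state_root,
                        sorted(self.extra.items()))).encode()
        return hashlib.sha256(encoded).digest()


def extends(child: Header, parent: Header) -> bool:
    """Generic chain-linking check that relies only on the required fields."""
    return (child.parent_hash == parent.block_hash()
            and child.number == parent.number + 1)


genesis = Header(parent_hash=b"\x00" * 32, number=0, state_root=b"\x00" * 32)
block1 = Header(parent_hash=genesis.block_hash(), number=1,
                state_root=b"\x01" * 32, extra={"seal": "authority-3"})
print(extends(block1, genesis))  # prints True
```

The generic `extends` check never looks inside `extra`, which is the sense in which the format stays open to whatever a particular chain wants to add.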
It’s agnostic to the underlying crypto databases. You can choose our very Ethereum-like Merkle Patricia Tree. You can use any of the other crypto databases that we write for the various parachains that we want to make. You can introduce your own; that’s perfectly fine. You can organize these dynamically in any way that you want. If you want to have hierarchical trees, then you’re perfectly at liberty to do that.
It’s got an execute block function. Now, in general, blockchains have an execute block function. That’s one of the notable constants of blockchains. And we provide a 100% abstract execute block function. You give it data, like a block, and it executes it. It’s encoded in WebAssembly, which can be targeted from any one of a number of languages including C++ and Rust. I don’t think it would be possible to make a more general execute block API. But if you have ideas, let me know. It also has extensible networking and an extensible CLI and RPC. So if you want to put in interesting things for managing your networking, maybe managing peer sets, maybe introducing your own CLI options or your own RPC calls, it’s all good.
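The idea of a fully abstract execute block function can be sketched in a few lines. This is an illustration only, written in Python rather than WebAssembly; the type aliases, `counter_runtime` and `import_chain` are hypothetical names, and the toy state model (a dictionary of counters) is an assumption made to keep the example small.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
Block = List[Tuple[str, int]]            # a block is just a list of (key, delta)
ExecuteBlock = Callable[[State, Block], State]


def counter_runtime(state: State, block: Block) -> State:
    """One possible runtime: every transaction adds `delta` to a counter."""
    new_state = dict(state)              # never mutate the parent state
    for key, delta in block:
        new_state[key] = new_state.get(key, 0) + delta
    return new_state


def import_chain(execute_block: ExecuteBlock, genesis: State,
                 blocks: List[Block]) -> State:
    """The generic node: it knows nothing about what a block means."""
    state = genesis
    for block in blocks:
        state = execute_block(state, block)
    return state


final = import_chain(counter_runtime, {}, [[("a", 1)], [("a", 2), ("b", 5)]])
print(final)  # prints {'a': 3, 'b': 5}
```

Everything chain-specific lives in the function passed in as `execute_block`; the importing loop stays the same for any chain.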
What about consensus? Well, we have that generalized as well. There’s an API provided. You can roll your own consensus mechanism if you want. We’ve taken quite a lot of time and iteration to make sure that this API can handle probably most of the interesting consensus algorithms out there. We also provide a few of our own. Rhododendron, which is our adaptation of PBFT to the blockchain context, is in there. Aura, which you might have heard of if you’ve been using Kovan or any of the Parity PoA tools, is a probabilistic-finality consensus algorithm. We’ve also included GRANDPA, our new consensus algorithm that’s going to power Polkadot, and we’re planning on adding Ouroboros and Proof-of-Work.
If I’ve convinced you it’s general, then now I need to convince you of its advantages. So what do you get? Well, you get hot-swappable, pluggable consensus. You can change your consensus mechanism on the fly. So if you want to start with Aura and then add finality down the line with GRANDPA, that’s perfectly fine. You want to switch from that to become a parachain? Well, we want to make that work as well.
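One way to picture pluggable, hot-swappable consensus is as an interface that the node calls without knowing which algorithm sits behind it. The sketch below is a deliberately tiny model that only covers author selection; the `BlockAuthoring` interface, both engines and the `Node` class are hypothetical and are not Substrate's consensus API.

```python
from typing import List, Protocol, Tuple


class BlockAuthoring(Protocol):
    """Hypothetical interface: pick the author for a given slot."""
    def author(self, slot: int, authorities: List[str]) -> str: ...


class RoundRobin:
    """Authorities take turns, one slot each."""
    def author(self, slot: int, authorities: List[str]) -> str:
        return authorities[slot % len(authorities)]


class FixedLeader:
    """The first authority authors every block."""
    def author(self, slot: int, authorities: List[str]) -> str:
        return authorities[0]


class Node:
    def __init__(self, engine: BlockAuthoring, authorities: List[str]):
        self.engine = engine
        self.authorities = authorities
        self.chain: List[Tuple[int, str]] = []   # (slot, author) per block

    def produce(self, slot: int) -> None:
        self.chain.append((slot, self.engine.author(slot, self.authorities)))

    def swap_engine(self, engine: BlockAuthoring) -> None:
        # The chain built so far is kept; only future blocks are affected.
        self.engine = engine


node = Node(RoundRobin(), ["alice", "bob", "carol"])
node.produce(0)
node.produce(1)
node.swap_engine(FixedLeader())
node.produce(2)
print(node.chain)  # prints [(0, 'alice'), (1, 'bob'), (2, 'alice')]
```

A real engine also has to cover fork choice, finality and the hand-over between algorithms, which this sketch leaves out entirely.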
It’s got a hot-upgradable, pluggable state transition function. What do I mean by hot-upgradable? Well, I mean that you can launch your chain and then down the line you can change what it is that your chain does, how it works without a hard fork.
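The mechanism behind an upgrade without a hard fork can be sketched by keeping the state transition function itself in chain state, so that a transaction can replace it. The example below is a toy model under loud assumptions: the runtime is Python source run with `exec` instead of sandboxed WebAssembly, and the `":code"` key, the transaction shapes and the function names are all made up for illustration.

```python
CODE_KEY = ":code"   # hypothetical storage key that holds the runtime source

RUNTIME_V1 = '''
def apply(state, tx):
    kind, value = tx
    if kind == "add":
        state["total"] = state.get("total", 0) + value
    elif kind == "set_code":
        state[":code"] = value
'''

RUNTIME_V2 = '''
def apply(state, tx):
    kind, value = tx
    if kind == "add":
        state["total"] = state.get("total", 0) + value
    elif kind == "double":
        state["total"] = state.get("total", 0) * 2
    elif kind == "set_code":
        state[":code"] = value
'''


def execute_block(state: dict, block: list) -> dict:
    """Generic executor: load the runtime from state, then apply the block."""
    state = dict(state)
    namespace: dict = {}
    # Toy only: exec is not a sandbox, unlike a Wasm execution engine.
    exec(state[CODE_KEY], namespace)
    for tx in block:
        namespace["apply"](state, tx)
    return state


genesis = {CODE_KEY: RUNTIME_V1, "total": 1}
after_1 = execute_block(genesis, [("add", 2), ("set_code", RUNTIME_V2)])
after_2 = execute_block(after_1, [("double", None)])
print(after_1["total"], after_2["total"])  # prints 3 6
```

Every node runs the same `execute_block`, so once the block carrying `set_code` is imported, all nodes pick up the new logic from the next block onwards without anyone shipping new node software.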
You’ve got things like accounts and balances so that you can make a cryptocurrency. That’s one of the many use cases. You can also use sessions. There’s a staking algorithm so you can use proof of stake. It’s got things like treasuries, which are a little bit like the DAO. It’s got a smart contract module, so if you want to put smart contracts in there, you just include that. It’s got various things to do with on-chain governance. At the moment we have a couple of modules for referendums and for managing a sort of delegated council. If you want to add arbitrary fungible assets, we’ve got a module for that as well. But, as I said, these things are being constantly developed and we want to turn this into a reasonably large library.
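Composing a chain out of modules can be pictured as a dispatch table: each module contributes a few calls, and the runtime is whatever set of modules you chose to include. The sketch below is a simplified model; `balances_transfer`, `assets_issue` and `build_runtime` are hypothetical names and do not mirror the real module interfaces.

```python
from typing import Callable, Dict


def balances_transfer(state: dict, origin: str, dest: str, amount: int) -> None:
    """Toy balances module call: move `amount` from origin to dest."""
    balances = state.setdefault("balances", {})
    if balances.get(origin, 0) < amount:
        raise ValueError("insufficient balance")
    balances[origin] -= amount
    balances[dest] = balances.get(dest, 0) + amount


def assets_issue(state: dict, origin: str, asset_id: str, supply: int) -> None:
    """Toy assets module call: create a new fungible asset owned by origin."""
    assets = state.setdefault("assets", {})
    if asset_id in assets:
        raise ValueError("asset already exists")
    assets[asset_id] = {origin: supply}


def build_runtime(modules: Dict[str, Dict[str, Callable]]) -> Callable:
    """Return a dispatcher that only knows the modules it was built with."""
    def dispatch(state: dict, origin: str, module: str, call: str, *args) -> None:
        modules[module][call](state, origin, *args)
    return dispatch


# A currency-only chain includes just the balances module.
currency_chain = build_runtime({"balances": {"transfer": balances_transfer}})
state = {"balances": {"alice": 10}}
currency_chain(state, "alice", "balances", "transfer", "bob", 4)
print(state["balances"])  # prints {'alice': 6, 'bob': 4}
```

Adding the assets module to another chain is then a one-line change to the dictionary passed to `build_runtime`, while a chain that leaves it out simply has no such call to dispatch to.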
So the problem is that when you’re coming up with these platforms, these APIs and protocols, you end up with a trade-off. At one end you’ve got minimum effort: I want everything just done for me, I can configure it with a JSON file and choose which bits I want, and that’s it, all sorted. At the other end, you also want to provide maximum freedom: I really want to do X, Y, and Z, and maybe you didn’t think of that when you designed the protocol and the APIs, so how can I do that?
Really, we want to get both: maximum freedom with minimum effort. The way to get both, at least the way that we’re using, is to have a multi-layered architecture. At the top level is where you’ve got the maximum amount of freedom but also the maximum amount of effort that you have to put in. There, you basically start with Substrate Core. That gives you a bunch of stuff and makes your life of building a blockchain a lot easier than it would be if you started from scratch. But you don’t get anything really ready-made for you.
In the middle you’ve got the Runtime Module Library. That basically means that you can use Core, add the runtime modules to it, and end up with something that’s much easier to get working and still has a decent amount of freedom. You can make your own runtime modules if you want. You can hack up the code if you want, that’s fine. But it is more opinionated, so you have to fall in with the architectural decisions that we made in that library.
Finally, there is Substrate Node, which is our maximum ease, maximum opinionation option. You configure it with a JSON file, and it kind of does what you expect. It deploys a blockchain pretty easily.
If you have a bare Polkadot parachain, then you provide a block validation function, which has to be in WebAssembly. Then you provide some collator nodes, which basically supply the candidate blocks, and you have to do everything else yourself. You have to do RPC, databases, syncing, all of that stuff.
If you use Substrate Core, then you provide this execute block function, basically as Wasm. You can write it in Rust, C++, whatever you want, but you still have to write it all yourself. You also have to provide any networking or block authoring stuff if what we give you isn’t sufficient. In return, you get a bunch of stuff. You get RPCs and databases. You get telemetry. You get the light client, pluggable consensus, upgradability and a bunch of other things.
If you go for the Substrate Runtime Module Library, then you get a bunch of modules to select from, but you have to select them, you have to configure them, give them parameters, and you get a load of front-end help. You get block authoring and transaction queue. You get the ability to have a JSON configuration file automatically made for you that just works with your eventual executable. You get a chain explorer that will work with all of your runtime modules. You get event tracking, which I’ll probably touch on in the workshop.
Finally, if you just use Substrate Node, then you provide a JSON config file and you get a blockchain.
So, with regards to Polkadot and Substrate, there are basically three options that you get. You get a solo chain, a solo chain and a bridge, or you get a parachain. There are differing levels of sovereignty and connectedness.
A solo chain is self-sovereign. It doesn’t connect to anything else. It just lives on its own.
A solo chain with a bridge to it (basically, bridges are modules that the solo chain listens to) can listen to Polkadot, and the bridge basically provides a level of intercommunication between the two. It retains its sovereignty, and it’s up to the bridge to explain to Polkadot what that sovereignty means. The problem with it, of course, is that you don’t get the security that Polkadot would otherwise offer.
Finally, as a parachain, you get to use the consensus of the relay chain, or of whichever relay chain you put it on if it’s not the top-level one, and you use that relay chain’s validation. You don’t need to incentivize your users economically. That job’s already done because you’re piggybacking on the relay chain’s consensus. It’s still sovereign over state transition. It’s just not sovereign over finality.
Next I want to very quickly talk about parachain set governance with Polkadot, since that seems to be something that people are quite interested in knowing about. Version one of Polkadot will probably host several dozen parachains, let us say perhaps up to 100. So that’s a limited capacity. 10 or 20 slots are to be reserved for some of the future chains, leaving perhaps 80, and those will be leased. So, it’s not a sale, it’s a lease. You put in some DOTs and you get the DOTs back when the lease is over. The idea is that there’s a quadratic curve. It’s a polynomial curve, and the price goes up as fewer parachain slots remain free.
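The lease-and-refund idea with a rising curve can be made concrete with a small model. The talk gives only the shape (quadratic, rising as free slots run out) and the fact that the deposit comes back at the end of the lease; the formula `base + scale * leased**2`, the parameter values and the `SlotMarket` class are assumptions made for illustration and are not Polkadot's actual pricing.

```python
class SlotMarket:
    """Toy parachain slot market with a refundable, quadratic deposit."""

    def __init__(self, capacity: int = 80, base: int = 1000, scale: int = 1):
        self.capacity = capacity
        self.base = base
        self.scale = scale
        self.leases = {}          # chain name -> deposit currently locked

    def price(self) -> int:
        # Hypothetical curve: grows with the square of the slots in use.
        return self.base + self.scale * len(self.leases) ** 2

    def lease(self, chain: str) -> int:
        if chain in self.leases:
            raise ValueError("chain already holds a slot")
        if len(self.leases) >= self.capacity:
            raise RuntimeError("no free slots")
        deposit = self.price()
        self.leases[chain] = deposit   # locked, not spent
        return deposit

    def end_lease(self, chain: str) -> int:
        # The full deposit is returned and the slot becomes free again.
        return self.leases.pop(chain)


market = SlotMarket(capacity=3, base=100, scale=10)
print(market.lease("a"), market.lease("b"), market.lease("c"))  # prints 100 110 140
print(market.end_lease("b"), market.price())                    # prints 110 140
```

Because the deposit is locked rather than paid, the cost of a slot in this model is the opportunity cost of the tokens for the duration of the lease, and freeing a slot lowers the price for the next taker.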
Parachains can be added by edict of referendum. So, you can just have a referendum to say, “Yeah, we, the people, the coin holders, believe this parachain should be introduced.” Or you can take up one of those slots by putting down DOT tokens and specifying a desired lease period. Leases can be exchanged, i.e. if a parachain is coming to the end of its period you can make an arrangement whereby you take the lease over from them. Once a lease has ended, the chain is retired. If all of the slots are leased, the slot goes to an auction. If they’re not, then it just gets added back into the free slots and the effective price for taking a parachain slot goes down.
So, the roadmap: we’re very close to releasing PoC-3. Substrate is going to have a release once we’ve integrated our new consensus, and that will be called the 1.0 Beta. PoC-4 is scheduled for some time between January and February next year, and that will include inter-chain communication. We are also planning on making the 1.0 release candidate for Substrate, with the 1.0 release following basically after the key components have been audited.
Beyond that, the future gets a little bit more difficult to predict, but basically PoC-5 should see something of a bringing together of the Polkadot parachain collator software and the Substrate software. So basically Substrate nodes can start to be the collator nodes that start off a parachain consensus. While that’s going on, we’ll be working on things like the Ethereum bridge and a few other infrastructure components, with a hopeful date for a 1.0 release candidate at the beginning of July next year, and we’re still on for a genesis block at some point between September and October. So get Substrate and launch your chain.