The Substrate Guide I Wish I Had

Fractal’s blockchain lead Shelby Doolittle breaks down the core principles of Substrate to illustrate how we approach open-source development of Fractal Protocol for the world’s data pool.

Fractal ID Team · Fractal ID · 8 min read · Jul 6, 2021

At Fractal, we’re working on building the world’s data pool. We are achieving that primarily using a decentralized protocol for data access that allows users to host and control how their data is used. Being a decentralized protocol, we have many use cases for a blockchain.

We chose to use Substrate to build our blockchain primarily for the following reasons:

  1. Rust — I am a big fan of Rust and believe it has many benefits for improving the security of a blockchain.
  2. Open source — It’s critical for any open source project (which ours is) to build on open source foundations, so anyone can build or fork the whole project.
  3. Easy API for common code — Most code is written in pallets, which have a nice and simple API for defining logic.
  4. Full-featured — Substrate handles all aspects of a blockchain. It makes little sense to implement all the networking and fork resolution which Substrate already handles.

While Substrate is powerful, that power comes at the cost of some complexity when starting out. The APIs do a good job of hiding some of the complexity, but one still needs a good baseline in how the framework operates to decide how to approach a feature.

This is the guide I wish I had when I was just starting out. I hope to lay out a simple overview of the features critical to most development, so you can skip a day or two of poking around documentation trying to get the whole picture. I won’t go into much detail on how to do everything I mention; for that, search the official Substrate documentation for the relevant keywords.

Major Components

The Substrate framework has three top-level pieces to it: the Node, the Runtime, and Pallets.

Node

The Node is the binary that a machine will run to participate in the resulting network. It includes a native version of the Runtime, along with all the networking logic for communication with other nodes.

The only changes I have had to make to the Node beyond the default template were to configure the Pallet that implements my logic. The Node directory includes the specific configuration for the different environments/chains that share the same logic (devnet, testnet, mainnet), so you configure your Pallets for each environment in the Node.

Runtime

Getting closer to the actual “blockchain” is the runtime. This is a Rust crate that represents the entire “logic” of the chain. This is compiled to both WASM (for on-chain upgrades) and native (for better performance when running a node).

Your Runtime is declared with the construct_runtime! macro. That macro includes the various Pallets your Runtime uses to define its logic.

Similar to the Node, the actual code defining the Runtime doesn’t change much, mostly when the underlying Pallets you’re using require more injected types (that part is covered later).

Pallet(s)

Your Pallet(s) are where most of the logic of your chain will reside. Pallet is Substrate’s term for a plugin, extension, addon, etc. Pallets contain the actual procedural code that runs when users interact with your chain. Pallets also have mechanisms for interacting with other Pallets.

Most of the rest of this guide covers the different parts of Pallets. It only covers the parts I’ve had to interact with, so it’s not comprehensive.

Pallets

Pallets can define many different ways of interacting with and extending a blockchain. This mostly happens through procedural macros, like #[pallet::storage] above a type definition.

If your code uses decl_storage! { … } and other similar constructs, you’re using the old version of the macros and should strongly consider upgrading. The main difference I see is that the procedural macros still work with automatic formatting. That reason alone is enough for me. I don’t know if there are any incompatibilities.

Extrinsics

The most common thing you’ll be interacting with is “Extrinsics”. This is the name for how users will be interacting with your chain. You define a function on your Pallet type (using #[pallet::call]) and any user can now “call” that function on your blockchain.

Extrinsics have a “weight” associated with them. This is a combination of the compute, storage, and I/O needed to execute the code. Most extrinsics use a fixed amount of these resources and can return the DispatchResult type with a weight annotated in a procedural macro. For extrinsics that have a variable cost on any of the axes, you can return a DispatchResultWithPostInfo and return the weight of the extrinsic.
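
To make the fixed-versus-variable distinction concrete, here is a minimal sketch in plain Rust with no Substrate dependency. Everything in it (`Weight`, `PostInfo`, `CallError`, the constants, and `store_items`) is a hypothetical stand-in rather than a FRAME type. It only models the idea: a worst-case weight is declared up front, and a variable-cost call reports what it actually used once it has run.

```rust
/// Stand-in for Substrate's weight unit (an assumption for this sketch).
pub type Weight = u64;

/// Hypothetical costs; real values would come from benchmarking.
pub const BASE_WEIGHT: Weight = 10_000;
pub const PER_ITEM_WEIGHT: Weight = 1_000;
pub const MAX_ITEMS: usize = 100;

/// What a variable-cost call reports after it has run.
#[derive(Debug, PartialEq)]
pub struct PostInfo {
    /// `Some(w)` replaces the declared weight, `None` keeps it.
    pub actual_weight: Option<Weight>,
}

#[derive(Debug, PartialEq)]
pub enum CallError {
    TooManyItems,
}

/// Worst-case weight, known before execution (the role of the annotation).
pub fn declared_weight() -> Weight {
    BASE_WEIGHT + PER_ITEM_WEIGHT * MAX_ITEMS as Weight
}

/// A call whose cost depends on how many items it is given.
pub fn store_items(items: &[u32]) -> Result<PostInfo, CallError> {
    if items.len() > MAX_ITEMS {
        return Err(CallError::TooManyItems);
    }
    // Report the weight actually consumed, which is at most the declared one.
    let used = BASE_WEIGHT + PER_ITEM_WEIGHT * items.len() as Weight;
    Ok(PostInfo { actual_weight: Some(used) })
}
```

With these numbers, a call with three items reports far less than the declared worst case, which is the situation the post-dispatch return value exists for.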

You can have arguments to your Extrinsic, so long as they can be serialized and deserialized with the SCALE codec. Most of the time, you can #[derive(Encode, Decode)] and get something close to the optimal encoding of your struct/enum (in terms of bytes). Numbers that are usually small-ish should be encoded with the Compact<N> wrapper, which uses fewer bytes for smaller values.
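
To show why the compact form helps, here is a hand-written sketch of SCALE compact encoding for a `u32`, following the mode table in the SCALE specification. The function name `compact_encode_u32` is made up for this example. In a real Pallet you would rely on the codec crate instead of writing this yourself.

```rust
/// Sketch of SCALE compact encoding for a u32 (hypothetical helper).
/// The two lowest bits of the first byte select the mode.
pub fn compact_encode_u32(value: u32) -> Vec<u8> {
    match value {
        // Mode 0b00: one byte, values up to 2^6 - 1.
        0..=0x3F => vec![(value as u8) << 2],
        // Mode 0b01: two bytes little-endian, values up to 2^14 - 1.
        0x40..=0x3FFF => (((value as u16) << 2) | 0b01).to_le_bytes().to_vec(),
        // Mode 0b10: four bytes little-endian, values up to 2^30 - 1.
        0x4000..=0x3FFF_FFFF => ((value << 2) | 0b10).to_le_bytes().to_vec(),
        // Mode 0b11: a prefix byte holding (byte count - 4), then the raw bytes.
        // A u32 always needs exactly four bytes here, so the prefix is 0b11.
        _ => {
            let mut out = vec![0b11];
            out.extend_from_slice(&value.to_le_bytes());
            out
        }
    }
}
```

A plain `u32` always takes four bytes, while the compact form of `1` is the single byte `0x04`. The trade-off is that very large values take five bytes instead of four.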

Extrinsics should not panic, or else malicious users could attack the network by triggering compute with no associated cost. This basically means: be extra careful when using unwrap. Also, consider using the no-panic crate to force the compiler to prove that functions don’t panic.
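
As a small illustration of the "return an error instead of panicking" habit, here is a sketch in plain Rust. `TransferError` and `transfer` are hypothetical names, not part of any Pallet. The point is only that checked arithmetic plus `?` turns every failure path into a value the caller can handle.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum TransferError {
    InsufficientBalance,
    Overflow,
}

/// Moves `amount` between two balances and returns the new pair.
/// No `unwrap`, no unchecked arithmetic: every failure is an `Err`.
pub fn transfer(from: u64, to: u64, amount: u64) -> Result<(u64, u64), TransferError> {
    let new_from = from
        .checked_sub(amount)
        .ok_or(TransferError::InsufficientBalance)?;
    let new_to = to.checked_add(amount).ok_or(TransferError::Overflow)?;
    Ok((new_from, new_to))
}
```

The same shape carries over to an extrinsic: the error variants would live in the Pallet's error enum and the function would return them instead of aborting.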

Storage

Extrinsics cover execution and logic, but how do we store state? Appropriately, this is done with “Storage”. Each chain has one storage tree, with the root hash of the storage being included in the block header. Usually, you interact with the storage using types annotated with #[pallet::storage].

These types can be:

  • Values
  • Maps
  • Double maps (two keys to access a value)
  • Multi maps, i.e. a fixed, but arbitrary, number of keys to access a value (please note: this is mentioned in the docs, but doesn’t seem to exist in the relevant library)

Each storage type you define exists as a child in the storage tree. So when you’re accessing MyStorage::get("foo") you’re effectively getting the value at root -> (pallet id maybe)? -> id("MyStorage") -> id("foo"). That’s certainly not the actual algorithm, but it shows the point.
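
The path idea can be sketched as a key built from three hashed segments. This is a toy model in plain Rust: `stand_in_hash` and `storage_key` are hypothetical helpers, and the standard library's `DefaultHasher` is only a placeholder for the hashers Substrate really uses (Twox and Blake2 variants, to my understanding). Do not treat the resulting bytes as real storage keys.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Placeholder hash: NOT what Substrate uses, just deterministic and short.
fn stand_in_hash(bytes: &[u8]) -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish().to_be_bytes()
}

/// Builds a key as hash(pallet) ++ hash(storage item) ++ hash(map key).
pub fn storage_key(pallet: &str, item: &str, map_key: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(24);
    key.extend_from_slice(&stand_in_hash(pallet.as_bytes()));
    key.extend_from_slice(&stand_in_hash(item.as_bytes()));
    key.extend_from_slice(&stand_in_hash(map_key));
    key
}
```

Every entry of one storage item shares the same 16-byte prefix in this model, which is what makes "all values under MyStorage" a subtree rather than a scattered set of keys.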

Storage values are accessed using methods on the type directly, not an instance of the type. So use MyStorageValue::get() instead of my_storage_value.get(). This gets to the slightly weird part of Substrate that everything operates on effectively a global scope. Extrinsics don’t take a self parameter. This makes testing a little difficult since you are forced to set up a whole execution context, instead of injecting just what’s necessary to run your test. It’s also a little weird to get used to interacting with types instead of instances. Overall, it’s not that bad, just a bit strange in my personal take.

Also, the docs don’t really show it, but you can change the last argument of a #[pallet::storage] type from ValueQuery to OptionQuery to have missing values return None instead of their default. OptionQuery should probably be the default since it’s easy enough to unwrap_or_default.
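
The difference between the two query kinds is easy to see in a toy model. The sketch below uses a plain `HashMap` and a hypothetical `Scores` type. It is not Substrate storage, and the two getters merely imitate what ValueQuery and OptionQuery return for a missing key.

```rust
use std::collections::HashMap;

/// Toy stand-in for a storage map (hypothetical, not a FRAME type).
pub struct Scores {
    inner: HashMap<String, u32>,
}

impl Scores {
    pub fn new() -> Self {
        Self { inner: HashMap::new() }
    }

    pub fn insert(&mut self, who: &str, score: u32) {
        self.inner.insert(who.to_string(), score);
    }

    /// ValueQuery-like: a missing entry looks the same as the default value.
    pub fn get_value_query(&self, who: &str) -> u32 {
        self.inner.get(who).copied().unwrap_or_default()
    }

    /// OptionQuery-like: a missing entry is reported as `None`.
    pub fn get_option_query(&self, who: &str) -> Option<u32> {
        self.inner.get(who).copied()
    }
}
```

If "alice" has a stored score of 0 and "bob" has no entry, the ValueQuery-like getter returns 0 for both, while the OptionQuery-like getter tells them apart.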

Storage is the primary way users will query the state of the blockchain. Since the state root is included in the block, users can verify a query by downloading the relevant subtree and sibling hashes. This state storage is much cheaper than on-chain storage, so consider making additional types specifically for querying.
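
Verifying a query against a trusted root can be sketched with a simple binary Merkle tree. This is a deliberate simplification: Substrate's state is a different kind of trie with cryptographic hashes, whereas the code below uses a binary tree and the standard library's `DefaultHasher` purely as a placeholder. All names (`Digest`, `Side`, `ProofStep`, `hash_leaf`, `hash_pair`, `verify`) are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Placeholder digest; a real chain would use a cryptographic hash.
pub type Digest = u64;

pub fn hash_leaf(data: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    (0u8, data).hash(&mut h); // tag 0 marks a leaf
    h.finish()
}

pub fn hash_pair(left: Digest, right: Digest) -> Digest {
    let mut h = DefaultHasher::new();
    (1u8, left, right).hash(&mut h); // tag 1 marks an inner node
    h.finish()
}

/// Which side of the current node the sibling sits on.
pub enum Side {
    Left,
    Right,
}

pub struct ProofStep {
    pub sibling: Digest,
    pub sibling_side: Side,
}

/// Recomputes the root from a value and its sibling hashes, then compares
/// it with the root the verifier already trusts (from the block header).
pub fn verify(value: &[u8], proof: &[ProofStep], trusted_root: Digest) -> bool {
    let mut current = hash_leaf(value);
    for step in proof {
        current = match step.sibling_side {
            Side::Left => hash_pair(step.sibling, current),
            Side::Right => hash_pair(current, step.sibling),
        };
    }
    current == trusted_root
}
```

For a four-leaf tree, proving one leaf takes two sibling hashes, so the verifier never has to download the other leaves' values.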

Config

The last, most notable part of a Pallet is the “Config”. This acts much like Dependency Injection, insofar as you declare the types your Pallet requires to operate, and the Runtime is expected to provide those types.

For example, the first Pallet I made for the Fractal Protocol dealt with minting tokens to users. So the Pallet needed something that looked like a Currency. We used the “Balances” crate for that, but anything that implemented the Currency trait could have been used. The Pallet doesn’t need anything specific, so it just asks for something that acts like a currency.
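
Here is a minimal sketch of that injection pattern in plain Rust, with no Substrate dependency. The `Currency` and `Config` traits below are heavily simplified stand-ins for the real ones, and `MintingPallet`, `InMemoryBalances`, and `TestRuntime` are hypothetical names. A thread-local map plays the role of the global state that Substrate types operate on.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::marker::PhantomData;

/// Simplified stand-in for a currency interface (not the real trait).
pub trait Currency {
    fn deposit(who: &str, amount: u64);
    fn balance_of(who: &str) -> u64;
}

/// The Pallet states what it needs; the Runtime decides what it gets.
pub trait Config {
    type Currency: Currency;
}

pub struct MintingPallet<T: Config>(PhantomData<T>);

impl<T: Config> MintingPallet<T> {
    /// Note: no `self`, the call goes through types only.
    pub fn mint(who: &str, amount: u64) {
        T::Currency::deposit(who, amount);
    }
}

thread_local! {
    // Stand-in for chain state, which is effectively global.
    static LEDGER: RefCell<HashMap<String, u64>> = RefCell::new(HashMap::new());
}

/// One possible implementation the Runtime could inject.
pub struct InMemoryBalances;

impl Currency for InMemoryBalances {
    fn deposit(who: &str, amount: u64) {
        LEDGER.with(|ledger| {
            let mut ledger = ledger.borrow_mut();
            let entry = ledger.entry(who.to_string()).or_insert(0);
            *entry = entry.saturating_add(amount);
        });
    }

    fn balance_of(who: &str) -> u64 {
        LEDGER.with(|ledger| ledger.borrow().get(who).copied().unwrap_or(0))
    }
}

/// The "Runtime" wires the concrete type into the Pallet's Config.
pub struct TestRuntime;

impl Config for TestRuntime {
    type Currency = InMemoryBalances;
}
```

Calling `MintingPallet::<TestRuntime>::mint("alice", 5)` deposits through whichever currency the runtime supplied, and swapping in a different implementation requires no change to the Pallet code.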

Configuration constants are also included as types in the Config. This seems to be required mostly because Rust didn’t have stable const generics when Substrate was being built. Perhaps future versions will use const generics here instead. The Fractal Protocol minting Pallet takes constants for how frequently minting should happen (in blocks) and the maximum amount to mint.
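
A sketch of the "constants as types" idea follows, again in plain Rust. The `Get` trait here is a local re-creation of the pattern, not an import from Substrate, and the names `MintEveryNBlocks`, `MaxMintPerRound`, `Every10Blocks`, `Max1000`, and `DevRuntime` are made up for illustration. They are not the actual names in the Fractal Protocol Pallet.

```rust
/// A type that can produce a value: the pattern behind typed constants.
pub trait Get<T> {
    fn get() -> T;
}

/// Hypothetical Config carrying two constants as associated types.
pub trait Config {
    type MintEveryNBlocks: Get<u64>;
    type MaxMintPerRound: Get<u64>;
}

/// True on blocks where minting should happen.
pub fn should_mint<T: Config>(block_number: u64) -> bool {
    let interval = T::MintEveryNBlocks::get();
    interval != 0 && block_number % interval == 0
}

/// Caps a requested amount at the configured maximum.
pub fn capped_mint<T: Config>(requested: u64) -> u64 {
    requested.min(T::MaxMintPerRound::get())
}

// The runtime side: one unit struct per constant.
pub struct Every10Blocks;
impl Get<u64> for Every10Blocks {
    fn get() -> u64 {
        10
    }
}

pub struct Max1000;
impl Get<u64> for Max1000 {
    fn get() -> u64 {
        1_000
    }
}

pub struct DevRuntime;
impl Config for DevRuntime {
    type MintEveryNBlocks = Every10Blocks;
    type MaxMintPerRound = Max1000;
}
```

Writing a unit struct per constant is tedious by hand. In a real runtime, a helper macro usually generates them.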

Misc

You can run code when the chain is initialized with the GenesisBuild trait and procedural macro. This is useful if you want the Node/Runtime to be able to configure something that doesn’t make sense in the normal typed Config, or if you need to run more complex code for the initial state.

You can add custom Error messages using the #[pallet::error] procedural macro. These can easily be returned from your Extrinsics.

You can hook into various lifetime events related to the chain using #[pallet::hooks]. The Fractal Protocol uses this to perform minting at the end of every N blocks (in the on_finalize hook).
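
The "mint every N blocks in on_finalize" pattern can be modelled as follows. This is plain Rust under loud assumptions: the `Hooks` trait, the `Minting` struct, and `run_blocks` are hypothetical, and the hooks take `&mut self` here only to keep the sketch testable, whereas real Pallet hooks operate on types and global storage.

```rust
/// Simplified model of block lifecycle hooks (not the real trait).
pub trait Hooks {
    fn on_initialize(&mut self, _block_number: u64) {}
    fn on_finalize(&mut self, _block_number: u64) {}
}

/// Hypothetical minting state for the sketch.
pub struct Minting {
    pub interval: u64,
    pub amount: u64,
    pub total_minted: u64,
}

impl Hooks for Minting {
    /// Runs at the end of every block and mints on every `interval`-th one.
    fn on_finalize(&mut self, block_number: u64) {
        if self.interval != 0 && block_number % self.interval == 0 {
            self.total_minted = self.total_minted.saturating_add(self.amount);
        }
    }
}

/// Toy block loop: initialize, (extrinsics would run here), finalize.
pub fn run_blocks<P: Hooks>(pallet: &mut P, first: u64, last: u64) {
    for block_number in first..=last {
        pallet.on_initialize(block_number);
        pallet.on_finalize(block_number);
    }
}
```

Running blocks 1 through 25 with an interval of 10 mints twice, on blocks 10 and 20.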

You can publish events using the #[pallet::event] macro.

If you’re looking to do any signature checking yourself, use the sp_core crate. While there is signature checking support in other crates, it is difficult to get them to work in the no_std environment your Pallet needs to run in.

Conclusion

Hopefully, this helped you understand the basic pieces of Substrate. The Substrate documentation is very thorough, so read more info there.

I’ll be publishing technical posts about Fractal Protocol monthly. Some potential future topics:

  • Our custom codec for data — SIER (Static, runtime Interpreted, Evolutionary, Reflective)
  • Data extension proofs using Merkle Mountain Ranges
  • Plans for a high-scale sidechain

If you’re interested in what we’re working on, our project is open source; please see this link here. We do as much as makes sense in the open while trying to execute our vision as efficiently as possible with a small but dedicated team. Please reach out to us using our social channels (e.g. Telegram) if you want to contribute to our mission.

About Fractal Protocol

Fractal Protocol is an open-source, zero-margin protocol that defines a basic standard to exchange user information in a fair and open way, ensuring a high-quality version of the free internet. In its first version, it is designed to replace the ad cookie and give users back control over their data.

This article does not include elements of any contractual relationship. This article shall not be deemed to constitute a prospectus of any sort or a solicitation for investment or investment advice; nor does it in any way pertain to an offering or a solicitation of an offer to buy any securities in any jurisdiction.

For the avoidance of doubt, please note that the Protocol has not been fully developed. Any statements made about the Protocol are forward-looking statements that merely reflect Fractal’s intention for the functioning of the Protocol. There are known and unknown risks that can cause the results to differ from the forward-looking statements.

Fractal does not intend to express investment, financial, legal, tax, or any other advice, and any conclusions drawn from statements in this article or otherwise made by Fractal shall not be deemed to constitute advice in any jurisdiction.

Fractal’s intended purpose of the Tokens is to be used as means of payment for the services that will be offered within the Protocol (the “Services”). The purchase, ownership, receipt, or possession of Tokens carries no rights, express or implied, other than the right to use Tokens as a means to enable usage of Services in accordance with the then-applicable terms of use relating to the Services offered within the Protocol. The Tokens do not represent or confer any ownership right or stake, share, security, or equivalent rights, or any right to receive future revenue shares, intellectual property rights, or any other form of participation in or relating to the Protocol, Fractal, Service Providers or any of their corporate affiliates, other than any rights relating to the provision and receipt of Services, subject to the applicable terms, conditions or policies that may be adopted by participants in the Protocol.
