Dilithium: Subspace Consensus v2

Episode III — The Hitchhiker’s Guide to Subspace

Dariia Porechna
Autonomys Network
Apr 12, 2023 · 16 min read


Special thanks to Jeremiah Wagstaff, our co-founder, and Dr. Chen Feng, our Senior Research Consultant, the key contributors to the new protocol design. I would also like to acknowledge our Affiliate Research Partners, Joachim Neu, Barak Shani, and Shashank Agrawal, for their invaluable contributions.

This article is the third episode of our blog series titled “The Hitchhiker’s Guide to Subspace,” introducing the new Subspace consensus called dilithium. This blog series aims to share insights into the intuition behind our construction and design choices.

The first episode of the series discusses polynomial Proofs-of-Replication for Subspace v2 and the usage of Reed-Solomon (RS) codes and Kate-Zaverucha-Goldberg (KZG) commitments, as well as their benefits and challenges. The second episode introduces schemes combining KZG and RS in a distributed data storage context and provides an overview of existing constructions such as Polygon Avail and Ethereum Danksharding. I strongly recommend reading these two articles first for background information.

Presenting Dilithium

Dilithium is a new protocol that combines the underlying proof-of-space from the Chia protocol with erasure coding and KZG commitments. Their synergy produces a lightweight, secure, and energy-efficient proof-of-archival storage (PoAS) consensus variant. The protocol represents a significant step forward in security and user experience for the Subspace Network participants. I am excited to share our progress with you.
For those familiar with our initial consensus design, dilithium realizes all the fundamental ideas described in the original whitepaper but implements them better.

The protocol is currently undergoing a formal security audit by Security Research Labs. While some details may change following the audit, we are confident in its structure.

Additional Materials

This article will provide a high-level overview and the essential ideas behind dilithium without delving too much into the intricacies of the implementation.

However, if you are interested in the complete technical details of the protocol, consider reading the Consensus Specification. For a more academic approach, see the research paper draft by Dr. Chen Feng. This paper is the culmination of countless hours of research, analysis, and discussions, and we believe it will be an important contribution to the industry when published. Both documents, however, remain works in progress as we are still polishing the details and optimizing the numeric parameters following benchmarks.

If you prefer audio-visual content, we have also recorded a comprehensive presentation for our engineering team that provides more technical details than this article. You can access the presentation here.

We hope this article helps you understand the dilithium consensus protocol and how it differs from the previous version. If you have questions or discussion points, please don’t hesitate to contact us on the research forum.

Proof of Archival Storage

In Subspace, we define our consensus protocol to be Proof-of-Archival-Storage based on the following:

  • A Nakamoto (or longest-chain) consensus protocol
  • Employing a proof-of-capacity resource puzzle for space-bound Sybil resistance
  • The space reflects some useful storage (as in Proof-of-Replication)
  • And the specific data being replicated is the archival history of the Subspace chain

In its simplest form, our Proof-of-Archival-Storage consensus is a 3-phase protocol:

  1. Archiving phase: given new blocks of the chain, construct canonical history.
  2. Plotting phase: given the canonical history of the blockchain, generate a unique replica (the plot) and store it on disk.
  3. Consensus phase: given a challenge from a secure randomness beacon, audit the plot for a solution that satisfies some threshold, return a proof, and propose a block.

*I may interchangeably refer to the Consensus phase as the Farming or Audit phase.

To guarantee fair block lottery and uniform history replication, we have the following guidelines:

  • First, farmers should store as many different pieces of history as possible.
  • Second, the more pieces a farmer stores (or equivalently, the more space they pledge to the network), the greater their chances of producing a new block should be.

For a more in-depth discussion, consider reading the technical whitepaper. Note that the technical details mentioned in the paper require updates due to the consensus upgrade, but our goals, purpose, and design rationale remain the same.

Consensus v1 to v2

The Subspace consensus protocol has undergone significant improvements in the past months. The rationale for transitioning from a working consensus v1 to the current version is to increase security and improve the user experience. The most notable improvements to the consensus protocol are:

  • Succinct commitments: the KZG commitment scheme has replaced Merkle trees, resulting in constant-sized small commitments and proofs, even for arbitrarily large data sets, a substantial improvement over the logarithmic size of Merkle proofs.
  • Short proofs: the proof of winning block solution produced by farmers is now much shorter and efficiently verifiable, which was an essential upgrade to facilitate Subspace scalability.
  • Verifiably correct erasure coding: the KZG scheme enables the verifiability of correct erasure coding and data recoverability. This feature flows naturally from combining KZG and Reed-Solomon codes, which is significantly more complex to achieve with the Merkle proof scheme.
  • Incompressible plots: farmer plots are now incompressible, since data that is statistically close to random cannot be compressed substantially.
  • Eco-friendly farming: farming plot audits are designed with SSDs in mind, favoring small random reads over large sequential reads and optimizing for typical SSD read sizes. This approach is more favorable for smaller home farms with commodity hardware and dramatically reduces the power consumption required for stable farming.
  • ASIC-resistance: the Plotting algorithm was changed from compute-bound SLOTH to memory-bandwidth-bound Chia proof-of-space, providing better resistance to ASICs. This choice also contributes to our eco-friendliness goal by reducing energy requirements for plotting.
  • Decreased network load: the size of history pieces has significantly increased from 4 KiB to 1 MiB, reducing network requests and the computation required to recover data.

This list does not include everything; it only highlights major improvements to the Subspace consensus protocol. The enhancements are numerous throughout the stack and will be subject to dedicated posts.

KZG commitments and Reed-Solomon codes

Two primitives we use extensively are the KZG polynomial commitment scheme and Reed-Solomon erasure codes. Our Protocol Engineer, Özgün Ozerk, wrote an excellent explanatory primer on KZG, and our co-founder, Jeremiah Wagstaff, dove deep into both primitives in Episode I, so I will highlight the main points:

  • The source data is split into 254-bit chunks. This vector of chunks is uniquely represented as a list of (x, y) point-value pairs on a polynomial p(x) via Lagrange interpolation. The x coordinate is the chunk index in the data vector, and y is the chunk value interpreted as an integer.
  • Reed-Solomon erasure coding in this context means evaluating the polynomial p(x) at twice as many points. So, for example, if the source data consists of 128 chunks indexed (0, …, 127), then the extended data will consist of another 128 chunks obtained by evaluating p(x) at indices outside that range.
Erasure coding by polynomial extension
  • A KZG commitment C is, in essence, a single evaluation of p(x) at a secret point determined by a public trusted setup.
  • A KZG witness w proves that the polynomial p(x) committed to by C evaluates to a particular value y at a particular point x.

The commitment and witnesses can only be computed by a prover who knows the polynomial (the source data), but they are easy for anyone to verify. Moreover, commitments and witnesses have a constant, small size (48 bytes) regardless of the source data size, an attractive feature that has played a part in KZG adoption across the industry.
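To make the erasure-coding-by-extension idea concrete, here is a toy, self-contained sketch. It works over a small prime field instead of the KZG field, uses 4 source chunks instead of 128, and omits KZG entirely; the real implementation also uses FFT-based algorithms rather than naive Lagrange interpolation.

```python
# Toy illustration of Reed-Solomon extension by polynomial interpolation.
# The real protocol works over the KZG scalar field with 2**15-chunk records.

P = 257  # toy prime modulus (stand-in for the ~254-bit field modulus)

def lagrange_eval(points, x):
    """Evaluate, at x, the unique polynomial passing through `points` (mod P)."""
    result = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * (x - xj)) % P
                den = (den * (xi - xj)) % P
        result = (result + yi * num * pow(den, -1, P)) % P
    return result

# Source data: 4 chunks, indexed 0..3 (a stand-in for 128 chunks indexed 0..127).
source_chunks = [17, 42, 101, 7]
points = list(enumerate(source_chunks))

# Erasure-code by evaluating the same polynomial at 4 more indices, 4..7.
extended_chunks = [lagrange_eval(points, x) for x in range(4, 8)]

# Any 4 of the 8 (source + extended) chunks determine p(x) and hence the data,
# e.g. the original chunks can be recovered from the extended half alone:
recovered = [lagrange_eval(list(zip(range(4, 8), extended_chunks)), x) for x in range(4)]
assert recovered == source_chunks
```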

The synergy between KZG and RS allows us to have:

  • succinct commitments to data of arbitrary size
  • succinct witnesses of the inclusion of data fragments in blockchain history
  • efficient verification
  • provably correct erasure coding

Episode II is a deep dive into the properties and trade-offs of different ways to combine KZG and Reed-Solomon codes. It also covers the fundamental principles of Archiving and Plotting protocols design.

Archiving

Archiving is one of the three core processes in the Subspace consensus. It transforms the chain blocks at a configured depth from the tip of the chain into canonical history, ready to be distributed to farmers for storage.

When a block is confirmed, its contents are added to a raw chain history buffer. Then, the buffer is sliced into records. A record of blockchain history is a vector of 2¹⁵ chunks. A chunk is the smallest atomic unit of data measurement in Subspace. A chunk is a field element for KZG and is 254 bits in size. In most cases, we round it up to 32 bytes by padding with zeros for convenience, so a record holds 2¹⁵ × 32 bytes = 1 MiB of history data. I will mention chunks often because many of the operations in the protocol are performed over field elements.

Piece = record || KZG commitment C || KZG witness w

A piece is a record concatenated with a KZG commitment and a witness of membership in a specific segment.

The Archiving process produces segments of pieces.
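As a rough sketch of the resulting byte layout (the names and the dataclass are illustrative, not the actual implementation):

```python
from dataclasses import dataclass

RECORD_CHUNKS = 2**15        # chunks per record
CHUNK_SIZE = 32              # bytes per chunk (254-bit field element, zero-padded)
RECORD_SIZE = RECORD_CHUNKS * CHUNK_SIZE   # 1 MiB of history data
COMMITMENT_SIZE = 48         # KZG commitment, bytes
WITNESS_SIZE = 48            # KZG witness, bytes

@dataclass
class Piece:
    """Piece = record || KZG commitment C || KZG witness w (illustrative layout)."""
    record: bytes        # RECORD_SIZE bytes of raw history data
    commitment: bytes    # COMMITMENT_SIZE bytes: commitment to the record's chunks
    witness: bytes       # WITNESS_SIZE bytes: proof of membership in a segment

    def to_bytes(self) -> bytes:
        return self.record + self.commitment + self.witness

assert RECORD_SIZE == 1 * 1024 * 1024                      # 1 MiB
PIECE_SIZE = RECORD_SIZE + COMMITMENT_SIZE + WITNESS_SIZE  # 1 MiB + 96 bytes
```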

Archived segment structure

Key Idea

The Archiving construction is primarily based on the paper Information Dispersal with Provable Retrievability for Rollups described in the previous episode.

The first key idea we inherit from this paper is to view the data to be archived as a matrix rather than a simple flat array.

The second key idea is what we refer to as a “1.5D approach” with column-wise erasure coding and row-wise KZG commitment, which proves to be surprisingly robust compared to the naive “flat” approach.

Combining the polynomial nature of Reed-Solomon erasure coding with the homomorphic properties of KZG guarantees consistency, retrievability, and efficient verifiability of the archived data.

Deep Dive

The blockchain history data is eligible for archiving when it reaches the confirmation depth, ensuring it is not affected by forks and reorganizations. We currently archive segments of raw blockchain history 128 MiB in size, and the archiving process is triggered as soon as there are enough blocks at least 100 blocks deep from the tip (the current confirmation depth) to fill a segment. The Archiver performs the following steps:

  1. Slice the Recorded History Segment into 128 source records, stacked on top of one another into a matrix so each row is a record.
  2. Commit to the chunks of each source record (row) under the KZG vector commitment scheme.
  3. Erasure code each column by interpolating a polynomial over the source record chunks in that column and evaluating that polynomial on twice as many points. As a result, the matrix now has twice as many rows (256) and consists of 128 source and 128 extended records.
  4. Erasure code the source record commitments similarly, by interpolating a polynomial over the 128 source commitments and evaluating that polynomial on twice as many points.
    Step 4 allows us to show that the erasure coding of data was performed correctly: the secret ingredient is the homomorphic property of KZG. As a result, the extended commitments (the yellow ones on the diagram below) obtained by erasure-coding the source commitments are the same as if we were to commit to the extended rows.
    After step 4, the Archiver has produced 256 records and 256 commitments to those records.
  5. To tie them together into a segment, commit to the record commitments and obtain the segment commitment.
  6. Compute a witness for each record commitment inclusion in the segment commitment.
    With this two-tier commitment, we can later show that a given chunk belongs to a specific record and that record belongs to an archived segment and, thus, to the canonical history of the chain.
  7. Build 256 pieces: each piece consists of a record of 1 MiB, a 48-byte commitment to record data and a 48-byte witness of inclusion in a segment.
  8. Append the new pieces to the canonical history and store the segment commitment in the chain state.

The segment commitment is also included in the next segment, linking each segment back to the previous one.
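The following sketch traces the shape of the data through steps 1 to 8. The `commit`, `witness`, and `extend_column` functions are placeholders for the real KZG and Reed-Solomon operations, so the homomorphic equality from step 4 is only noted in a comment rather than demonstrated.

```python
import hashlib

SOURCE_RECORDS = 128   # rows in the source matrix (128 x 1 MiB = 128 MiB segment)

def commit(chunks) -> bytes:
    """Placeholder for a KZG vector commitment (the real one is 48 bytes, homomorphic)."""
    return hashlib.blake2b(repr(chunks).encode(), digest_size=48).digest()

def witness(commitments, index) -> bytes:
    """Placeholder for a KZG witness of inclusion at `index` under a commitment."""
    return hashlib.blake2b(repr((index, commitments)).encode(), digest_size=48).digest()

def extend_column(column):
    """Placeholder for Reed-Solomon extension of a 128-entry column to 256 entries."""
    return column + column  # the real code interpolates and evaluates at 128 more points

def archive(segment_rows):
    # Steps 1-2: one row per source record, one KZG commitment per row.
    source_commitments = [commit(row) for row in segment_rows]

    # Step 3: erasure code each column, doubling the number of rows to 256.
    columns = [extend_column(list(col)) for col in zip(*segment_rows)]
    extended_rows = [list(row) for row in zip(*columns)]

    # Step 4: erasure code the 128 source commitments into 256 commitments.
    # By KZG homomorphism, these equal direct commitments to the extended rows.
    record_commitments = extend_column(source_commitments)

    # Steps 5-6: commit to the record commitments, then one witness per record.
    segment_commitment = commit(record_commitments)
    witnesses = [witness(record_commitments, i) for i in range(len(extended_rows))]

    # Step 7: piece = record || commitment || witness.
    pieces = list(zip(extended_rows, record_commitments, witnesses))
    # Step 8: append pieces to canonical history, store segment_commitment in state.
    return pieces, segment_commitment

# Toy usage: 128 "records" of 4 chunks each (real records have 2**15 chunks).
pieces, segment_commitment = archive([[r, r + 1, r + 2, r + 3] for r in range(SOURCE_RECORDS)])
assert len(pieces) == 256
```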

Archived Segment construction (detailed). Green squares denote source record KZG commitments; yellow squares denote erasure-coded commitments.

History Post-processing

Archiving is part of a more extensive, continuous process of post-processing history that happens alongside the chain growth. Here’s a brief overview of how it works:

  • Once we have enough new history 100 blocks deep from the tip of the chain, it is archived into a segment, as described.
  • The resulting pieces are then distributed to farmers for storage.
  • The segment commitment is added to a state table of all archived segment commitments.
  • The segment commitment is added back to history by being included in the next segment produced.

This process continues indefinitely, perpetually and consistently preserving the history of the chain.

Plotting

Once a segment has been archived and the pieces are ready for the farmers to store, the next phase is Plotting.

Plotting is the process of creating and maintaining plots on a disk. During this phase, the pieces are gathered and organized into a plot of several sectors. Each sector contains an encoded replica of a uniformly random sample of pieces across all archived history. This sampling ensures that the data is distributed among the farmers proportionally to their pledged disk space and replicated evenly.

Farmer plot sector structure

The Plotting protocol used in dilithium has undergone a complete redesign from v1. The new algorithm is based on two core ideas: erasure coding and memory-bandwidth-bound encoding. Erasure coding helps protect the data against loss in the event of failures and network partitions. Memory-bandwidth-bound encoding is a more eco-friendly and economical alternative to proofs-of-work while providing provable time/memory trade-offs and security guarantees. Combining these two ideas allows us to create unique and provable replicas for each farmer that are difficult to fake with computation or to compress. This scheme also makes auditing and verifying the plots easier, ensuring the history data is recoverable.

Key Idea

The memory-bandwidth encoding construction comes from the paper Beyond Hellman’s Time-Memory Trade-Offs with Applications to Proofs of Space, which predates the Chia protocol. We adopt a custom implementation of the Chia Proof-of-Space plotting function as a memory-bandwidth-bound function to encode, or “mask,” the pieces in the farmer plot.

In short, the PoS plotter generates a table of permuted outputs from a set of random functions. The size of the table is determined by a memory bandwidth requirement parameter, k, and the random functions are determined by a seed. When challenged at an index, the table outputs a short proof-of-space that can be efficiently verified.

To simplify, consider the PoS table as a black box. You can request a proof value for any index, and the table will provide one.
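Treating the table as a black box, its interface might look like the sketch below. The class and the hash-based internals are purely illustrative stand-ins: in the real protocol, generating the table is memory-bandwidth-intensive and governed by the parameter k, while answering and verifying queries is cheap.

```python
import hashlib
from typing import Optional

class PosTable:
    """Illustrative black-box stand-in for a Chia-style proof-of-space table."""

    def __init__(self, seed: bytes, k: int):
        # The real table is generated from `seed` using memory bandwidth
        # determined by the parameter k; this toy version just stores them.
        self.seed, self.k = seed, k

    def find_proof(self, index: int) -> Optional[bytes]:
        """Return a short proof for `index`, or None if no proof exists there."""
        digest = hashlib.blake2b(
            self.seed + index.to_bytes(8, "big"), digest_size=32
        ).digest()
        # Model the fact that only some indices have proofs.
        return digest if digest[0] % 2 == 0 else None

def verify(seed: bytes, k: int, index: int, proof: bytes) -> bool:
    # In the real protocol, verification is cheap and does not rebuild the
    # table; this toy check simply recomputes the same hash.
    return PosTable(seed, k).find_proof(index) == proof
```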

Structure of Chia PoS table

We do not use the Proof-of-Space directly to verify that a farmer has pledged a certain amount of space as Chia does. Instead, we use it to prove that a farmer utilized the required memory bandwidth for encoding the plot.

Deep Dive

A plot can cover the entire disk or span across multiple disks, and there is no limit to the amount of storage a farmer can pledge to the network. Plots consist of equally-sized sectors, currently around 1 GiB each. Each sector is a pseudorandom selection of 1,000 pieces, uniformly sampled throughout history up to that point. For example, if the farmer creates a new sector when the history consists of 4,000 pieces, the 1,000 pieces for this sector will be a uniform selection from the existing 4,000.
In addition, the farmer must save the current history size, as it will determine the point in the future when the sector will need to be updated with newer pieces.
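As a sketch of the sampling idea (the actual index derivation in the implementation differs; this only shows a deterministic, reproducible uniform sample over the history available at plotting time):

```python
import hashlib

PIECES_PER_SECTOR = 1000

def sector_piece_indices(public_key: bytes, sector_index: int, history_size: int):
    """Pseudorandomly (but reproducibly) pick piece indices for one sector."""
    indices = []
    for i in range(PIECES_PER_SECTOR):
        seed = public_key + sector_index.to_bytes(8, "big") + i.to_bytes(8, "big")
        digest = hashlib.blake2b(seed, digest_size=8).digest()
        # Uniform sample over all pieces archived so far.
        indices.append(int.from_bytes(digest, "big") % history_size)
    return indices

# Example: a sector plotted when the history holds 4,000 pieces samples from those 4,000.
indices = sector_piece_indices(b"farmer-public-key", 0, history_size=4000)
assert all(0 <= i < 4000 for i in indices)
```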

Verifiable Raw Sector Construction

Once the farmer obtains all 1,000 pieces for this sector from the network, they can create an encoded replica. Only the piece’s historical data, the record part, is encoded. The commitment and witness included in a piece are saved separately in the sector metadata, as they will be needed later for farming.

Each record is encoded separately and sequentially. For each record, the Plotting algorithm performs the following steps:

  1. Derive a unique pseudorandom and verifiable seed.
  2. Based on this seed, generate a proof-of-space table using memory bandwidth resources set by the global protocol memory requirement parameter k. This memory-intensive computation prevents malicious farmers from creating replicas after the new block challenge is announced, making it more rational for them to store the replica rather than try to compute it on the fly every time.
  3. Given the PoS table for this record, the farmer must:
    a. Derive a starting index for lookup, which will also serve as the starting index for the extended (or erasure-coded) record.
    b. Query the table for enough (2¹⁵) proof-of-space values to mask every chunk of the record.
Challenging the PoS table for proofs

  4. Erasure code (extend) the record data by interpolating a polynomial over chunks of the record and evaluating it over the lookup indices for the PoS table from the previous step.

  5. Encode each extended record chunk by XOR-masking it with the corresponding proof-of-space value.

Record extension and encoding process
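Below is a simplified sketch of the per-record encoding loop. The seed derivation, the PoS lookups, and especially the erasure-coding step are stand-ins (the real plotter extends the record over the table's lookup indices and works over field elements, not raw 32-byte strings), but the XOR-masking in the last step mirrors step 5 directly.

```python
import hashlib

CHUNK_SIZE = 32  # bytes per chunk

def derive_seed(public_key: bytes, sector_index: int, record_index: int) -> bytes:
    """Step 1: a unique, verifiable seed for this record's PoS table (illustrative)."""
    return hashlib.blake2b(
        public_key + sector_index.to_bytes(8, "big") + record_index.to_bytes(8, "big"),
        digest_size=32,
    ).digest()

def pos_proof(seed: bytes, index: int) -> bytes:
    """Stand-in for querying the memory-bandwidth-bound PoS table (steps 2-3)."""
    return hashlib.blake2b(seed + index.to_bytes(8, "big"), digest_size=CHUNK_SIZE).digest()

def extend_record(chunks):
    """Stand-in for step 4: Reed-Solomon extension of the record's chunks."""
    return chunks  # the real code interpolates p(x) and evaluates at the lookup indices

def encode_record(record_chunks, seed: bytes):
    """Step 5: XOR-mask each (extended) chunk with its proof-of-space value."""
    extended = extend_record(record_chunks)
    return [
        bytes(a ^ b for a, b in zip(chunk, pos_proof(seed, i)))
        for i, chunk in enumerate(extended)
    ]

# Decoding is the same XOR, so a farmer who rebuilds the PoS table can recover the record.
```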

After all records in the sector have been encoded as described, the farmer spreads them into s-buckets chunk-wise. Ultimately, each bucket will contain chunks from all records. The first bucket will have the first chunks of each record, the second bucket will have the second chunks, and so forth. The s-buckets are then written to disk, and the plotting process is complete.

Writing encoded records plot to disk
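Spreading the encoded records into s-buckets chunk-wise is essentially a transposition, as in this minimal sketch:

```python
def into_s_buckets(encoded_records):
    """bucket[i] holds the i-th chunk of every encoded record in the sector."""
    return [list(ith_chunks) for ith_chunks in zip(*encoded_records)]

# Toy example: 3 "records" of 4 tiny "chunks" each.
records = [[b"a0", b"a1", b"a2", b"a3"],
           [b"b0", b"b1", b"b2", b"b3"],
           [b"c0", b"c1", b"c2", b"c3"]]
buckets = into_s_buckets(records)
assert buckets[0] == [b"a0", b"b0", b"c0"]   # first chunks of each record
assert buckets[1] == [b"a1", b"b1", b"c1"]   # second chunks, and so on
```

With 1,000 records of 32-byte chunks per sector, one bucket works out to roughly 32 KiB, which lines up with the single small random read per audit described in the Farming section.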

Each bucket represents a potential winning ticket in the block proposer lottery. For each challenge, a farmer will scan one s-bucket containing one chunk of each record they store in a sector and see whether any of them are eligible to win a block.

As a result, a farmer has a unique encoded replica that is difficult to compress or compute on demand. An economically rational farmer is incentivized to store as many honestly encoded replicas as possible to maximize their chances of winning a block.

Farming

Farming is participating in the consensus by solving a puzzle based on a previously created plot.

The plot audits are designed to be SSD-friendly, favoring small random reads over large sequential reads and optimizing for the typical SSD read size. For instance, a farmer only needs to read ~32 KiB at a random location per sector to check for a solution, and they only need to read more if their solution is eligible to win the block.

Farming process

After the farmer plots at least one sector, they can begin farming. The farmer derives their local challenge from the global randomness beacon in each time slot. As soon as they observe a new challenge, the farmer audits each plot sector for a solution and proposes a block if they find one, as follows:

  1. For each sector, derive a corresponding audit index.
  2. Read the s-bucket from the disk at that index.
  3. Scan it for a chunk within the acceptable solution range from the block challenge.
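A sketch of the audit loop; the index derivation and the solution-range check are simplified stand-ins for the actual protocol rules, and `read_bucket` is a hypothetical callback for the single random disk read.

```python
import hashlib

def audit_index(global_challenge: bytes, sector_id: bytes, num_buckets: int) -> int:
    """Step 1: derive which s-bucket to read for this sector (illustrative)."""
    digest = hashlib.blake2b(global_challenge + sector_id, digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_buckets

def audit_sector(read_bucket, global_challenge: bytes, sector_id: bytes,
                 num_buckets: int, solution_range: int):
    """Steps 2-3: read one bucket (~32 KiB) and scan it for a winning chunk."""
    index = audit_index(global_challenge, sector_id, num_buckets)
    bucket = read_bucket(index)          # one random read from disk
    target = int.from_bytes(global_challenge[:8], "big")
    for record_index, chunk in enumerate(bucket):
        # Simplified eligibility check: "distance" from the challenge within range.
        value = int.from_bytes(chunk[:8], "big")
        if abs(value - target) <= solution_range:
            return record_index, chunk   # candidate winning solution
    return None
```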

Remember that each bucket contains chunks from all records. Therefore, if the farmer honestly stores all encoded records in the sector, they have the highest chance of success.
Suppose the farmer finds a winning chunk. Then they perform the following steps to prove their solution is valid:

  1. Identify which record the winning chunk belongs to from its index.
  2. Read all other encoded chunks of this record from the disk.
  3. Compute the corresponding proof-of-space table the same way as during Plotting.
  4. Query the table for proof values.
  5. Decode the record by XORing it with those proofs.
  6. Compute a KZG witness that the winning chunk belongs to a piece in the archived history.

Finally, the farmer can propose a block with the attached solution and proof of a valid solution.
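Putting the proving steps together in a heavily stubbed sketch (`pos_proof` and the `kzg_witness` callback are stand-ins): the key point is that decoding is the same XOR used during plotting, performed with proofs from a freshly recomputed PoS table.

```python
import hashlib

def pos_proof(seed: bytes, index: int, size: int = 32) -> bytes:
    """Stand-in for re-querying the PoS table rebuilt during proving (steps 3-4)."""
    return hashlib.blake2b(seed + index.to_bytes(8, "big"), digest_size=size).digest()

def decode_record(encoded_chunks, seed: bytes):
    """Step 5: XOR each encoded chunk with its proof value to recover the record."""
    return [
        bytes(a ^ b for a, b in zip(chunk, pos_proof(seed, i, len(chunk))))
        for i, chunk in enumerate(encoded_chunks)
    ]

def prove_solution(encoded_chunks, winning_index: int, seed: bytes, kzg_witness):
    """Steps 1-6: decode the record and attach a KZG witness for the winning chunk."""
    record = decode_record(encoded_chunks, seed)   # steps 2-5
    witness = kzg_witness(record, winning_index)   # step 6 (stand-in callback)
    return record[winning_index], witness
```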

Updatable plots

As the chain grows, we need a way to ensure that new data is replicated as much as the older data in the blockchain history. To keep the replication factor constant, the farmers must periodically update their plots by repopulating sectors with a new selection of pieces. Recall that when plotting a sector, the farmer saves the current history size, and it determines a point in the future when the sector will expire. When a sector reaches its expiry point, the block proposer challenge solutions coming from this sector will no longer be accepted by other peers, incentivizing the farmer to update their plot. They erase the expired sector and repeat the Plotting process anew, replicating a fresh history sample.

Sector expiration and replotting

In a whole plot, the sectors will be updated randomly, one at a time, so replotting is amortized over a long period. There is never a moment when a farmer needs to erase and re-create their whole plot and miss out on challenges. Plot refreshing will be nearly unnoticeable to the farmer and will not interrupt their participation in consensus.

Next steps

The implementation of dilithium is underway, and the deployment to the Gemini testnet is expected in the coming weeks.

Once deployed to the Gemini testnet, the Subspace team will be able to gather valuable feedback and fine-tune the protocol before launching the final version. This feedback loop will ensure the protocol is robust, efficient and secure when it goes live on Mainnet Beta. We are excited about the progress made so far and look forward to the upcoming public deployment.

In parallel, we are working closely with Security Research Labs and Supranational to conduct a comprehensive audit of the security and ASIC resistance of Plotting. This crucial step guarantees that the protocol is secure and that its implementation is as efficient and effective as possible, ensuring optimal use of the farmer hardware dedicated to the network.

In addition, we are excited to announce that we will be sharing several open research opportunities on our Research forum for collaboration with the wider academic community. Such collaboration will enable us to leverage the wealth of knowledge and expertise of scholars from different fields to advance our research and development efforts.

Furthermore, we are committed to exploring other avenues for partnership and collaboration with entities across the industry that share our vision and mission to create a more robust, secure, and open ecosystem for all farmers and builders. Please don’t hesitate to reach out if you have any questions, suggestions, or concerns!
