Hard-drive space as a tangible, scarce Sybil resistance resource: the PoST protocol
Proof of Space-Time (PoST) is a new cryptographic primitive, designed to replace Proof of Work (PoW) as a proof-of-resource-consumption scheme, and so as a Sybil-resistance mechanism for permissionless cryptocurrencies. It was introduced in 2016 by Tal Moran and Ilan Orlov in a paper titled “Simple Proofs of Space-Time and Rational Proofs of Storage.”
A PoST allows a prover to convince a verifier that they expended a certain amount of a “space-time” resource: a specific amount of space allocated over a specific time period, during which the space cannot be used for anything else. PoST requires significantly less energy than PoW, since the “difficulty” can be increased by extending the time period over which the space is allocated, without introducing additional ongoing computational work.
PoST is a two-phase protocol. In the first phase, the initialization phase (executed once), the prover commits to generate and store some pseudorandom data, derived from a specific seed. In the second phase, the execution phase (executed periodically), the prover proves that they still have access to the data at some particular point in time.
Unfortunately, the protocol does not allow a prover to prove they stored the data, since an adversary can instead store only the initial seed (which can be used to regenerate the data). To overcome this problem, generating the data involves computational work, which can be parameterized such that the cost of regenerating the data is much higher than the cost of simply storing it.
This leads to the definition of the PoST resource as a trade-off between space-time and computational work, where under reasonable cost assumptions, a rational user will always prefer to use the lower-cost space-time resource over computational work (while at the same time a non-rational user has greater cost and does not gain any advantage). It means that there may be a strategy which does not require a prover to expend space over time, but if the parameters are set correctly, this strategy will be significantly more expensive.
For this reason, using PoST in the context of a cryptocurrency, where profit is the main motive for participation, we can guarantee that rational users would prefer to use the space-time resource, so it can be claimed that the cryptocurrency is energy efficient. However, if the initialization cost became lower than the cost of storing the data due to fluctuation in the market price of compute power relative to storage, the protocol essentially devolves to PoW. To avoid that, a market-based mechanism can be used to determine the difficulty level, similar in spirit to the difficulty adjustment in PoW-based cryptocurrencies such as Bitcoin.
PoST vs PoS
Two very similar proposals for Proof of Space (PoS), with similar intentions, have been previously published (1, 2). But unlike Proof of Space, the PoST definition takes into account amortization attacks (using the same space resource for different proofs), rationality of using the space-time resource over CPU work, and an explicit time element (ensuring that the space is committed for a fixed length of time). Moreover, PoST uses a different and much simpler technique, making use of the fact that we explicitly allow a space-time tradeoff, and only relies upon the security of cryptographic hash functions.
Unlike existing Proof of Space constructions, the computational difficulty of the PoST initialization phase can be adjusted without linearly increasing either the amount of space or the verification cost. Hence it is possible to use it to prove that a reasonable amount of storage has been committed over a long period of time. In addition, in the case of an adjustment in difficulty (due to CPU and storage market relative price fluctuation), PoST supports simple incremental difficulty adjustment — that is, already-initialized users only have to pay the marginal work cost between difficulty levels, instead of rerunning the entire initialization phase from scratch. A downside of PoST is that the prover must read its entire storage for every proof, so the algorithm requires linear time, while in Proof of Space this is not required, so the algorithm runs in polylog time. This makes Proof of Space suitable for producing proofs in short intervals (e.g., every 10 seconds), while PoST is suitable for long ones (e.g., weeks or even months).
Construction
To allow public verifiability, and to ensure that the data generated in the initialization phase is cheaper to store than to regenerate, generating this data must entail PoW. To avoid the prover having to send all of the data to the verifier, the prover doesn’t construct one large PoW, but rather a table containing many entries, where each entry is a PoW that can be independently verified.
Phases
- Initialization — the prover performs computational work that results in pseudorandom data, generated with respect to a specific seed, organized as table entries. The seed may be a public key, so that an adversary which does not hold the corresponding private key cannot claim custody over the same data.
- Execution — the prover receives a challenge that cannot be predicted, reads (or recomputes) the entire data, and generates a proof in response to the challenge to prove that the data exists at the time of invoking the algorithm.
The execution phase can be repeated multiple times without rerunning the initialization phase. This point is critical, since the initialization phase requires work, while the execution phase is energy efficient (i.e., it requires relatively few computational steps). Thus, although using PoST to generate a single proof does not give any advantage over PoW, using it to generate many such proofs allows the work required during the initialization phase to be amortized over all of the proofs, which makes the construction energy efficient.
Proving
Proving occurs on every iteration of the execution phase, as a response to a received challenge, and at the end of the initialization phase, when the challenge is empty. This makes proving interactive between the prover and a verifier, except during the initialization phase, where the challenge is fixed (i.e., empty) and non-interactive.
The time element of the space-time resource is the elapsed time between successive phases of generating, and chaining, the PoST: the initialization phase, or the previous execution phase, and the latest execution phase. Unlike the space element, the time element is not publicly verifiable (i.e., depending solely on the content of the proof), but rather requires a verifier to measure the elapsed time between phases. However, it can be made non-interactive and publicly-verifiable by adding a Proof of Elapsed Time (PoET) to the construction.
To generate a proof, the prover commits to the data by constructing a Merkle tree whose leaves are labeled with the table entries. Each leaf label contains concatenated group of entries, and each internal node’s label is the output of a hash function on the concatenation of its children’s labels, salted with the challenge. This forces the prover to commit to its knowledge of the leaves at the time it generates the proof.
Once constructed, the prover needs to provide the opening of the Merkle tree commitment (Merkle proofs, i.e., Merkle paths to the root of the tree) for a random set of 100 entries. The indices for these entries can be selected by the verifier as a response to the prover commitment (Merkle root). But this option isn’t used, because it can easily be made non-interactive by using the Fiat-Shamir heuristic: the Merkle root hash is used to deterministically derive the random set of indices.
The prover cannot merely generate the data for the selected entries and skip all others, because they are forced to commit to the entire table and to a specific Merkle root before the indices are known. If they happened to “skip” an entry which will end up being selected, the proof would be invalid. This means that the prover is forced to either generate a large fraction of the table, or spend a lot of computational work trying to find a Merkle root that will produce a “good” set of indices. By setting the parameters correctly, we ensure that in the second case the amount of work the prover must do is more than the initialization cost.
The publicly-verifiable proof contains the Merkle root, the data for the selected entries from the PoW initialization phase, and their commitment openings (Merkle proofs). Common labels from the 100 distinct Merkle paths are eliminated to reduce proof size.
Verifying
The verifier first derives the same pseudorandom set of indices for 100 entries from the Merkle root, similarly to the prover, using the Fiat-Shamir heuristic. Then she computes the Merkle root from the commitment openings to verify that it matches the expected root value. Then she extracts the chosen entries from the proven leaves (each containing a group of entries), and verifies that the PoW is valid for their respective indices and seed.
Why generic PoW isn’t sufficient
Generating the data in the initialization phase involves computational PoW, but the standard definitions of PoW do not rule out an adversary that can store a small amount of data and use it to regenerate the entire data set with very low computational overhead. Thus, to ensure the prover must indeed store the entire data set, the protocol is constructed using the abstract notion of incompressible proof-of-work (IPoW), a more restrictive variant of PoW.
The most popular form of PoW, the Hash-Preimage PoW, was introduced in Hashcash and is being used in Bitcoin and Ethereum. It involves finding a certain value (called a nonce) that, when added to some fixed data (e.g., the contents of a block, known here as the preimage) and hashed, satisfies a certain difficulty criteria. This is not yet an incompressible PoW: although compressing the hash output is impossible, compressing the preimage isn’t. By constructing the optimal compression scheme and instructing honest users to use it, it can be turned into IPoW.
However, the Hash-Preimage IPoW sets an upper limit on the amount of storage we can fill for a given initialization cost (number of hashes): with a predetermined sequence of values, one just needs to store a single bit for each item, indicating whether the item is a “good” preimage or not. Thus, due to the compression scheme, each hash invocation can “contribute” at most a single bit to the total storage.
The Partial-Hash IPoW
In order to be able to fill more space without increasing the initialization cost, a different IPoW is needed. Partial-Hash IPoW is a simple solution where the amount of work for every IPoW is always a single hash invocation, but the amount of storage the hash output would fill is adjustable (from 1 bit up to the entire output length). It is considered incompressible because, although the hash input might be smaller than the storage the output fills, the required CPU work for regenerating the output cannot be made cheaper, since it is already just a single hash invocation.
Verification of the Partial-Hash IPoW also requires a hash invocation, which makes it equal in cost to the proof generation. This makes the Partial-Hash IPoW highly non-desirable for any construction besides ones in which the prover proves many distinct IPoWs, while the verifier is required to verify only a small portion of them — as in PoST.
PoST deterministic nature
The entire PoST construction, from the Partial-Hash IPoW for generating the data to the Merkle tree commitment and openings, is completely deterministic: the number of required hash invocations for any given space-time resource is known in advance. This makes PoST different in nature from PoW, where successfully generating a proof is probabilistic. In contrast, PoST allows us to guarantee that anyone spending the required resources is eligible to produce a valid proof. Thus PoST can be used to replace the lottery-based mechanism for block production (and so for rewards) in PoW with a deterministic one, and doesn’t make leader-election a required part of the consensus protocol. This means that PoST can be used to create a race-free cryptocurrency protocol.
Related links
- Tal Moran presentation: https://www.youtube.com/watch?v=Hkdm0dLKZQI
- Spacemesh implementation: https://github.com/spacemeshos/post
Special thanks to Lane Rettig, Barak Shani, Tal Moran and Yael Hoffman for feedback and editing!