Engineering Incentives for Data Storage as a Commodity: Filecoin

Caleb Tuttle
Published in OpSci, Dec 10, 2021 (7 min read)

This article is part of an ongoing series on decentralized file storage for science, introduced in “Rich in Data, Poor in Wisdom: Science Needs a Decentralized Data Commons.”

Filecoin is a disruptive technology that uses incentive engineering to drive down the cost of data storage through crowdsourced storage providers. Its incentive engineering also discourages bad actors and low-quality data while increasing availability, resilience, and accountability. By creating an open marketplace for data storage, Filecoin has provided a solution that costs (as of writing) only 0.02% as much as Amazon S3 Infrequent Access. Here we outline the mechanisms that power this blockchain-based storage and retrieval market. We discuss the basic kinds of Filecoin actors and interactions before diving into Filecoin’s unique proofs, a depiction of how these mechanisms come together, and the Filecoin+ incentive layer.

Types of Actors and Interactions on Filecoin

There are two types of participants on the Filecoin network: clients and miners. Clients pay storage miners for storage services and retrieval miners for retrieval services. Miners compete with each other to provide these services. Filecoin facilitates exchanges between these two types of actors even when all parties are anonymous. These actors interact through both the Filecoin blockchain and off-chain channels.

Most interactions occur on the Filecoin blockchain and are therefore recorded as transactions. Interactions in the retrieval market are the exception: they happen off-chain and use payment channels. Each transaction on Filecoin is either a simple transfer of FIL (the cryptocurrency native to the Filecoin blockchain), an ask order, a bid order, or a deal (source). Miners create ask orders in which they offer services for certain prices. Clients create bid orders in which they offer FIL for certain services. When a miner’s ask and a client’s bid match, a deal is created, committing both parties to the exchange. Every deal is either a storage deal or a retrieval deal. “Storage deals are agreements between clients and storage miners to store some data in the network,” while “retrieval deals are agreements between clients and retrieval miners … to extract data that is stored in the network” (source).
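The matching of asks and bids can be illustrated with a small sketch. This is not Filecoin's actual order book: the Ask, Bid, and Deal classes, their field names, and the "cheapest eligible ask wins" rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Ask:
    miner: str
    price_per_gib: float  # FIL per GiB the miner asks for (hypothetical unit)
    space_gib: int        # space the miner is offering


@dataclass(frozen=True)
class Bid:
    client: str
    max_price_per_gib: float  # most the client is willing to pay
    size_gib: int             # amount of data the client wants stored


@dataclass(frozen=True)
class Deal:
    miner: str
    client: str
    size_gib: int
    price_per_gib: float


def match(bid: Bid, asks: List[Ask]) -> Optional[Deal]:
    """Pair a bid with the cheapest ask that fits it, or return None."""
    eligible = [
        a for a in asks
        if a.price_per_gib <= bid.max_price_per_gib and a.space_gib >= bid.size_gib
    ]
    if not eligible:
        return None
    best = min(eligible, key=lambda a: a.price_per_gib)
    return Deal(best.miner, bid.client, bid.size_gib, best.price_per_gib)


asks = [Ask("m1", 0.5, 100), Ask("m2", 0.3, 10), Ask("m3", 0.4, 50)]
deal = match(Bid("c1", 0.45, 20), asks)
```

Here m1 is too expensive and m2 has too little space, so the bid is matched with m3. A bid that no ask satisfies produces no deal.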

How Miners are Incentivized

To ensure miners store what they agree to store, Filecoin requires that storage miners prove via cryptographic interactive proofs that they are storing the data. Proofs are submitted to the Filecoin blockchain so that other Filecoin nodes can verify them. Miners who successfully prove they are storing the data they agreed to store are rewarded with FIL, and miners who fail are penalized. Filecoin’s success depends on this incentive system and, thus, on the cryptographic proofs that power it. Filecoin uses two kinds of proofs to determine whether storage miners are actually storing what they agree to store: Proof of Replication (PoRep) and Proof of Spacetime (PoSt). Below, we provide a brief overview of these proofs.

Proof of Replication

“In Proof of Replication, a storage miner proves that they are storing a physically unique copy, or replica, of the data” (source). In the lifetime of a storage deal, PoRep occurs only once, at the beginning. The purpose of proving that a replica is being stored is to “prevent the Sybil Attack, the Outsourcing Attack, and the Generation Attack” (source). Let us consider these attacks and how PoRep prevents them.

A storage miner successfully executes a Sybil Attack when they convince a client they have stored n copies of some data even while they have stored fewer than n copies. Filecoin’s PoRep addresses Sybil Attacks by requiring every single copy of data received by a storage miner to be uniquely encoded (this uniquely encoded copy is the replica). Thus, no matter how many pseudonyms an attacker has, they must prove they have a unique replica for every copy they claim to store.
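To make "uniquely encoded" concrete, here is a toy sketch in which each copy is XORed with a keystream derived from a miner identifier and a copy index. The functions toy_replica and commitment are hypothetical names, and the encoding is a stand-in only: Filecoin's real encoding is deliberately slow, whereas this one is fast, so the sketch shows uniqueness and nothing about timing.

```python
import hashlib


def toy_replica(data: bytes, miner_id: str, copy_index: int) -> bytes:
    """XOR data with a keystream derived from (miner_id, copy_index).

    Toy stand-in for replica encoding; applying it twice restores the data.
    """
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        seed = f"{miner_id}:{copy_index}:{counter}".encode()
        keystream.extend(hashlib.sha256(seed).digest())
        counter += 1
    return bytes(d ^ k for d, k in zip(data, keystream))


def commitment(replica: bytes) -> str:
    """Short fingerprint of a replica that could be published for checking."""
    return hashlib.sha256(replica).hexdigest()


data = b"genome sequencing run 42"
replica_0 = toy_replica(data, "miner-a", 0)
replica_1 = toy_replica(data, "miner-a", 1)
```

The two replicas hold the same underlying data but differ byte for byte, so one stored replica cannot be passed off as two copies.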

PoRep’s solution to Outsourcing and Generation Attacks is the same. A storage miner completes an Outsourcing Attack when they convince a client they have stored some data but have actually outsourced the storage to another provider. A storage miner completes a Generation Attack when they convince a client they have stored a copy of some data but hold only a program that can generate the data. To prevent these attacks, PoRep relies on the time it takes to encode data, i.e., the time it takes to produce a replica. Creating a replica takes significantly longer than sending one that already exists. An attacker who fetches or generates the data just in time must still encode it into a replica before responding, so the attacker’s response time is notably longer than that of an honest storage miner. If a storage miner’s response takes as long as encoding would, the miner fails to prove it is storing the data. This scheme makes Outsourcing and Generation Attacks nearly impossible.
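The timing argument can be sketched with simulated durations rather than real clocks. The Timing class, the function names, and the numbers below are hypothetical. The only assumption carried over from the text is that encoding a replica takes far longer than reading one, so a timeout can be chosen between the two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    read_seconds: float    # time to read an already-stored replica
    encode_seconds: float  # time to re-create a replica from raw data


def response_time(stores_replica: bool, timing: Timing) -> float:
    """Simulated time a miner needs to answer a challenge."""
    if stores_replica:
        return timing.read_seconds
    # An outsourcing or generation attacker must first rebuild the replica.
    return timing.encode_seconds + timing.read_seconds


def verifier_accepts(elapsed: float, timeout: float) -> bool:
    """The verifier rejects any response slower than the timeout."""
    return elapsed <= timeout


timing = Timing(read_seconds=2.0, encode_seconds=3600.0)
timeout = 60.0  # chosen between read time and encode time

honest_ok = verifier_accepts(response_time(True, timing), timeout)
attacker_ok = verifier_accepts(response_time(False, timing), timeout)
```

With these illustrative numbers the honest miner answers in time and the attacker does not. The scheme only works while the gap between reading and encoding stays large.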

Proof of Spacetime

While PoRep proves something is stored at some specific point in time, Proof of Spacetime (PoSt) proves something is stored over a period of time. One approach to PoSt would be to require a PoRep from storage miners at frequent intervals, say once every minute, but this would take up too much bandwidth. In Filecoin’s PoSt, a storage miner creates a proof-chain: “a verifiable data-structure that chains together a sequence of challenges and proofs” (source). That is, the storage miner sequentially generates proofs of storage and uses a hash function to link the proof at time t to the proof at time t - 1. For any sector of disk space, a compressed PoSt is submitted to the blockchain for verification at least once every 24 hours (source). Because these proofs require significant computation, it is infeasible for a storage miner to retrieve the data just before a proof needs to be submitted. Thus, even though proofs are submitted only once every 24 hours, Filecoin guarantees that a storage miner is storing content throughout the period.
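A proof-chain of this kind can be sketched with a plain hash function. This is a simplified illustration, not Filecoin's construction: build_chain and verify_chain are hypothetical names, each "proof" is just a hash of the challenge and the replica, and the verifier here recomputes the chain from the replica, which a real verifier would not need to hold.

```python
import hashlib
from typing import List


def build_chain(replica: bytes, seed: bytes, rounds: int) -> List[bytes]:
    """Generate proofs sequentially; each challenge depends on the previous proof."""
    proofs = []
    challenge = seed
    for _ in range(rounds):
        proof = hashlib.sha256(challenge + replica).digest()  # toy proof of storage
        proofs.append(proof)
        challenge = hashlib.sha256(proof).digest()  # links step t to step t - 1
    return proofs


def verify_chain(replica: bytes, seed: bytes, proofs: List[bytes]) -> bool:
    """Toy check by recomputation (assumes the verifier has the replica)."""
    return proofs == build_chain(replica, seed, len(proofs))


chain = build_chain(b"replica-bytes", b"epoch-seed", rounds=5)
tampered = chain[:2] + [b"\x00" * 32] + chain[3:]
```

Because every challenge is derived from the previous proof, the proofs cannot be computed in parallel or ahead of time, which is what ties the chain to elapsed time.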

The Filecoin Ecosystem

Let’s consider how these actors, interactions, and proofs form the Filecoin ecosystem. First, people or organizations with extra disk space become storage miners by pledging disk space to the network through a pledge transaction. A pledge specifies how much disk space the miner has available; it also includes collateral (in FIL), which is returned to the miner as the miner provides proofs of storage but kept from the miner if the miner fails a proof. After pledging, the storage miner creates ask orders specifying the prices for its storage services.

A person or organization who needs these storage services sees the ask orders on the blockchain. The client and storage miner negotiate a deal, which is then submitted to the blockchain. The data is sent to the storage miner, who submits a PoRep and then periodically submits PoSts. In the deal, the client locks up full payment for the service, and this payment is distributed to the storage miner throughout the lifetime of the deal as the miner submits successful proofs. During the deal’s lifetime, the client can get the data by entering a deal with a retrieval miner. When the deal terminates, the storage miner no longer has an incentive to store the data.
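The flow of locked payment and pledged collateral over a deal's lifetime can be sketched as a small ledger. The DealEscrow class, its fields, the equal per-period payout, and the flat penalty are hypothetical simplifications. Amounts are integers in an arbitrary smallest unit to avoid rounding issues.

```python
from dataclasses import dataclass


@dataclass
class DealEscrow:
    total_payment: int   # locked up front by the client
    collateral: int      # pledged by the storage miner
    proof_periods: int   # number of proofs due over the deal's lifetime
    paid_out: int = 0
    slashed: int = 0
    periods_elapsed: int = 0

    def record_proof(self, success: bool, penalty: int = 0) -> None:
        """Release one installment on success, or slash collateral on failure."""
        if self.periods_elapsed >= self.proof_periods:
            raise ValueError("deal has already terminated")
        self.periods_elapsed += 1
        if success:
            self.paid_out += self.total_payment // self.proof_periods
        else:
            cut = min(penalty, self.collateral)
            self.collateral -= cut
            self.slashed += cut


escrow = DealEscrow(total_payment=100, collateral=40, proof_periods=4)
for ok in (True, True, False, True):
    escrow.record_proof(ok, penalty=10)
```

After three successful proofs and one failure, the miner has received three of the four installments and lost part of its collateral. Once all periods have elapsed, no further payouts occur, which mirrors the miner's incentive ending with the deal.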

Engineered Incentive Structures for Storing High Quality Data with Filecoin+

In addition to the base layer of incentivization, Filecoin has a social trust incentive layer, Filecoin+, that encourages the storage of high-quality data. This incentive layer introduces a resource called DataCap and a type of actor called a Notary. DataCap is a resource that clients can use to pay storage nodes, and when storage nodes accept DataCap as payment, they “earn better block rewards […] over time.” Notaries are the initial stewards of DataCap. These are select parties who grant DataCap to clients whose storage use-cases are deemed especially valuable. Storage deals that use DataCap are called verified deals. The decentralized storage service Estuary combines Filecoin with the caching of IPFS, and all Filecoin deals done through Estuary are verified deals. Filecoin+ adds value to the Filecoin ecosystem by simultaneously reducing storage costs for some clients who are storing high-quality data and increasing payments to nodes who store this data.
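One way to picture how verified deals translate into "better block rewards" is to weight verified storage more heavily when computing a miner's share of rewards. The sketch below is an assumption-laden simplification: the function names are hypothetical, the multiplier of 10 is an assumed parameter not stated in this article, and the protocol's actual power calculation also accounts for deal duration, which is ignored here.

```python
from typing import Dict

VERIFIED_MULTIPLIER = 10  # assumption: illustrative weight for verified deals


def adjusted_power(raw_gib: int, verified_gib: int,
                   multiplier: int = VERIFIED_MULTIPLIER) -> int:
    """Weight verified storage more heavily than ordinary storage."""
    if not 0 <= verified_gib <= raw_gib:
        raise ValueError("verified_gib must be between 0 and raw_gib")
    return (raw_gib - verified_gib) + verified_gib * multiplier


def reward_shares(powers: Dict[str, int]) -> Dict[str, float]:
    """Split rewards in proportion to each miner's adjusted power."""
    total = sum(powers.values())
    return {name: power / total for name, power in powers.items()}


powers = {
    "plain": adjusted_power(100, 0),      # no verified deals
    "verified": adjusted_power(100, 50),  # half the space holds verified deals
}
shares = reward_shares(powers)
```

Both miners commit the same raw space, but under these assumptions the miner holding verified deals receives a much larger share of rewards.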

Conclusion

Filecoin is a peer-to-peer cloud storage marketplace where anonymous storage providers and clients can trust each other to keep their promises because of the incentive system built on the Filecoin blockchain. Clients’ and storage miners’ promises are enforced by on-chain deals. Clients can trust storage miners because the miners are rewarded for proving they are storing the data and punished for failing such proofs, while storage miners can trust that they will be paid because the protocol requires clients to lock in full payment upfront. Filecoin+ improves this ecosystem by incentivizing storage of high-quality data. Filecoin can be used by any person or organization that needs to store data.

Join the Decentralized Open Science Movement

Does the idea of a free, open internet of science strike a chord with you? Consider joining the Opscientia community to learn, connect, and collaborate with others building a commons for co-discovery.

References

Benet, J., Dalrymple, D., & Greco, N. (2017). Proof of Replication. Protocol Labs. Retrieved November 4, 2021, from https://filecoin.io/proof-of-replication.pdf

Protocol Labs. (2017). Filecoin: A Decentralized Storage Network. Retrieved November 5, 2021, from https://filecoin.io/filecoin.pdf

How Filecoin Works | Filecoin Docs. (2021, November 25). Retrieved November 25, 2021, from How Filecoin Works | Filecoin Docs

Filecoin Tutorial | Verifying Storage on Filecoin (Lesson 3) | ProtoSchool. (n.d.). ProtoSchool. Retrieved November 2, 2021, from Filecoin Tutorial | Verifying Storage on Filecoin (Lesson 3) | ProtoSchool

Filecoin Spec. (n.d.). Retrieved November 6, 2021, from Home | Filecoin Spec

Filecoin Plus | Filecoin Docs. (2021, August 17). Retrieved November 22, 2021, from Filecoin Plus | Filecoin Docs

Originally published at https://hack.opsci.io on December 10, 2021.
