Decentralized organizations should have reliable ways to save user and application data. The storage architecture must tolerate network partitions and outages and provide control over who can save, or “pin”, data. It also needs to be easy to deploy and maintain.
One solution is an IPFS Cluster that uses Ethereum smart contracts for pinning authentication. This post goes over how this could work. It is written in the context of the Aragon ecosystem, but we’re working to make the general strategy and code useful to a larger audience.
IPFS Cluster lets IPFS daemons operate like a single unit with a shared pinset. Each cluster node has (among other things) an IPFS daemon and a parallel IPFS Cluster Service, which coordinates the shared pinset, replicates files across nodes, and collects health data.
The IPFS Cluster Service exposes endpoints for operations on the cluster as a whole. For example, you can add a file to the underlying daemon and ask the other nodes in the cluster to add it in one call.
IPFS Cluster Controller is a client application for performing cluster operations via IPFS Cluster Service’s endpoints.
The experimental CRDT implementation of IPFS Cluster allows nodes to come and go freely from the cluster (in the current Raft implementation, it’s recommended to have a predefined number of cluster nodes). Additionally, it enables nodes to join a cluster as “replication nodes”, which can provide extra storage for the cluster without having the ability to change the shared pinset. At IPFS Camp the record for largest cluster was set at 26 nodes.
Running a cluster without additional services satisfies our requirement for robust IPFS storage that survives network partitions and outages. If there’s a divide in the cluster network, the cluster will reach consensus on the pinset once nodes reconnect. If a single node goes down, an application can fetch data from one of the other cluster nodes. Additionally, replication nodes can donate storage space — for example, a person with an empty server at home could join a “climate change cluster” to back up important data. This is a specific use case that is important to our friends at the IPFS Consortium and AVADO.
Teams can develop custom authentication modules on top of IPFS Cluster Service via proxy servers.¹
The proxy server intercepts requests to the IPFS Cluster Service and checks if the request to update the cluster’s shared pinset is valid. Developers are free to implement whatever authentication schemes they choose.
After surveying a few teams, we think authenticating pin requests based on Ethereum addresses is a common use case. Request Network has this need, for example. This is achievable in four steps:
- In the request to pin data to the cluster, the sender includes a signed message.²
- The proxy server intercepts the request and uses the signature to recover the Ethereum address that signed the message.
- The proxy server checks a smart contract to see if the Ethereum address used to sign the message is authorized to pin data to the cluster.
- If the address is authorized, the proxy server lets the request pass through to the IPFS Cluster Service. If not, it sends back an error and the file is not pinned.
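The four steps above can be sketched as a single decision function. This is a minimal illustration, not the proxy's actual code: `PinRequest`, `authorize_pin`, and the two hooks `recover_signer` and `is_authorized` are hypothetical names, and the HTTP status codes are our own choice. In a real proxy, `recover_signer` would wrap an Ethereum signing library and `is_authorized` would read from the smart contract.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class PinRequest:
    cid: str        # content identifier the sender wants pinned
    message: str    # the text that was signed (expected to mention the CID)
    signature: str  # signature over `message`, as sent by the client


def authorize_pin(
    request: PinRequest,
    recover_signer: Callable[[str, str], str],  # hypothetical: (message, signature) -> address
    is_authorized: Callable[[str], bool],       # hypothetical: address -> allowed by the contract?
) -> Tuple[int, str]:
    """Return an (HTTP status, reason) pair for an incoming pin request."""
    # Step 1: the request must carry a signed message that covers the CID.
    if not request.signature or request.cid not in request.message:
        return 400, "missing signature or CID not covered by signed message"

    # Step 2: recover the Ethereum address that produced the signature.
    try:
        signer = recover_signer(request.message, request.signature)
    except ValueError:
        return 400, "malformed signature"

    # Step 3: ask the smart contract whether that address may pin.
    if not is_authorized(signer):
        return 403, f"{signer} may not pin to this cluster"

    # Step 4: let the request through to the IPFS Cluster Service.
    return 200, "forward to IPFS Cluster Service"
```

Passing the two checks in as functions keeps the flow independent of any particular signing library or contract, which also makes it easy to exercise with stubs.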
The Aragon ecosystem has additional needs beyond authenticating Ethereum addresses, which will involve more complex solutions. For example, a DAO might want to give another smart contract permission to pin data (e.g. a token manager contract could have permission to pin), or use the outcome of an arbitrary event to determine pinning permissions. We’re still figuring out the best ways to cater to these use cases, but there are generally two strategies: using Ethereum events and provable EVM scripts. The appendix at the end of this article has an overview of these and how they might work.
We see this architecture as a viable storage option for Aragon DAOs. When configuring a new DAO, users should have the option to set up a custom cluster and tailor authentication to their organization’s specific needs. Organizations wouldn’t have to rely on Aragon Flock teams to host their data or spend hours trying to configure IPFS nodes with custom authentication.
Along with easy setup and configuration, we imagine a client-side application that shows DAO members the health and storage capacity of their cluster, and allows admins to update pinning permissions. IPFS Cluster comes with a lot of built-in graphing endpoints, making it easy to display informative interfaces about your cluster. ClusterLabs provides some open source inspiration for these views.
We’re interested in understanding how we can integrate economic incentives into this architecture. For example, an organization running a cluster with extra capacity could rent excess storage to a DAO that isn’t ready to host their own infrastructure. Additionally, we’d love to explore how a Filecoin integration could be useful to DAOs.
We’re researching two alternative ways to authenticate requests via Ethereum — event logs and evmscripts.
One idea, taken from the IPFS Consortium, is to emit an event when you want to pin data. The emitted event should contain a hash representing the data to pin. The proxy server would listen for “Pin” events and, upon hearing one, decode the event logs and add the CID from the event log to the cluster’s shared pinset.
One challenge with this strategy is handling situations where the cluster attempts to pin a CID whose file has not yet been added to any cluster node. This scenario would happen frequently.
For example, imagine our DAO contains a forum application that’s built on Ethereum and IPFS. We want anyone to be able to post to the forum, but we don’t want anyone to be able to freely pin data to our IPFS cluster. This is a unique situation because any user is indirectly allowed to pin their data to IPFS, since it represents a forum post and is coming from the forum application.
To achieve this, any time the forum’s `Post` smart contract method is called, it emits a “Pin” event, which our proxy server is listening for.
Our proxy server hears this Pin event, gets the CID, and attempts to add it to the cluster’s shared pinset. But this could cause an issue: the actual file represented by this CID has not been added to any cluster node yet. As a result, the cluster node has to find the associated file via the IPFS DHT. If the file is stored on a random local node that’s offline, the content may never get added to any cluster node, and future requests to fetch it would fail.
We’re researching ways to solve this problem. One is to enhance IPFS garbage collection so that we have more control over when blocks are removed from an IPFS node. If done right, an unauthorized user could add (not pin) a file to the cluster for a predetermined period of time. By the time the proxy hears of the Pin event, one of the cluster nodes will have the file, so the cluster can successfully pin it.
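To make the timing problem concrete, here is a rough sketch of how a proxy might defer a pin until some cluster node actually holds the content. Everything here is an assumption for illustration: `PinEventHandler`, the `cluster_has` and `pin` hooks, and the retry limit are hypothetical and do not correspond to an existing IPFS Cluster API.

```python
from typing import Callable, Dict, List


class PinEventHandler:
    """Pins CIDs announced by "Pin" events once a cluster node holds the data."""

    def __init__(
        self,
        cluster_has: Callable[[str], bool],  # hypothetical: does any cluster node hold this CID?
        pin: Callable[[str], None],          # hypothetical: add the CID to the shared pinset
        max_attempts: int = 3,
    ) -> None:
        self.cluster_has = cluster_has
        self.pin = pin
        self.max_attempts = max_attempts
        self.pending: Dict[str, int] = {}  # CID -> failed attempts so far
        self.abandoned: List[str] = []     # CIDs we gave up on

    def on_pin_event(self, cid: str) -> None:
        """Called when a decoded "Pin" event log yields a CID."""
        self.pending.setdefault(cid, 0)
        self.retry_pending()

    def retry_pending(self) -> None:
        """Try every waiting CID once; call again later (e.g. on a timer)."""
        for cid in list(self.pending):
            if self.cluster_has(cid):
                self.pin(cid)
                del self.pending[cid]
                continue
            self.pending[cid] += 1
            if self.pending[cid] >= self.max_attempts:
                # The content never showed up; stop retrying.
                self.abandoned.append(cid)
                del self.pending[cid]
```

The retry limit plays the role of the predetermined period described above: content that was added but not pinned has that long to be confirmed by an event before it is dropped.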
A second method we’re researching would allow clients to construct proofs that the proxy auth server can use to determine if a request is valid or not. To reuse the previous example about an Aragon forum application, imagine if a user could communicate this:
“Hello proxy auth server! I know I’m not normally allowed to pin data to your cluster, but in this particular instance, I’m adding a new post to your forum app. Here’s my proof: [proof]!”
The proxy server knows how to verify the supplied proof, and if all checks out, will pin the data from the user to the cluster, even if this user isn’t normally allowed to.
These proofs are evmscripts. The process of “verifying a proof” is to run the evmscript in a sandbox and see if the transaction would have executed. If the transaction executed, it means the proof checks out, and the data should be pinned. If the transaction reverts, the proof is invalid and the data should not be pinned. This can all be made possible with aragon.js and Aragon forwarders. Alternatively, this proof could be computed via the Aragon ACL.
We loved working with a larger community to develop and refine these ideas. Thank you to Hector and Michelle from Protocol Labs, Brett (Sohkai) and Gorka from Aragon One, Olivier from Aragon Black, Sponnet from AVADO, the entire Autark team, and the many others who have been extremely helpful to us.
¹: Research is being conducted around dynamically updating the trusted peers array in the IPFS Cluster configuration. This feature could enable a wider range of authentication patterns but would require an additional layer of consensus within a cluster.
²: The message to be signed should include the CID to be pinned, along with some type of salt, so that a signature captured over the wire can’t be reused to pin other CIDs or replayed to pin the same data again.
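As a sketch of the second note, the signed message could carry the CID and a random salt, with the proxy remembering which salts it has already accepted. The JSON layout and the names `build_pin_message` and `ReplayGuard` are illustrative assumptions, not a defined format; signature verification itself is out of scope here.

```python
import json
import secrets
from typing import Set


def build_pin_message(cid: str) -> str:
    """Client side: build the text to sign (field names are illustrative)."""
    return json.dumps({"cid": cid, "salt": secrets.token_hex(16)}, sort_keys=True)


class ReplayGuard:
    """Proxy side: accept each signed message at most once, and only for its own CID."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()  # salts already used; in-memory for this sketch

    def accept(self, message: str, requested_cid: str) -> bool:
        try:
            payload = json.loads(message)
        except json.JSONDecodeError:
            return False
        if not isinstance(payload, dict):
            return False
        salt = payload.get("salt")
        # The signed CID must match the CID the request asks to pin.
        if payload.get("cid") != requested_cid:
            return False
        if not isinstance(salt, str) or not salt:
            return False
        # A salt that was seen before means the message is being replayed.
        if salt in self._seen:
            return False
        self._seen.add(salt)
        return True
```

A production proxy would need to persist or expire the seen salts; the in-memory set is only there to show the check.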