FileDrive Meetup #1: What is FIL+? How to store PiB-level data on Filecoin?
On March 30, the first event of FileDrive Online Meetup came to a successful conclusion.
In this event, team members shared and discussed popular topics in the Filecoin ecosystem.
If you missed the live event, check out this review article!
Filecoin Plus is a pragmatic solution that serves the needs of the Filecoin network by adding a social trust layer and introducing a new resource called DataCap, in order to make the network more decentralized and improve the quality of user storage.
Recently, the FIP-0019: Snap Deals proposal included in the v15 network upgrade and the third round of notary elections have once again drawn the community's attention to Filecoin Plus.
In this Meetup, Laura, a core member of FileDrive Labs, shared the latest news and progress on Filecoin Plus.
Recently, the Filecoin Foundation released the Modification: Notary selection process — selecting notaries based on regional demand.
This proposal aims to allow more Notaries from different regions to participate in the distribution process of DataCap and promote the improvement and development of the network.
Laura also cleared up several common misunderstandings about Filecoin Plus in her talk.
PiB-level data storage
Loading large datasets into the Filecoin network is still a multi-step process that is technically demanding and complex.
For customers with PiB-level datasets, it is difficult to process such volumes of data and store them successfully using existing tools and software services.
This makes data onboarding challenging for large clients and is an obstacle to adopting Filecoin as a storage solution.
In the developer session of this Online Meetup, Felix, a core developer at FileDrive Labs, gave an in-depth technical talk on PiB-level data storage.
-How to use Filecoin to store PiB-level data?
In the Filecoin network, data is ultimately sealed into sectors in the form of CAR files and stored on the network.
CAR stands for Content Addressable aRchive, a serialization of the underlying data structure of IPFS.
IPFS derives a unique identifier for each piece of data from its content. The data covered by a single storage order can therefore be identified by its content identifier, which makes it easy to verify through the blockchain that the data the user intended to store matches the data the service provider actually stored.
At the same time, storing data with storage providers in the form of CAR files keeps it interoperable with the IPFS network.
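The content-addressing idea described above can be illustrated with a minimal sketch. It uses a plain SHA-256 hex digest as a stand-in for a real content identifier; actual IPFS identifiers are encoded differently, and the function names `content_id` and `matches` are hypothetical, chosen only for this example.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a stand-in content identifier."""
    return hashlib.sha256(data).hexdigest()


def matches(expected_id: str, stored: bytes) -> bool:
    """Check that the stored bytes hash to the identifier the client recorded."""
    return content_id(stored) == expected_id


original = b"example dataset chunk"
recorded = content_id(original)  # what a client would record alongside an order

print(matches(recorded, original))         # identical bytes give the same identifier
print(matches(recorded, original + b"!"))  # any change gives a different identifier
```

Because the identifier depends only on the bytes, the client and the provider can compare identifiers instead of comparing the data itself.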
Although there are still a few technical difficulties in storing data on the Filecoin network, they will no doubt be overcome in the future.
Storage providers in the Filecoin network usually use 32GiB or 64GiB sectors as their storage units.
However, each storage order stores one piece of data, and its size cannot exceed the storage provider's sector size (32GiB/64GiB).
Therefore, the most pressing problem in storing PiB-level data on the Filecoin network is how to divide the data so that each piece fits the storage provider's sector size.
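To give a sense of scale, the sketch below computes a lower bound on the number of pieces needed for a 1 PiB dataset. It assumes every piece fills a sector exactly and ignores any padding or packaging overhead, so real counts would be somewhat higher; `min_pieces` is a hypothetical helper name.

```python
GIB = 1024 ** 3
PIB = 1024 ** 5


def min_pieces(dataset_bytes: int, sector_bytes: int) -> int:
    """Lower bound on the piece count when each piece fills one sector exactly."""
    if sector_bytes <= 0:
        raise ValueError("sector_bytes must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return (dataset_bytes + sector_bytes - 1) // sector_bytes


print(min_pieces(PIB, 32 * GIB))  # 32768
print(min_pieces(PIB, 64 * GIB))  # 16384
```

Even under these optimistic assumptions, a single PiB has to be divided into tens of thousands of pieces, which is why the splitting step needs tooling.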
-How do storage providers split the dataset into multiple blocks that can meet the sector size?
1. By packaging tar and then splitting
- The processing method is relatively simple and easy to operate.
- Resources and time are consumed in the process of tar packaging and segmentation.
- When storing to Filecoin, data import is still required.
- To retrieve any data from a tar package, whether large or small, the whole package must first be retrieved in full.
2. Go-Graphsplit developed by FileDrive Labs
- The split data is in CAR file format, and no import process is required.
- The directory structure of the original dataset is basically preserved, and partial data retrieval is supported.
- The splitting process also converts the data format, so it requires more memory on the machine doing the work.
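The size constraint behind both approaches can be sketched as a simple planning step. This is not the algorithm used by Go-Graphsplit and it does not build CAR files; it is only an illustration, with the hypothetical function `plan_chunks`, of assigning files to chunks that never exceed a size limit while keeping each file's path so the directory structure can be recovered.

```python
from typing import Iterable, List, Tuple

# (path, offset, length): a byte range of one file assigned to a chunk
Segment = Tuple[str, int, int]


def plan_chunks(files: Iterable[Tuple[str, int]], limit: int) -> List[List[Segment]]:
    """Assign files, in order, to chunks whose total size never exceeds `limit`.

    A file that does not fit in the space left in the current chunk is cut
    into byte ranges that continue in the next chunk. Empty files are skipped.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: List[List[Segment]] = []
    current: List[Segment] = []
    free = limit
    for path, size in files:
        offset = 0
        while offset < size:
            take = min(free, size - offset)
            current.append((path, offset, take))
            offset += take
            free -= take
            if free == 0:  # chunk is full, start a new one
                chunks.append(current)
                current, free = [], limit
    if current:
        chunks.append(current)
    return chunks


# Toy sizes in bytes; a real limit would be close to the sector size.
plan = plan_chunks([("a/x.bin", 6), ("a/y.bin", 3), ("b/z.bin", 5)], limit=8)
for i, chunk in enumerate(plan):
    print(i, chunk)
```

Keeping the path and byte range for every segment is what would let a retrieval tool fetch only the chunks that contain a given file, rather than the whole dataset.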
-About data retrieval
Fast Retrieval keeps a copy of the data in addition to the copy sealed into sectors, so the data can be retrieved without unsealing the sectors.
However, retrieving data directly from the Filecoin network takes a long time, and the amount of data retrieved at one time is large, so it is better suited to cold data.
Hot data is generally served from the IPFS network instead. However, an IPFS node is limited by the storage capacity of a single physical machine.
To solve this problem, FileDrive Labs developed Go-DS-Cluster, based on the IPFS stack. It supports horizontal expansion of an IPFS node's storage, so that storage capacity is no longer limited by the physical storage of a single device.
It aims to combine distributed key-value datastores into a cluster, adding data sharding and improving data management.
Even a single IPFS peer can then take advantage of decentralized storage without being limited by the I/O of a single machine.
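The general idea of spreading one key-value datastore across several backends can be sketched as follows. This is an illustration under simple assumptions, not Go-DS-Cluster's actual design: the class `ShardedStore` is hypothetical, in-memory dictionaries stand in for storage nodes, and keys are placed by hashing the key and taking the result modulo the node count.

```python
import hashlib
from typing import Dict, List, Optional


class ShardedStore:
    """A key-value store that spreads keys across several backend nodes."""

    def __init__(self, node_count: int) -> None:
        if node_count < 1:
            raise ValueError("node_count must be at least 1")
        # Each dict stands in for the datastore of one storage node.
        self.nodes: List[Dict[str, bytes]] = [{} for _ in range(node_count)]

    def _node_for(self, key: str) -> int:
        """Pick a node deterministically from a hash of the key."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def put(self, key: str, value: bytes) -> None:
        self.nodes[self._node_for(key)][key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self.nodes[self._node_for(key)].get(key)


store = ShardedStore(node_count=4)
for i in range(1000):
    store.put(f"block-{i}", f"payload-{i}".encode("utf-8"))

print([len(node) for node in store.nodes])  # keys held by each node
print(store.get("block-42"))
```

Callers see a single store, while capacity and I/O are spread over all nodes. Simple modulo placement moves most keys when the node count changes, so a production system would typically use a more stable mapping such as hash slots or consistent hashing.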
Finally, in the Q&A session of the Online Meetup, participants and contributors from the community actively raised questions and joined the discussion, which made for a lively, interactive atmosphere.
In the future, we will hold the FileDrive Online Meetup every month to share the latest developments within the Filecoin network.
As a technical team actively building tools, applications, and infrastructure in Web3, we welcome anyone interested in Filecoin and FileDrive to join us and help grow the Filecoin ecosystem!
The following is the full video link of this FileDrive Online Meetup: