Can cold data be lost on IPFS?

First, a quick word about data itself. Data is commonly divided into cold data and hot data. So does data have a temperature?

Put simply, hot data is data that gets accessed a lot; it is always busy. Cold data gets almost no visitors, like a house with few carriages at its front door.

Hot data: online data that needs to be frequently accessed by computing nodes.

Cold data: offline data that is not accessed frequently, such as enterprise backup data, business and operation logs, and statistics.

These two access frequencies lead to different design goals when building a data store. One simple sentence sums it up:

Hot data is computed close to where it is used; cold data is stored centrally.

Is it possible to lose cold data in the IPFS network?

Data loss is the most important pain point for every storage method. Traditional data centers generally rely on two approaches to keep data from being lost.

I. Improve the security of data storage

More stable servers (such as Dell’s R730xd or R740xd), each costing more than 50,000 (hard disks not included), with dual power supplies and dual CPUs.

Better hard disks, such as enterprise-class helium drives, which cost two to three times as much per terabyte as ordinary consumer drives. RAID 5 / RAID 50 can rebuild the data of a failed disk from parity.

Independent power feeds from two power plants, so that the failure of one plant still leaves enough redundancy.

UPS power to ride through outages. Higher-grade machine rooms also use diesel generators to ensure that power is still uninterrupted after 12 hours.

II. Keep more copies

This one is easy to understand. An enterprise will not keep truly important cold data only in the cloud. There may be several copies on company computers, several on company servers, and several with each major cloud provider. That way, the copies are unlikely to all be lost at the same time.

From this we can see that in traditional storage, keeping a piece of data from being lost is very expensive. The cost shows up like this: reducing the chance of data loss may take tens or even hundreds of times the spending on protection. At the same time, the more copies are kept in the traditional storage model, the lower the security becomes, since every extra copy is another place the data can be exposed. That’s why IPFS is going to revolutionize traditional storage.

What are the benefits of IPFS for storing cold data?

IPFS-based storage, as described here, adopts erasure coding in an m + n mode: m is the number of original data pieces, and n is the number of redundant pieces. The file is cut up and sent to different miners, so that a local network outage does not compromise the safety of the file as a whole.
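To illustrate the m + n idea, here is a minimal sketch of the simplest case, m + 1, using a single XOR parity piece (the same principle as RAID 5 parity). It is not how any IPFS implementation is written: the function names `encode` and `recover` are hypothetical, and production systems typically use codes such as Reed-Solomon so that more than one lost piece (n > 1) can be tolerated.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, m: int) -> List[bytes]:
    """Split data into m equal-size pieces and append one XOR parity piece."""
    size = -(-len(data) // m)                 # ceiling division
    padded = data.ljust(size * m, b"\0")      # pad so every piece has `size` bytes
    pieces = [padded[i * size:(i + 1) * size] for i in range(m)]
    parity = reduce(xor_bytes, pieces)        # hypothetical n = 1 redundancy
    return pieces + [parity]


def recover(pieces: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data when at most one piece is missing (None)."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only repair one lost piece")
    pieces = list(pieces)
    if missing:
        present = [p for p in pieces if p is not None]
        # XOR of all surviving pieces equals the missing one.
        pieces[missing[0]] = reduce(xor_bytes, present)
    return b"".join(pieces[:-1])[:length]     # drop parity and padding


data = b"important cold data"
stored = encode(data, m=4)    # 4 data pieces + 1 parity piece
stored[1] = None              # simulate one node going offline
assert recover(stored, len(data)) == data
```

With one parity piece, any single lost piece can be rebuilt from the others. Tolerating more simultaneous losses needs more redundant pieces and a stronger code.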

IPFS storage has two main benefits:

I. File backups do not reduce overall security

This is easy to understand. In traditional storage, suppose you write down your bank card password both at home and at the office. If you forget it, you can find it again, but it is also more likely to be seen by the wrong people. IPFS is different. No matter how many copies you keep, your data security stays the same, because the data is encrypted before it is transmitted over the IPFS node network.

II. Data security increases with N, while the price remains stable

N is the number of backups. In a probability model where node failures are independent of each other, increasing N greatly reduces the probability of loss.

Suppose you have a file that, under the distribution mechanism of IPFS, is split across at least seven nodes. The file is small but very important, so we save it 10 times, which means 70 nodes store it. If each node has a 1% probability of permanent loss (this is closer to the probability of a temporary failure such as a power outage; the actual probability of permanent loss is much lower), what is the probability of losing the file?

P = 1 - (1 - 0.01^10)^7 ≈ 7 × 10^(-20)

How small is the probability?

It’s about as likely as one person winning two 5 million lottery jackpots in a row!

Even if you think the data is not very important and keep only 3 to 5 copies, the probability of loss is still far lower than with a centralized server.
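The calculation above can be reproduced with a few lines of Python. This is only a sketch of the article's simple model, assuming independent node failures; the function name `file_loss_probability` and its parameters are made up for illustration. Note that evaluating 1 - (1 - p)^k directly in floating point rounds to zero for probabilities this small, so the sketch uses `math.log1p` and `math.expm1` instead.

```python
import math


def file_loss_probability(p_node: float, copies: int, pieces: int) -> float:
    """Probability that at least one piece loses every one of its copies.

    Assumes each node fails independently with probability p_node.
    """
    p_piece = p_node ** copies          # all copies of one piece are lost
    if p_piece >= 1.0:
        return 1.0
    # Equivalent to 1 - (1 - p_piece) ** pieces, but stays accurate
    # when p_piece is far below floating-point precision.
    return -math.expm1(pieces * math.log1p(-p_piece))


# 7 pieces, 1% loss probability per node, varying the number of copies.
for copies in (3, 5, 10):
    print(copies, file_loss_probability(0.01, copies, 7))
```

For 10 copies this gives roughly 7e-20, matching the figure above, and each extra copy lowers the result by about two orders of magnitude.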

That was just a small mathematical model. In practice, distributed storage can reduce the probability of data loss further, for example by cutting data more sensibly, identifying nodes with a lower probability of loss, preferring long-lived nodes, reducing the share of malicious nodes through incentives and penalties, and lowering the unit cost of storage by building more nodes. In short, two principles never change: the larger N is, the harder the data is to lose, and increasing N costs nothing in security.

As a result, IPFS achieves cheaper data storage, stronger security, and more stable resistance to data loss.

If IPFS is not the future, what is?

