InterPlanetary File System

Abhilash Krishnan
Jun 28, 2024

The InterPlanetary File System (IPFS) is a protocol and network designed to create a peer-to-peer method of storing and sharing hypermedia in a distributed file system. Here’s an explanation of how IPFS works, its architecture, and how it stores files:

How IPFS Works:

Content-Addressed Storage:

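IPFS uses a content-addressed system where files are identified by their content rather than their location. Each piece of content is assigned a Content Identifier (CID) derived from a cryptographic hash of the data, and that CID serves as its address on the IPFS network. Because the address is computed from the bytes themselves, anyone who receives the content can re-hash it and confirm it matches the CID they asked for.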
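As a rough illustration of deriving an address from content, the sketch below builds a CIDv1 string for a single raw block using only the Python standard library. The function name `raw_block_cid` is hypothetical, and the sketch assumes SHA-256 hashing, the "raw" codec, and base32 encoding; files added through an IPFS client are usually chunked and wrapped in additional structure, so their CIDs will generally differ from what this produces.

```python
import base64
import hashlib


def raw_block_cid(data: bytes) -> str:
    """Build a CIDv1 string for one raw block hashed with SHA-256 (illustrative)."""
    digest = hashlib.sha256(data).digest()
    # Multihash: 0x12 is the sha2-256 code, 0x20 is the digest length (32 bytes).
    multihash = bytes([0x12, 0x20]) + digest
    # CIDv1 prefix: 0x01 is the CID version, 0x55 is the "raw" content codec.
    cid_bytes = bytes([0x01, 0x55]) + multihash
    # Multibase: a leading "b" marks lowercase base32 without padding.
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


print(raw_block_cid(b"hello ipfs"))
print(raw_block_cid(b"hello ipfs!"))  # one extra byte gives an unrelated address
```

The same bytes always produce the same string, and any change to the bytes produces a different one, which is the property the rest of IPFS builds on.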

Distributed Hash Table (DHT):

IPFS utilizes a DHT to store records of which peer is storing which content. This allows any node in the network to find the peers that are storing a specific file based on its CID.
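The idea of provider records can be pictured with a toy lookup table. This is only a sketch: `ToyProviderIndex` and its methods are hypothetical names, and it keeps every record in one in-memory dictionary, whereas a real DHT spreads the records across many peers so that no single node holds the whole table.

```python
from typing import Dict, Set


class ToyProviderIndex:
    """In-memory stand-in for DHT provider records (not a real DHT)."""

    def __init__(self) -> None:
        # Maps a content identifier to the peers that announced they hold it.
        self._providers: Dict[str, Set[str]] = {}

    def provide(self, cid: str, peer_id: str) -> None:
        """Record that peer_id is storing the content identified by cid."""
        self._providers.setdefault(cid, set()).add(peer_id)

    def find_providers(self, cid: str) -> Set[str]:
        """Return a copy of the set of peers known to hold cid."""
        return set(self._providers.get(cid, set()))


index = ToyProviderIndex()
index.provide("example-cid", "peer-A")
index.provide("example-cid", "peer-B")
print(sorted(index.find_providers("example-cid")))
```

Note that the index stores only who has the content, never the content itself; the bytes are fetched from the listed peers afterwards.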

Peer-to-Peer Networking:

IPFS forms a distributed network of nodes where each node (peer) stores some content and also helps in distributing content to other nodes upon request. Peers can request content from other peers directly, making the network robust and decentralized.

Versioning and Deduplication:

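Content addressing lends itself to versioning. When a file changes, the updated file gets a new CID, while the old CID continues to refer to the old content, so earlier versions remain retrievable as long as some peer still stores them. Chunks that did not change keep the same hashes, so they do not need to be stored or transferred again.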

File Retrieval:

To retrieve a file, a node sends a request containing the CID of the file to the IPFS network. The network locates the peer(s) storing the file using the DHT, and the file chunks are fetched directly from those peers.

IPFS Architecture:

IPFS Node:

A node in the IPFS network can be any device running IPFS software. It can store files, retrieve files, and participate in the network by connecting to other nodes.

IPFS Objects:

Files in IPFS are split into smaller chunks called blocks. Each block is uniquely identified by its hash. Multiple blocks form an IPFS object, which is identified by the root hash of its Merkle DAG (Directed Acyclic Graph).

Merkle DAG:

IPFS uses a Merkle Directed Acyclic Graph to structure data. This allows for efficient storage and retrieval of large, complex structures by linking related data together.

Content Addressing:

Every piece of data in IPFS, whether it’s a file or a block, is identified by its content hash (CID). Identical content added with the same hashing and chunking settings results in the same CID, enabling deduplication across the network.

File System Abstractions:

IPFS provides familiar file system abstractions like directories and files using the Merkle DAG. Directories are represented as objects that contain links to other objects (files or directories).
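A directory-as-links structure can be sketched with a plain dictionary acting as a block store. The helper names (`put`, `put_directory`, `resolve`) and the JSON encoding are assumptions made for illustration; IPFS uses its own binary formats for directory objects, not JSON.

```python
import hashlib
import json
from typing import Dict


def put(store: Dict[str, bytes], data: bytes) -> str:
    """Store bytes under their SHA-256 hex digest and return that key."""
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


def put_directory(store: Dict[str, bytes], entries: Dict[str, str]) -> str:
    """Store a directory object that maps entry names to child keys."""
    # sort_keys makes the encoding deterministic, so equal directories hash equally.
    encoded = json.dumps({"links": entries}, sort_keys=True).encode("utf-8")
    return put(store, encoded)


def resolve(store: Dict[str, bytes], root: str, path: str) -> bytes:
    """Walk a slash-separated path from the root directory down to a file."""
    current = root
    for name in path.strip("/").split("/"):
        directory = json.loads(store[current])
        current = directory["links"][name]
    return store[current]


store: Dict[str, bytes] = {}
readme = put(store, b"hello from a file")
docs = put_directory(store, {"readme.txt": readme})
root = put_directory(store, {"docs": docs})
print(resolve(store, root, "docs/readme.txt"))
```

Because each directory embeds the keys of its children, editing a file changes the key of every directory above it, while untouched siblings keep their keys.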

How IPFS Stores Files:

Chunking and Deduplication:

When a file is added to IPFS, it is broken down into smaller chunks. Each chunk is hashed to generate its CID. Identical chunks across different files or versions are deduplicated, ensuring efficient use of storage.
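The effect of chunk-level deduplication can be seen in a small sketch. It assumes fixed-size chunking with a deliberately tiny `CHUNK_SIZE` so the result is easy to follow (real chunks are far larger), and it uses a hex SHA-256 digest in place of a real CID; `split_chunks` and `add_file` are hypothetical helper names.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # bytes; unrealistically small, chosen only for the demo


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut data into consecutive fixed-size pieces (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def add_file(store: Dict[str, bytes], data: bytes) -> List[str]:
    """Store each chunk under its hash and return the ordered list of hashes."""
    keys = []
    for piece in split_chunks(data):
        key = hashlib.sha256(piece).hexdigest()
        store[key] = piece  # an identical chunk maps to the same key, so no second copy
        keys.append(key)
    return keys


store: Dict[str, bytes] = {}
version_1 = add_file(store, b"AAAABBBBCCCC")
version_2 = add_file(store, b"AAAABBBBDDDD")  # only the last chunk differs
print(len(version_1) + len(version_2), "chunk references,", len(store), "stored blocks")
# prints: 6 chunk references, 4 stored blocks
```

With fixed-size chunks, deduplication only helps when the unchanged data stays aligned to the same chunk boundaries; inserting a byte near the start of a file would shift every later chunk.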

Merkle DAG Construction:

IPFS constructs a Merkle DAG for each file and directory. Each node (block or object) in the DAG points to its children via hashes, forming a structure that represents the content hierarchy.
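A two-level version of this structure fits in a few lines. The sketch assumes a JSON-encoded root node and hex SHA-256 digests in place of real CIDs, and `build_file_dag` is a hypothetical helper; actual IPFS nodes use binary encodings and may nest several levels for large files.

```python
import hashlib
import json
from typing import Dict, List


def block_id(data: bytes) -> str:
    """Hex SHA-256 digest, standing in for a real CID."""
    return hashlib.sha256(data).hexdigest()


def build_file_dag(store: Dict[str, bytes], chunks: List[bytes]) -> str:
    """Store each chunk as a leaf, then a root node that links to the leaves."""
    links = []
    for chunk in chunks:
        leaf_id = block_id(chunk)
        store[leaf_id] = chunk
        links.append(leaf_id)
    # The root's bytes contain the child hashes, so its own hash depends on them.
    root_bytes = json.dumps({"links": links}).encode("utf-8")
    root_id = block_id(root_bytes)
    store[root_id] = root_bytes
    return root_id


store: Dict[str, bytes] = {}
root_v1 = build_file_dag(store, [b"part one", b"part two"])
root_v2 = build_file_dag(store, [b"part one", b"part 2"])
print(root_v1 != root_v2)  # prints True: a changed leaf yields a different root
```

Since a parent can only be hashed after its children exist, links always point from newer hashes to older ones, which is why the graph cannot contain cycles.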

Distribution Across Peers:

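Adding a file to IPFS stores its chunks on the local node and announces to the network that this node can provide them; the chunks are not pushed out to other peers. Other peers end up holding copies when they request the content, since they typically cache what they fetch, or when they choose to pin it. A peer that is missing chunks can retrieve them from any peer that has them.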

File Retrieval:

To retrieve a file, a node requests the root CID of the file’s Merkle DAG from the IPFS network. The network locates the peers storing the chunks and retrieves them to reconstruct the file locally.
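Retrieval by root hash, with each block checked against the hash it was requested by, might look like the sketch below. Peers are modeled as plain dictionaries, the root node is JSON, and `publish`, `fetch_block`, and `fetch_file` are hypothetical names; none of this reflects the actual IPFS wire protocol.

```python
import hashlib
import json
from typing import Dict, List


def block_id(data: bytes) -> str:
    """Hex SHA-256 digest, standing in for a real CID."""
    return hashlib.sha256(data).hexdigest()


def publish(peer: Dict[str, bytes], chunks: List[bytes]) -> str:
    """Store chunks plus a root node listing them on one peer; return the root id."""
    links = []
    for chunk in chunks:
        peer[block_id(chunk)] = chunk
        links.append(block_id(chunk))
    root_bytes = json.dumps({"links": links}).encode("utf-8")
    peer[block_id(root_bytes)] = root_bytes
    return block_id(root_bytes)


def fetch_block(peers: List[Dict[str, bytes]], key: str) -> bytes:
    """Ask each peer in turn; accept only bytes whose hash matches the key."""
    for peer in peers:
        data = peer.get(key)
        if data is not None and block_id(data) == key:
            return data
    raise KeyError(f"no peer returned a valid block for {key}")


def fetch_file(peers: List[Dict[str, bytes]], root_id: str) -> bytes:
    """Fetch the root node, then its chunks in order, and join them."""
    root = json.loads(fetch_block(peers, root_id))
    return b"".join(fetch_block(peers, link) for link in root["links"])


honest: Dict[str, bytes] = {}
root_id = publish(honest, [b"hello ", b"world"])
tampered = {key: b"garbage" for key in honest}  # same keys, wrong bytes
print(fetch_file([tampered, honest], root_id))  # prints b'hello world'
```

The requester does not need to trust any individual peer here: bytes that fail the hash check are simply ignored and the next peer is tried.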

Benefits of IPFS:

  • Decentralization: IPFS removes the need for centralized servers by allowing files to be stored and retrieved directly between peers.
  • Efficiency: Deduplication and content-addressing reduce redundancy and enable efficient storage and distribution of data.
  • Resilience: Content held by several peers is resistant to censorship and single-point failures, because any peer with a copy can serve it. Content stored on only one node is only as available as that node.

In summary, IPFS combines decentralized peer-to-peer networking, content addressing, and Merkle DAGs to offer a more efficient, resilient, and censorship-resistant alternative to traditional centralized file storage.
