Introduction to IPFS: The New & Improved Web


IPFS (InterPlanetary File System) is a versioned file system that can store files and track changes to them over time, similar to Git. In addition to storing files, IPFS acts as a distributed file system, much like BitTorrent. IPFS has been called the “new permanent web” because its improved file storage and transfer properties enhance the way we use existing internet protocols like HTTP.

What is the Internet? And how does IPFS improve it?

The simple question “what is the internet?” is one that stumps most people.

In short, the internet is a collection of protocols that describe how data moves around a network. One of the protocols that serves as the backbone of the web is HTTP or HyperText Transfer Protocol, invented by Tim Berners-Lee in 1991.

Since the advent of HTTP over two decades ago, developers have made many improvements to the protocol. However, regardless of how many times you upgrade an iPhone 3, eventually you have to admit that newer generations of iPhones are superior. In other words, HTTP is now outdated and, in many ways, inefficient. IPFS hopes to be the new and improved protocol moving forward.

Before we explain how IPFS can improve the internet, we must first explain the problems of HTTP.

· Outdated model of transferring files

· All data needs to be centralized

· Data on HTTP is prone to censorship

1. Location-Based Addressing: An outdated model of transferring files

HTTP is a location-addressed protocol. When you type “download cat photo” into a search engine such as Google in your browser, the site’s name is translated into the IP address of a Google server. A request-response cycle is then initiated with that server, which returns a list of cat images that you can download. When you select a cat photo to download, you go through the same process again: request and response.

Let’s say you’re in a classroom and all your classmates have to download the same cat photo for an assignment. If there are 100 students, then that’s 100 requests and 100 responses. This is not efficient at all.

Furthermore, HTTP was originally created to transfer small files like text and images. For that job it has served the internet pretty reliably for most of its history, and it has been great for loading websites. In the first two decades of the web, the size of an average web page grew from ~2 kilobytes to ~2 megabytes. However, fast forward to 2018, where on-demand HD video streaming and big data are becoming universal; web pages are becoming increasingly large, and we therefore need a more efficient way of producing and consuming data.

With IPFS, users are able to leverage physical proximity to more efficiently retrieve the information they need.

Rather than using a location-based address, IPFS uses a content-addressed protocol to transfer content. This is done using a cryptographic hash on a file as the address. For illustration purposes, here’s how a cat file looks on HTTP and IPFS:

An HTTP request would look like this:

An IPFS request would look like this: /ipfs/Qm8xJy7sj8xKJ/folder/cat.jpg

The hash represents a root object and other objects can be found in its path. Therefore, once you know the “hash” of the file you’re looking for, you can gain access to the file directly, instead of talking to a server. In the classroom example, all 100 students can download the cat photo directly from the professor’s computer rather than sending a request to a server 100 miles away. This results in a much more efficient way of transferring files.
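To make content addressing concrete, here is a minimal sketch in Python. It assumes plain SHA-256 hex digests as addresses and a hypothetical in-memory `ContentStore`; real IPFS addresses are encoded differently, so treat this as an illustration of the idea rather than of the actual format.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as the content's address."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """A toy in-memory store keyed by the hash of each blob (hypothetical)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash to confirm the bytes match the address that was requested.
        if content_address(data) != address:
            raise ValueError("content does not match its address")
        return data


# The professor's computer stores the photo once and shares only its address.
professor = ContentStore()
cat_address = professor.put(b"cat photo bytes")

# Anyone holding the same bytes derives the same address, on any machine.
same_address = content_address(b"cat photo bytes")
other_address = content_address(b"dog photo bytes")
```

Because the address is derived from the bytes themselves, it does not matter which computer serves the file: a student can check that what arrived hashes to the address that was asked for.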

In today’s on-demand world, IPFS enhances our ability to consume content by providing high-throughput, low-latency data distribution. This enables content to be delivered much more quickly to websites, making it cheaper and faster to share and consume high-quality content over the internet.

2. Risky Storage Solution: All data needs to be centralized

One of the biggest issues with HTTP is the fact that it relies on centralized data servers to store all its files. Centralizing servers is beneficial because it gives companies complete control over how fast they can deliver data to users. But if there is ever a problem in the network’s line of communication, the client will not be able to connect with the server. This can happen if an ISP (internet service provider) has an outage, or if the server is experiencing technical difficulties.

The location-based addressing model of HTTP encourages centralization because it’s convenient to trust data to a few popular applications (e.g. Google and Facebook). However, because of this dependency, much of our data on the web becomes siloed, leaving those providers with enormous responsibility and power over our information. Sometimes these companies get hacked, as in the recent Equifax hack, which affected 145.5 million users. Other times, companies improperly leak data for reasons of their own, perhaps for financial benefit. One example is the recent Facebook data leak scandal, which affected 87 million users.

All data stored using IPFS is decentralized. Here are the 5 main components of IPFS that allow it to store and transfer data in a decentralized manner:

· Distributed Hash Tables (DHT)

· Block Exchanges

· Merkle DAG

· Version Control Systems

· Self-Certifying File System

Distributed Hash Tables

In distributed hash tables (DHT), data is spread across a network of computers, and coordinated to enable efficient access and lookup between nodes. Using DHTs, IPFS nodes do not require centralization, the system can function reliably even when nodes fail or leave the network, and DHTs can scale to accommodate more nodes.
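As a rough illustration of how a DHT spreads data and survives node failures, the following Python sketch stores each key on the two nodes "closest" to it, using the XOR distance metric found in Kademlia-style DHTs. The `ToyDHT` class, the node names and the replica count are assumptions made for this example; a real DHT also handles routing between nodes over a network, which is omitted here.

```python
import hashlib


def key_id(name: str) -> int:
    """Map a string to a 256-bit integer ID via SHA-256."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")


def closest_nodes(node_ids, key, count=2):
    """Return the `count` node IDs nearest to `key` by XOR distance."""
    return sorted(node_ids, key=lambda node: node ^ key)[:count]


class ToyDHT:
    """A single-process stand-in for a network of nodes (hypothetical)."""

    def __init__(self, node_names, replicas=2):
        self.replicas = replicas
        # Each node owns a private table; there is no central index.
        self.tables = {key_id(name): {} for name in node_names}

    def put(self, key_name, value):
        key = key_id(key_name)
        for node in closest_nodes(self.tables, key, self.replicas):
            self.tables[node][key] = value

    def get(self, key_name):
        key = key_id(key_name)
        for node in closest_nodes(self.tables, key, self.replicas):
            if key in self.tables[node]:
                return self.tables[node][key]
        return None

    def remove_node(self, node_id):
        del self.tables[node_id]


dht = ToyDHT(["node-a", "node-b", "node-c", "node-d", "node-e"])
dht.put("cat.jpg", "held by peer-7")

# Simulate the closest node leaving; the second replica still answers.
nearest = closest_nodes(dht.tables, key_id("cat.jpg"), 1)[0]
dht.remove_node(nearest)
found = dht.get("cat.jpg")
```

Every node can compute which peers are responsible for a key without asking a coordinator, which is what lets the lookup keep working when a node disappears.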

Block Exchanges

The popular peer-to-peer file sharing software BitTorrent is able to coordinate the transfer of data between millions of nodes by relying on an innovative data exchange protocol used within its own ecosystem. IPFS implements a similar protocol called BitSwap, which operates as a marketplace for data.
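The "marketplace" idea can be sketched as peers that each advertise a list of blocks they want and hand over blocks the other side is asking for. The Python below is a simplified model under that assumption; the `Peer` class and `exchange` function are hypothetical names, and this is not the actual BitSwap wire protocol.

```python
import hashlib


def block_id(data: bytes) -> str:
    """Identify a block by the SHA-256 hash of its bytes."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    def __init__(self, name):
        self.name = name
        self.blocks = {}        # block ID -> bytes this peer holds
        self.want_list = set()  # block IDs this peer is looking for

    def add_block(self, data: bytes) -> str:
        bid = block_id(data)
        self.blocks[bid] = data
        return bid

    def receive(self, bid: str, data: bytes) -> bool:
        # Accept a block only if it was wanted and hashes to the requested ID.
        if bid in self.want_list and block_id(data) == bid:
            self.blocks[bid] = data
            self.want_list.discard(bid)
            return True
        return False


def exchange(a: Peer, b: Peer) -> int:
    """Each peer sends the other any blocks found on its want-list."""
    sent = 0
    for sender, receiver in ((a, b), (b, a)):
        for bid in list(receiver.want_list):
            if bid in sender.blocks and receiver.receive(bid, sender.blocks[bid]):
                sent += 1
    return sent


alice, bob = Peer("alice"), Peer("bob")
cat_block = alice.add_block(b"cat photo, part 1")
dog_block = bob.add_block(b"dog photo, part 1")
alice.want_list.add(dog_block)
bob.want_list.add(cat_block)
blocks_traded = exchange(alice, bob)
```

After the exchange both peers hold both blocks, and a block whose bytes do not match its ID is rejected by `receive`.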

Merkle DAG

Merkle trees ensure that data blocks exchanged on p2p networks are correct, undamaged and unaltered. This verification works in much the same way that Bitcoin ensures each transaction is correct and unaltered: by using cryptographic hash functions. A DAG (directed acyclic graph) is a way to model information whose links never form a cycle. A simple example of a DAG is a family tree. With a Merkle DAG structure, all content on IPFS can be uniquely identified, since each data block has a unique hash. Plus, the data is tamper-resistant, because altering it would change the hash. In short, the same technology that has helped keep Bitcoin from being hacked for nearly a decade is now being used under the hood on IPFS.
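A small Python sketch shows why a Merkle DAG is tamper-resistant: each node's hash covers its own data plus the hashes of its children, so a change anywhere below the root changes the root hash. The `Node` class and the way the hash input is laid out are assumptions for illustration, not the format IPFS uses.

```python
import hashlib


class Node:
    """A DAG node whose hash depends on its data and its children's hashes."""

    def __init__(self, data: bytes, children=()):
        self.data = data
        self.children = tuple(children)
        hasher = hashlib.sha256()
        # Length-prefix the data so it cannot blur into the child hashes.
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
        for child in self.children:
            hasher.update(bytes.fromhex(child.hash))
        self.hash = hasher.hexdigest()


part1 = Node(b"cat photo, part 1")
part2 = Node(b"cat photo, part 2")
root = Node(b"cat.jpg", [part1, part2])

# Rebuilding from identical content yields the identical root hash.
rebuilt = Node(b"cat.jpg", [Node(b"cat photo, part 1"), Node(b"cat photo, part 2")])

# Altering one leaf changes that leaf's hash and therefore the root's hash.
tampered = Node(b"cat.jpg", [Node(b"cat photo, part 1 (edited)"), part2])
```

Comparing `root.hash` with `tampered.hash` is enough to detect the edit without inspecting every block.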

Version Control System

Another powerful feature of the Merkle DAG structure is that it allows you to build a distributed version control system (VCS). Git uses a Merkle DAG to store versions of the same file and merge them together. IPFS uses a similar model for data objects, wherein all file changes and revisions are linked to the original file.

For example, suppose the original file is a photo of “Cat1”, and all subsequent edits to the file (Cat2 and Cat3) are linked back to the original parent file. As long as the objects corresponding to the original file and any new versions are accessible, the entire file history can be retrieved. This means IPFS files can be cached yet stored permanently, saving bandwidth and storage space for users.
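The Cat1, Cat2 and Cat3 chain can be sketched in Python as hash-linked version records. This sketch assumes a Git-like layout in which each version points at its immediate parent; `put_version`, `history` and the record fields are hypothetical names chosen for the example.

```python
import hashlib
import json


def put_version(store, content, parent=None):
    """Store a version record and return its hash-derived ID."""
    record = {"content": content, "parent": parent}
    encoded = json.dumps(record, sort_keys=True).encode()
    version_id = hashlib.sha256(encoded).hexdigest()
    store[version_id] = record
    return version_id


def history(store, version_id):
    """Walk parent links from a version back to the original file."""
    contents = []
    while version_id is not None:
        record = store[version_id]
        contents.append(record["content"])
        version_id = record["parent"]
    return contents


store = {}
cat1 = put_version(store, "Cat1")
cat2 = put_version(store, "Cat2", parent=cat1)
cat3 = put_version(store, "Cat3", parent=cat2)
versions = history(store, cat3)
```

Since each record's ID includes its parent's ID, knowing the ID of the latest version is enough to retrieve and verify the whole chain, provided every record is still available.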

Self-Certifying File System

The last component of IPFS (and this one is important) is the self-certifying file system. It is “self-certifying” because data served to a client is authenticated by the file name (which is signed by the server). Therefore, IPFS doesn’t require special permissions from any centralized authority for data exchange. You can securely access remote content with the transparency of local storage.

IPFS utilizes all the above components to spread data across a network of computers and ensure the network is decentralized, fault tolerant and scalable. It can be used to deliver content to websites, store files with automatic versioning and file backups, securely transfer files and encrypted messages, and much more.

3. Online Content Heavily Censored: Data on HTTP is prone to censorship

As illustrated earlier, IPFS is decentralized and completely permissionless, meaning anyone can store or transfer any file without central authority control and without fear of censorship. By contrast, the internet that much of the world accesses today is at least partially censored. Take China, for example, where the internet is heavily censored and online free speech is virtually nonexistent. The internet in China is under 24-hour surveillance for certain offences that could lead to account deletion, questioning, and even detention (see China’s censorship approach below).

IPFS can be a useful tool in promoting free speech to counter the prevalence of internet censorship in countries like China, but we should also be cognizant of the potential for abuse by bad actors. In a decentralized web, policing dangerous and inhumane online activities such as human trafficking, child pornography, phishing, and terrorist activities could become crucial to ensure safety for all users.

Final Thoughts

Here’s a quick recap of IPFS’s main selling points:

· Transferring of data is efficient and fast

· Storing of data is secure and decentralized

· Online content is resistant to censorship

And a highlight of the 5 components that allow for a decentralized and efficient web:

· Distributed Hash Table: nodes can store & share data without central coordination

· Block Exchanges: a BitTorrent-inspired protocol (BitSwap) coordinates the transfer of data blocks between nodes

· Merkle DAG: enables uniquely identified, tamper-resistant and permanently stored data

· Version Control System: You can access past versions of edited data instantaneously

· Self-Certifying File System: You can securely access remote content without permission

Despite the impressive performance of IPFS, a few issues are yet to be fully resolved. Firstly, content addressing on IPNS is currently not very user-friendly. Your typical IPNS link looks like this:

Secondly, there is little incentive for nodes to maintain long term backups of data on the IPFS network, meaning files can theoretically disappear over time if there are no nodes hosting the data.

These issues are significant problems to solve if IPFS is to become the world’s go-to file storage solution. RTrade is attempting to change all this with TEMPORAL®, their native gateway into IPFS. The platform is attempting to make IPFS easier and more accessible for the public, using simple user interfaces and user-friendly content addressing links. RTrade has a million-dollar facility based in Vancouver, BC that stores and backs up data on the IPFS network, meaning any file you host on IPFS will be safeguarded indefinitely.

If companies like RTrade are successful, IPFS can provide resilient infrastructure for the next generation of the web: a web that is distributed, secure, and transparent. With TEMPORAL® you don’t have to be an expert to use IPFS, so if any of IPFS’ advantages seem useful or appealing to you, log in to TEMPORAL® & get started here.
