A Hands-on Introduction to IPFS

The Interplanetary File System is going to be a big deal.

This article will be divided into two parts:

  • A theoretical introduction to IPFS
  • A practical guide on using IPFS

The first part aims to show you how IPFS works, while the second part walks you through using IPFS and interacting with the existing network. The second part assumes some technical knowledge and a *nix-based system.

Source: smg-corporate.com

Theoretical Introduction

HTTP, or the HyperText Transfer Protocol, is the client-server communication protocol that governs the web today. It’s how you access websites, watch videos, and download files. It has some problems, however, many of which stem from the fact that the current model is largely centralized.

What’s wrong with the Internet of today?

When you want to visit a website today, your browser (the client) sends a request to the servers (the hosts) that “serve” up that website, even when those servers are across the globe from your current location. This is location-based addressing: content is identified by where it is stored, using IP addresses, rather than by what it is. Fetching everything from a distant server eats bandwidth, and thus costs us a lot of money and time. On top of that, HTTP typically downloads a file from a single server at a time, which is often slower than getting multiple pieces of that file from multiple computers. It also allows powerful entities to block access to certain locations, like Turkey did with the Wikipedia servers in 2017.

HTTP vs. IPFS [Source: https://www.maxcdn.com/one/visual-glossary/interplanetary-file-system/]

Juan Benet, with IPFS, aims to solve many of the aforementioned problems. For one, IPFS doesn’t have a single point of failure. It is a peer-to-peer, distributed file system that could replace HTTP and decentralize the Internet once again. Internet censorship would become much harder, and published information wouldn’t suddenly disappear at the whims of a service provider or hosting network.

Security also improves: DDoS attacks, for example, would be far less effective, since they rely on overwhelming a central distribution point, which IPFS doesn’t have. Speed improves as well. In a distributed web, every node that requests something requests it from the nodes closest to it, instead of from a single, central location.

How does it work?

IPFS works by connecting all devices on the network to the same file structure. This file structure is a Merkle DAG, which combines Merkle trees (used in blockchains to ensure immutability) and Directed Acyclic Graphs (used in Git version control, which is also what allows users to see previous versions of content on IPFS). Think of it as a large BitTorrent swarm.

Imagine you want to read the IPFS whitepaper. What you would normally do is type in a URL, which is resolved to an IP address that tells your client where the file is located (that would be the IPFS servers, if they had those). Your client then connects to that host and gets the file. There are numerous ways in which this can go wrong, many of which have been discussed above.
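To make the Merkle idea mentioned above a bit more concrete, here is a toy sketch in shell. It is not how IPFS really encodes things: plain SHA-256 hex digests (via sha256sum from GNU coreutils) stand in for IPFS identifiers, fixed-size blocks stand in for the IPFS chunker, and the function names hash_stdin and merkle_root are made up for this illustration. It only shows the principle that a file's root hash is derived from the hashes of its blocks.

```shell
# Toy two-level Merkle structure. Plain SHA-256 hex digests stand in for
# real IPFS identifiers, and fixed-size blocks stand in for the IPFS chunker.

# Hash whatever arrives on stdin and print only the hex digest.
hash_stdin() {
  sha256sum | cut -d' ' -f1
}

# Split a (non-empty) file into blocks, hash every block, then hash the
# ordered list of block hashes to get one root hash for the whole file.
merkle_root() {
  local file="$1" block_size="$2" blocks
  blocks=$(mktemp -d)
  split -b "$block_size" "$file" "$blocks/block_"
  for block in "$blocks"/block_*; do
    hash_stdin < "$block"
  done | hash_stdin
  rm -rf "$blocks"
}

# Two files that differ only in their last word.
demo_dir=$(mktemp -d)
printf 'The quick brown fox jumps over the lazy dog\n' > "$demo_dir/a.txt"
printf 'The quick brown fox jumps over the lazy cat\n' > "$demo_dir/b.txt"

root_a=$(merkle_root "$demo_dir/a.txt" 16)
root_a_again=$(merkle_root "$demo_dir/a.txt" 16)
root_b=$(merkle_root "$demo_dir/b.txt" 16)

echo "a.txt root:         $root_a"
echo "a.txt root (again): $root_a_again"
echo "b.txt root:         $root_b"
```

Hashing the same file twice gives the same root, while changing a single block changes the root. That is the property that lets the root hash act as an identifier for the whole file.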

IPFS [Source: https://ipfs.io]

Now, imagine accessing it from the IPFS network. The file, and all of its blocks, are identified by a unique cryptographic hash of the content itself. The whole system is based around a key-value data store, in which the hash is the key and the content is the value. This is what allows for content addressing: anyone can host the content, no matter where it originally came from. So, you would connect to the swarm and request that file from the network. It would first look to the peers closest to you, because chances are they have a copy of that file. If they don’t, however, you will connect with the node that originally uploaded the file, since it is the one hosting it.
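The key-value idea described above can be sketched as a tiny content-addressed store in shell. This is a simplification under stated assumptions: a temporary directory plays the role of the store, SHA-256 hex digests (sha256sum) stand in for real IPFS hashes, and put and get are hypothetical helper names, not IPFS commands.

```shell
# A toy content-addressed store: the key of every entry is the SHA-256
# hash of its value. A temporary directory plays the role of the store.
store_dir=$(mktemp -d)

# put: read content from stdin, store it under its hash, print the hash.
put() {
  local tmp key
  tmp=$(mktemp)
  cat > "$tmp"
  key=$(sha256sum < "$tmp" | cut -d' ' -f1)
  mv "$tmp" "$store_dir/$key"
  echo "$key"
}

# get: print the content stored under a given key.
get() {
  cat "$store_dir/$1"
}

key1=$(printf 'Some text!\n' | put)
key2=$(printf 'Some text!\n' | put)        # same content, added a second time
key3=$(printf 'Some other text!\n' | put)  # different content

echo "key1: $key1"
echo "key2: $key2"
echo "key3: $key3"
get "$key1"
```

The same content always ends up under the same key, no matter who adds it or how often, while different content gets a different key. That is why it doesn’t matter which node you fetch a given hash from.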

Another good explanation of IPFS with video animations by Simply Explained

You then download that file, and become a host yourself. This means you’re basically the host and the client at the same time. It also means that you only host files you are interested in. Let’s now go to the IPFS whitepaper on IPFS itself. Since chances are you don’t have IPFS installed, we’ll use a gateway. The URL looks like this: https://gateway.ipfs.io/ipfs/QmV9tSDx9UiPeWExXEeH6aoDvmihvx6jD5eLb4jbTaKGps. It starts with the IPFS gateway URL, and then includes the unique hash identifier of the content. The gateway will connect to the closest peer that has the file (which is probably itself), and then serve that file to you.
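The gateway URL described above is just the gateway address, followed by /ipfs/, followed by the hash. A small shell sketch of that, where gateway_url and fetch_via_gateway are hypothetical helper names of my own and curl is assumed to be installed:

```shell
# Build a gateway URL from a content hash. The gateway defaults to the one
# used in this article; pass a second argument to use a different gateway.
gateway_url() {
  local hash="$1" gateway="${2:-https://gateway.ipfs.io}"
  printf '%s/ipfs/%s\n' "$gateway" "$hash"
}

# Download the content behind a hash into a local file.
# Needs curl and network access; it is only defined here, not called.
fetch_via_gateway() {
  local hash="$1" out="$2"
  curl -fsSL -o "$out" "$(gateway_url "$hash")"
}

whitepaper=QmV9tSDx9UiPeWExXEeH6aoDvmihvx6jD5eLb4jbTaKGps
gateway_url "$whitepaper"
```

Calling fetch_via_gateway "$whitepaper" whitepaper.pdf would then download the file, assuming the gateway is reachable and still serves that hash.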

You might think downloading files from untrusted nodes is dangerous, and that would be a fair concern. But since the hash is computed from the content itself, any change to the content produces a different hash. By recomputing the hash of what you received and comparing it to the hash you asked for, you can be sure you’ve got the right file.
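Here is a minimal sketch of that check in shell. As an assumption for the sake of illustration, a plain SHA-256 hex digest (sha256sum) stands in for the IPFS hash, the two "peers" are just local files, and verify is a made-up helper name.

```shell
# verify EXPECTED_HASH FILE: succeed only if FILE hashes to EXPECTED_HASH.
verify() {
  local expected="$1" file="$2" actual
  actual=$(sha256sum < "$file" | cut -d' ' -f1)
  [ "$actual" = "$expected" ]
}

work=$(mktemp -d)
printf 'IPFS whitepaper (pretend)\n' > "$work/original"
wanted=$(sha256sum < "$work/original" | cut -d' ' -f1)   # the hash we ask for

# An honest peer sends the exact bytes; a malicious peer alters them.
cp "$work/original" "$work/from_honest_peer"
printf 'IPFS whitepaper (tampered)\n' > "$work/from_bad_peer"

if verify "$wanted" "$work/from_honest_peer"; then honest=accepted; else honest=rejected; fi
if verify "$wanted" "$work/from_bad_peer"; then bad=accepted; else bad=rejected; fi

echo "honest peer: $honest, bad peer: $bad"
# prints: honest peer: accepted, bad peer: rejected
```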

We make websites and web apps have no central origin server, they can be distributed just like the Bitcoin network is distributed. — Juan Benet

IPFS serves content, but content can be much more than just a PDF file. Application logic is content as well. In combination with other decentralized projects like Ethereum, IPFS becomes extremely powerful. A blockchain like Ethereum can serve as the back-end of your app, while the front-end is served by IPFS. This makes for a completely decentralized application.

To really drive the point home, let’s end with another example. Imagine a room full of people who are collaborating on a Google Doc at the same time. With HTTP, every time a change is made, the information is sent to the server, and then from the server back to every other person in the room. You can imagine the bandwidth that is wasted and the unnecessary latency that is produced, especially for people on the same local network.

With IPFS, however, the information wouldn’t have to make all those unnecessary trips; it would just be shared in real time among the people on the same network. This would work even if the Google servers were down, since every node is now its own host. It would also significantly reduce latency and bandwidth use.

This might also be a good time to explain why it’s called the Interplanetary File System. Think of it this way: if the group of people we just talked about were on Mars, they would be able to make parts of the internet usable on Mars like this. One person requests Google Docs, or any other file, from any node on the IPFS network back on Earth. This would take a while (a round trip at the speed of light alone takes anywhere from several minutes to about three quarters of an hour, depending on where the planets are), but once the file has arrived (and remember, this could be anything; Wikipedia, broken down, is also just files), the person who originally requested it becomes a “seeder” in the network from which other nodes can get the file. If this process goes on long enough, eventually most parts of the internet will be available on Mars without the long wait after sending a request.

IPFS is still in alpha, and a ton of developments are still underway. One very cool one is Filecoin. Filecoin is a cryptocurrency that incentivizes nodes on a network to store (seed) content with economic rewards, so that the accessibility of content doesn’t depend on the whims of untrusted nodes.

Filecoin [Source: https://filecoin.io]

Below is a video in which Juan Benet explains how IPFS and Filecoin would work together:


Practical Guide

Getting Started

This is the part where we actually connect to IPFS. To start, you’ll need to install it on your system. You can install IPFS using snapcraft, with snap install ipfs, but I recommend installing from a prebuilt package. Download the package here. Next, use:

tar xvfz go-ipfs.tar.gz
cd go-ipfs
./install.sh

This will untar the archive, and the install.sh script will move the ipfs binary into a directory on your $PATH. Depending on your system, you may need to run the script with sudo.

To see whether or not it installed correctly, use

$ ipfs help

This should give you a nice overview of the IPFS commands.

Now let’s get started. First, you’ll need to initialize the ipfs repository. Do this with $ ipfs init . This will create a key-pair for your node and a repository with some ipfs objects. Take a look at them! The output of the init command should suggest a command like this:

$ ipfs cat /ipfs/QmS4ustL54uo8FzR9455qaxZwuMiUhyvMcX9Ba8nUH4uVv/readme

Which should return:

Hello and Welcome to IPFS!

██╗██████╗ ███████╗███████╗
██║██╔══██╗██╔════╝██╔════╝
██║██████╔╝█████╗  ███████╗
██║██╔═══╝ ██╔══╝  ╚════██║
██║██║     ██║     ███████║
╚═╝╚═╝     ╚═╝     ╚══════╝

If you're seeing this, you have successfully installed
IPFS and are now interfacing with the ipfs merkledag!

 -------------------------------------------------------
| Warning:                                              |
|   This is alpha software. Use at your own discretion! |
|   Much is missing or lacking polish. There are bugs.  |
|   Not yet secure. Read the security notes for more.   |
 -------------------------------------------------------

Check out some of the other files in this directory:

  ./about
  ./help
  ./quick-start     <-- usage examples
  ./readme          <-- this file
  ./security-notes

The $ ipfs cat command allows you to see the content of ipfs objects. Try checking out the other files in that directory.

Adding Files to IPFS

Adding files to IPFS is really simple. Move to a test directory and make a simple file:

$ echo "Some text!" > IPFSfile

To then add the file to IPFS:

$ ipfs add IPFSfile

This will return a hash starting with Qm, which serves as the unique identifier for the content of that file. You have now recursively pinned this file to your local storage, which means that once you’re connected to the swarm, people who have that hash identifier will be able to request the file from you.

To see which IPFS files are on your system, use:

$ ipfs pin ls

This will list all the pinned objects, each shown as a hash. You should see the file you just made (pinned recursively), as well as the files that were made when initializing ipfs, like readme and quick-start. To look at those files, you can $ ipfs cat them. A pinned object can also be a directory; in that case, use $ ipfs ls to look into it. If you wish to add a whole directory, simply add -r to the ipfs add command.
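The commands from this section can be strung together into one small round trip. Treat this as a sketch, not the official way to do it: it assumes ipfs is on your $PATH, it uses a throwaway repository (via the IPFS_PATH environment variable) so your real one is left alone, and ipfs_tmp is just a made-up wrapper name. The -Q flag asks ipfs add to print only the final hash, which for a directory is the hash of the directory itself.

```shell
if command -v ipfs >/dev/null 2>&1; then
  # Throwaway repository, so the real ~/.ipfs is left untouched.
  repo=$(mktemp -d)
  ipfs_tmp() { IPFS_PATH="$repo" ipfs "$@"; }
  ipfs_tmp init >/dev/null

  # A small directory with two files in it.
  site=$(mktemp -d)
  echo "Some text!" > "$site/IPFSfile"
  echo "More text!" > "$site/other"

  # Add the whole directory; -Q prints only the final (directory) hash.
  dir_hash=$(ipfs_tmp add -r -Q "$site")
  echo "directory hash: $dir_hash"

  # Look into the directory, then read one file through the directory hash.
  ipfs_tmp ls "/ipfs/$dir_hash"
  if [ "$(ipfs_tmp cat "/ipfs/$dir_hash/IPFSfile")" = "Some text!" ]; then
    result=verified
  else
    result=mismatch
  fi
else
  echo "ipfs is not installed; skipping"
  result=skipped
fi
echo "round trip: $result"
```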

Interacting with the Swarm

Everything we’ve done so far was done locally. We haven’t connected to the swarm yet. To do that, start the IPFS daemon:

$ ipfs daemon

The daemon keeps running in the foreground, so open a second terminal for the following commands. You can now see the details of your node using $ ipfs id . Let’s check out our peers:

$ ipfs swarm peers

Note that they all have a unique hash as their ID. You can inspect them with $ ipfs id <insert hash>

The IPFS daemon also sets up a local HTTP gateway, which allows you to interact with the IPFS network through your browser. The default port is 8080.

Let’s look at our files through the web-browser:

localhost:8080/ipfs/<hash of file>

You should now see your file. As before, you can read the IPFS whitepaper on the network by inserting the hash of its content (QmV9tSDx9UiPeWExXEeH6aoDvmihvx6jD5eLb4jbTaKGps). Try for yourself!

The daemon also comes with a nice Web UI for your node. The default URL is:

127.0.0.1:5001/webui

Explore it a bit! See where all your peers are, try uploading files using the UI instead of the command line.

You can see IPFS has come far, but we’re not there yet. Even though it is quite accessible, a regular user shouldn’t even notice that something is different; that is the only way for the internet to migrate protocols. Juan Benet stresses this a lot, and he’s right. I, however, am already very excited about all this stuff. Especially in conjunction with blockchains, all this infrastructure is very important.

Thanks for reading! If you liked it, consider following me. I plan to continue publishing here on Medium.

I also have a Youtube channel on which I post tech-related videos. Check it out here.