IPFS Scalability

Building an IPFS Layer 2

Kyle Tut
Pinata
Jul 17, 2019


Speed and availability: together, they are the biggest challenge in IPFS. With hundreds of applications connecting to our IPFS nodes, we often find ourselves at the very limits of what IPFS can handle. Drawing from this experience, this post lays out how we think about IPFS scalability at Pinata. We will also discuss how Web3 applications can use nodes like ours alongside their own nodes to build applications that are both fast and decentralized.

IPFS Doesn’t Understand Decentralization

Okay, first things first. We have to clarify something about the IPFS protocol.

IPFS itself doesn’t understand the difference between “decentralized” and “centralized” nodes.

A node is a node, no matter where it is hosted. When IPFS requests data from the network, it retrieves that data from whichever nodes are best able to serve it. IPFS doesn’t limit itself to decentralized nodes just because the Web3 app using the protocol wants its data to come from decentralized nodes. This is true for blockchains, too. Ethereum doesn’t differentiate between a full node centralized in the cloud and a full node decentralized in someone’s basement. A node is a node, no matter where it is hosted.
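To make the idea concrete, here is a toy sketch of hosting-agnostic selection. It is not how IPFS chooses peers internally; the Provider class, the latency figures, and the pick_provider helper are all hypothetical and exist only to show that the "hosting" label plays no part in the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    peer_id: str
    hosting: str       # "cloud" or "basement"; informational only (assumed label)
    latency_ms: float  # hypothetical measured response time


def pick_provider(providers):
    """Return the fastest provider, ignoring where it is hosted."""
    if not providers:
        raise ValueError("no providers hold this content")
    return min(providers, key=lambda p: p.latency_ms)


providers = [
    Provider("laptop-a", "basement", 420.0),
    Provider("server-b", "cloud", 35.0),
    Provider("laptop-c", "basement", 28.0),
]

# The selection never reads the hosting field, so a well-placed
# basement node can win just as easily as a cloud server.
best = pick_provider(providers)
```

In this made-up data set the winner happens to be a basement laptop, which is the point: the protocol cares about who can serve the data, not about who owns the hardware.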

Highly Connected Nodes Are Faster

Second, not all nodes within the IPFS network have the same ability to retrieve and store data. This is because computing resources matter. As an example, a laptop can’t serve data as well as a server can because the laptop has fewer resources to do so. With IPFS, resources become important once you understand how data is found within the network itself. Put simply, the more connections a node has with other nodes, the quicker it will be able to find data and have its own data found. However, maintaining these connections requires more computing resources. This, in turn, gives servers an advantage in retrieving and serving data quickly. Again, the same situation can be seen with blockchains. An Ethereum mining facility’s resources are better at mining than someone’s mining equipment in their basement. The amount of computing resources matters.
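The effect of connectivity on discovery can be illustrated with a small graph model. This is a simplified sketch, not a model of the actual IPFS routing layer: it assumes a ring of 20 sparsely connected peers, then adds one hypothetical "hub" node connected to everyone, and counts the hops a request needs to reach the peer holding the data.

```python
from collections import deque


def hops(graph, start, target):
    """Breadth-first search: hop count from start to target, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def ring(n):
    """Each peer knows only its two immediate neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def add_hub(graph, hub):
    """Return a copy of the graph with one node connected to every peer."""
    connected = {peer: set(links) for peer, links in graph.items()}
    connected[hub] = set(graph)
    for peer in graph:
        connected[peer].add(hub)
    return connected


sparse = ring(20)
dense = add_hub(sparse, "hub")

sparse_hops = hops(sparse, 0, 10)  # request travels peer by peer around the ring
dense_hops = hops(dense, 0, 10)    # request goes through the highly connected hub
```

In this toy network the request needs 10 hops without the hub and 2 hops with it. The hub pays for that shortcut by holding 20 connections instead of 2, which mirrors the resource trade-off described above.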

The IPFS base layer is resilient through decentralization.

A Decentralized IPFS Base Layer

To recap, IPFS views decentralized and centralized IPFS nodes as one and the same. Furthermore, highly connected IPFS nodes are better at retrieving and serving data but require more resources. The question is, how do we leverage these two properties of IPFS to build an application that is both fast and decentralized?

We believe that Web3 apps should build with decentralization as a base layer.

With IPFS, we’ve seen that Web3 apps typically build a base layer of IPFS nodes that store master copies of app data. This base layer, maintained by the app’s servers or its users’ computers, is resilient and decentralized. But it’s also slow. This is where most Web3 applications find themselves today. They’ve built a decentralized application but lack the speed, availability, and user experience that users want from an application. This is caused by an architectural choice to rely on decentralized nodes that struggle to stay highly connected. The good news is that this problem can be alleviated without sacrificing the resiliency of the decentralized base layer. We do it by creating a fast layer of highly connected nodes that applications can use to mirror the same data they’re already hosting. When users attempt to retrieve application data from the IPFS network, Pinata’s nodes will often be able to provide it faster than the pre-existing base layer of nodes. This improves the user experience while also reducing stress on the decentralized base layer of nodes. We think of it as an IPFS Layer 2.
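The mirroring idea can be sketched as a small in-memory model. Everything here is hypothetical (the Node class, the mirror and fetch helpers, the placeholder identifier "cid-example", and the latency numbers); it does not call IPFS or any Pinata API. It only shows the two properties claimed above: the fast layer answers first when it is up, and the base layer still answers when it is not.

```python
class Node:
    """A toy node that stores content by identifier."""

    def __init__(self, name, latency_ms, online=True):
        self.name = name
        self.latency_ms = latency_ms  # assumed, illustrative figure
        self.online = online
        self.store = {}

    def pin(self, cid, data):
        self.store[cid] = data


def mirror(cid, source, target):
    """Copy content the base layer already hosts onto a fast-layer node."""
    target.pin(cid, source.store[cid])


def fetch(cid, nodes):
    """Serve from the fastest online node that holds the content."""
    holders = [n for n in nodes if n.online and cid in n.store]
    if not holders:
        raise LookupError(f"{cid} is unavailable")
    best = min(holders, key=lambda n: n.latency_ms)
    return best.name, best.store[cid]


base = Node("base-laptop", latency_ms=400)
fast = Node("layer2-server", latency_ms=30)

base.pin("cid-example", b"app data")   # master copy stays on the base layer
mirror("cid-example", base, fast)      # fast layer holds a mirror

served_by, _ = fetch("cid-example", [base, fast])    # fast layer answers
fast.online = False
fallback_by, _ = fetch("cid-example", [base, fast])  # base layer still answers
```

The master copy never leaves the base layer, so losing the fast layer costs speed but not data.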

IPFS layer 2 connects with the base layer to provide speed and availability.

A Fast IPFS Layer 2

Remember how IPFS doesn’t discriminate between nodes? We use that to our advantage to easily connect with Web3 apps and provide speed and availability to their IPFS nodes. How? We run nodes that are connected to up to 100 times as many peers as most other nodes. Because of this, our nodes can retrieve and serve data from the network quickly. This is IPFS scalability in production. Our IPFS nodes work together with our users’ decentralized nodes to provide a Web3 that is fast, secure, and more resilient than Web2. So, what does that look like? Below, we will look at a few scenarios.

Speed Scenario

Imagine a user wants to listen to a podcast on their phone through a Web3 app that uses IPFS. The app’s IPFS network has ten IPFS nodes that it could retrieve the podcast from. Nine of them are users’ decentralized laptop nodes. However, the app may have a hard time discovering data on these nodes because they aren’t highly connected with other IPFS nodes in the network. The tenth node is a Pinata node that is highly connected and can quickly send the podcast to the user’s phone. In this case, it’s likely that the app will be able to find the data on the Pinata node much more quickly, thereby improving the user experience.

Availability Scenario

Now, imagine a user wants to send a message through a Web3 app using IPFS from their phone to their friend’s phone. However, the friend’s phone is offline and the message can’t be received. Where should the message be stored in the meantime? Should the message stay on the sending friend’s phone until the receiving friend comes back online? What if the sending friend is offline when the receiving friend comes back online? How does the message get to the friend without delay? We’ve accidentally created decentralized phone tag! To solve this problem, Pinata’s node could be used as an availability node that temporarily stores the message so the other phone can receive it as soon as it comes back online.
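A minimal sketch of that store-and-forward pattern follows. The Phone and AvailabilityNode classes and the send function are hypothetical stand-ins, not part of IPFS or of any Pinata product; the model simply holds a message while the recipient is offline and hands it over later, even if the sender has gone offline by then.

```python
class Phone:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.inbox = []


class AvailabilityNode:
    """An always-on node that temporarily holds undelivered messages."""

    def __init__(self):
        self.pending = {}

    def hold(self, recipient, message):
        self.pending.setdefault(recipient.name, []).append(message)

    def deliver_pending(self, recipient):
        # Hand over everything held for this recipient and forget it.
        for message in self.pending.pop(recipient.name, []):
            recipient.inbox.append(message)


def send(message, recipient, relay):
    """Deliver directly when possible, otherwise park it on the relay."""
    if recipient.online:
        recipient.inbox.append(message)
    else:
        relay.hold(recipient, message)


relay = AvailabilityNode()
alice, bob = Phone("alice"), Phone("bob")

bob.online = False
send("hello", bob, relay)   # bob is offline, so the relay holds the message

alice.online = False        # the sender drops offline too
bob.online = True
relay.deliver_pending(bob)  # bob still gets the message
```

Because the relay is the only party that has to stay online, neither phone needs to be reachable at the same time as the other.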

Speed and Availability Scenario

Finally, what if someone wanted to host The Best Website Ever from an IPFS node on their laptop? Awesome! Totally possible. They get it all set up, and their first visitor emails them saying their website is slow. Unfortunately, their laptop just isn’t connected enough with the IPFS network for the website to be found quickly. They think to themselves, “Oh, well. This is just part of the Web3 experience,” and shut their laptop. A few minutes later, they receive another email, this time on their phone. Their second visitor tells them their website is no longer online! They forgot they needed to keep their laptop up and running at all times! In this instance, this person could use Pinata to mirror their laptop node’s website data. Our nodes would then provide speed when the laptop isn’t the fastest option and keep the website online when the laptop is offline.
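A quick back-of-the-envelope calculation shows why a mirror helps uptime. The figures below are invented for illustration (a laptop online 40% of the time and a mirror online 99.9% of the time), and the formula assumes the two nodes go offline independently of each other, which real deployments may not satisfy.

```python
def combined_availability(*node_uptimes):
    """Chance that at least one node is online, assuming independent outages."""
    all_down = 1.0
    for uptime in node_uptimes:
        if not 0.0 <= uptime <= 1.0:
            raise ValueError("uptime must be a fraction between 0 and 1")
        all_down *= 1.0 - uptime
    return 1.0 - all_down


# Hypothetical uptimes, chosen only to illustrate the arithmetic.
laptop_only = combined_availability(0.40)
with_mirror = combined_availability(0.40, 0.999)
```

Under these assumptions the site is reachable only when the laptop is open in the first case, while in the second case it is unreachable only when both nodes are down at once, which works out to 1 - (0.60 × 0.001).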

Just Another Node in Your Network

We believe that Web3 apps should build with decentralization as a base layer. With the current internet becoming more fragile and increasingly susceptible to outages, decentralization is the only way for apps to continue to run. We believe an app’s base layer of decentralized IPFS nodes should replicate master copies of the app’s data to maintain control and ensure uptime. To solve for speed, these apps can connect with a Layer 2 of IPFS nodes that provides speed and availability without sacrificing decentralization. IPFS can scale. It just requires a little bit of teamwork. If you have questions or need any help building with IPFS, be sure to jump into our Slack.
