The Foundation For The Next Killer Application?
The race is on to discover blockchain’s elusive killer app. With its multitude of applications, spanning from DeFi to GameFi, and other smart contract and NFT-enabled use cases such as social media, the possibilities are simply boundless.
However, those kinds of applications need a solid foundation, and there’s one use case that deserves more recognition than it currently receives: on-chain storage. This approach safeguards the integrity, accessibility, and security of your data by distributing files across a peer-to-peer network. In contrast to conventional storage systems that house data on centralized servers, on-chain storage greatly reduces the risks of censorship, data loss, and data theft.
Let’s now explore the world of storage networks and their potential to serve as the bedrock for the on-chain applications of the future.
The Data Stack
Before we dive into the nitty-gritty of on-chain storage, let’s take a quick peek at the modern data stack and how it applies to our on-chain world.
The four pillars of modern data analysis consist of an ELT-driven data collection mechanism, a robust data warehousing system, a transformational data toolset, and a cutting-edge business intelligence suite. These categories work in harmony to extract data from databases and third-party tools, store it in a secure and reliable way, transform it into analysis-ready models, and finally produce in-depth reports using these models.
The Web 3 Data Stack
In the traditional business world, data is primarily generated by users and organized in product databases. This includes data such as user emails and addresses when they sign up, as well as sales and revenue figures from SaaS products like Salesforce and Stripe. Recently, tools like Segment have emerged to capture event data like user clicks.
But in the world of blockchain, data is immutably stored on the ledger, providing information on all product data found in traditional systems, including users, wallet addresses, and event data.
In traditional data extraction methods, data is often gathered through tools that copy and transfer data from various sources into databases.
On the other hand, in the on-chain world, your data is stored immutably on a distributed ledger that relies on a network of nodes to validate transactions. With this architecture, every node on the network has access to the record of data on the blockchain. To extract this data, one can either operate a node and extract the data or use an API/RPC service to make calls to the node and obtain the data.
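To make the API/RPC route concrete, here is a minimal sketch of how a client might ask a node for data. It assumes an Ethereum-style node that speaks JSON-RPC 2.0 over HTTP; the helper names `build_rpc_request` and `call_node` are made up for illustration, and the local URL in the comment is only a placeholder.

```python
import json
import urllib.request


def build_rpc_request(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request body (the format Ethereum-style nodes accept)."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}


def call_node(rpc_url, method, params):
    """POST the request to a node or RPC provider and return its 'result' field."""
    body = json.dumps(build_rpc_request(method, params)).encode("utf-8")
    request = urllib.request.Request(
        rpc_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())["result"]


# Request body for the latest block, without full transaction objects.
latest_block_request = build_rpc_request("eth_getBlockByNumber", ["latest", False])

# Against a node you run yourself (placeholder URL, not executed here):
# call_node("http://localhost:8545", "eth_blockNumber", [])
```

The same request works whether the endpoint is your own node or a hosted provider; only the URL (and usually an API key) changes.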
Although node providers and API solutions have traditionally been centralized, with one service provider managing the infrastructure and service, we have seen decentralized alternatives emerge. For instance, Ankr and Pocket Network (POKT) are decentralized node providers where users pay native tokens to access RPC/API calls to nodes. Meanwhile, individuals are incentivized to run nodes by staking tokens to participate in the infrastructure and earn a share of the revenue stream. Furthermore, Infura recently announced plans to create a decentralized service for its API calls.
Loading And Transforming (Indexing)
Accessing previous transaction data on-chain can be a challenge due to the distributed nature of the networks. This is where indexing services come into play, as they store and index the network to allow users to quickly access relevant information. This is crucial for building fast applications and ensuring performant data collection.
However, we are seeing a divergence between centralized and decentralized indexing services. Tokenomic models such as those used by The Graph incentivize participants to provide infrastructure by rewarding them with native tokens for providing query services. These services offer the ability to query data using a specific language. We are seeing two main types of query languages: SQL and GraphQL. GraphQL has gained popularity since its open-source release in 2015, but SQL remains the preferred tool for data science. As a result, many enterprise-grade solutions are now available that allow users to query data using SQL.
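As a rough illustration of what an indexer does, the sketch below loads a few transfer records into an in-memory SQLite database, indexes them by sender, and answers a question with SQL. The records are hypothetical, hand-written stand-ins for decoded chain data, and the table layout is an assumption rather than the schema of any real indexing service.

```python
import sqlite3

# Hypothetical transfer records: (block, sender, recipient, amount).
TRANSFERS = [
    (100, "0xaaa", "0xbbb", 5),
    (101, "0xbbb", "0xccc", 2),
    (102, "0xaaa", "0xccc", 7),
]


def build_index(rows):
    """Load raw records into a table and index them for fast lookups by sender."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transfers "
        "(block INTEGER, sender TEXT, recipient TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO transfers VALUES (?, ?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_sender ON transfers (sender)")
    conn.commit()
    return conn


def total_sent(conn, sender):
    """Answer an analytics question with plain SQL instead of rescanning the chain."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM transfers WHERE sender = ?",
        (sender,),
    ).fetchone()
    return row[0]


conn = build_index(TRANSFERS)
print(total_sent(conn, "0xaaa"))  # 12
```

A GraphQL-based service exposes the same kind of pre-indexed data; the difference lies mainly in the query language the user writes.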
Indexing services also need somewhere to keep the data they serve. Some, such as The Graph, store indexed data using a native protocol, while others rely on traditional data warehouses.
However, we’re also witnessing the emergence of decentralized storage options. These mechanisms allow data to be stored in a peer-to-peer manner, protected by encryption, and kept accessible at all times. As more users join the network, data persistence is ensured through decentralization, with the data stored in multiple places.
Tokenomic models, such as those implemented by Sia, Arweave, Filecoin, and Storj, incentivize storage providers by rewarding them with tokens for offering hosting. Providers are required to stake a certain amount of tokens to ensure data is stored correctly and the network is maintained. To make stored data accessible and manageable, players like Filebase and Web3.storage have built solutions on top of this stack that let users work through familiar interfaces such as HTTP, REST APIs, and Amazon S3-compatible APIs. Additionally, solutions like Pinata and Ceramic provide new mechanisms for interacting with stored data. Pinata, for example, allows for NFT data to be stored, while Ceramic provides user “data streams.”
These top layers of the storage stack enable other data to be stored in a decentralized manner and accessed for analysis, paving the way for a new era of blockchain-based data management.
Web3 Data Stack Use Cases
Once we’ve transformed the blockchain data, it becomes instrumental in powering a plethora of applications. These services make use of data to help users extract insights, visualize data, and add value to product insights.
The first wave of data usage came from explorers. They served as a simple way to search and look up transaction data on the blockchain. We’ve seen this technology evolve from basic explorers that show transactions, like Etherscan, to more advanced Web 3 search engines such as Neeva.xyz.
In addition, platforms such as Dune and Nansen have emerged that allow users to build analytics dashboards and track tokens with ease, backed by powerful data visualization features.
For enterprise-grade users, platforms like Metrika and Amberdata provide advanced analytics that build on top of the web3 data stack. They are tailor-made for enterprise-level users who require more robust analytics capabilities.
Now that we have a high-level overview of the web3 data stack, let’s dive into the reasons why we believe on-chain storage networks could be the most pivotal use case for blockchain technology yet.
Everything Needs Storage
As higher-level applications on blockchains evolve beyond their rudimentary DeFi predecessors, the demand for robust infrastructure and middleware grows. Here at Moonrock Capital, we’ve written several times about our middleware ecosystem hypothesis, which has taken shape especially around facilitating decentralized applications.
It seems that the chatter about state machines has been overwhelming the discourse as of late. However, we must say that we have observed a conspicuous absence of buzz surrounding storage. Yet, let us be clear: we firmly believe that storage will play a crucial role in the forthcoming era of blockchain.
The Web3 Application Stack
Ethereum has successfully defined a novel dApp data architecture that’s worth discussing. This architecture comes with a few notable improvements that allow you to build immutable, transparent, and permissionless applications.
However, let’s shed some light on a crucial aspect of blockchain design — the trade-offs. As an example, Ethereum’s design has led to elevated data storage costs. These limitations make it arduous to cater to the demands of dApps that require substantial storage.
As dApps become more sophisticated with their utilization of dynamic data, we can expect to see an influx of users come onboard. And with mainstream adoption, we’ll witness an emergence of storage-heavy dApps across various verticals, including the likes of social and gaming.
In 2006, Clive Humby declared that data is the new oil. Fast forward to today: with the rise of eCommerce, web apps, and IoT/AI, data is being generated in massive quantities. But to extract value from it, firms have to invest heavily in storage and management. They can build in-house data centers or rely on cloud services such as Google Drive, Dropbox, and AWS.
But traditional storage comes with issues. Large databases are prone to attacks, with losses occurring every year. And if the service provider goes offline or restricts access, owners lose control.
Now, why are Storage Networks so damn important? Well, as blockchain technology continues to gain momentum and more use cases are developed, the need for a reliable storage solution becomes increasingly crucial. And that’s where Storage Networks come in.
Here’s how it works:
When utilizing traditional cloud storage, your file is uploaded through the internet to a server, and whenever you require it, you must request it from the same server. With on-chain storage, the process is significantly different. Upon upload, the data is encrypted, and cryptographic hashes are used to identify and verify it; only your private key can grant you access to the contents. This way, unauthorized entities are kept at bay, protecting the privacy and security of your files.
What’s more, the files are broken into small pieces and distributed to different nodes on the network, a process we call sharding. Because no single node holds the complete dataset, censorship and privacy intrusion become far harder. With only encrypted fragments of your data sitting on any one machine, no individual node operator can read your information or restrict your access.
Finally, the sharded bits of your file are sent to several nodes located in different geographical regions. This approach ensures that the components of your file are readily available whenever you need them, and the network will retrieve and reassemble them for you to download seamlessly.
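The upload flow described above can be sketched in a few lines. This is a simplified illustration, not how any particular network implements it: the function names are made up, nodes are plain in-memory dictionaries, placement is simple round-robin, and the encryption step is left out (a real client would encrypt the data before sharding it).

```python
import hashlib


def shard(data: bytes, shard_size: int):
    """Split a file into fixed-size pieces (the last piece may be shorter)."""
    return [data[i:i + shard_size] for i in range(0, len(data), shard_size)]


def distribute(shards, node_count: int, replicas: int):
    """Place every shard on `replicas` different nodes, round-robin.

    Returns the nodes (dicts mapping hash -> shard) and a manifest listing
    the shard hashes in file order.
    """
    if not 1 <= replicas <= node_count:
        raise ValueError("replicas must be between 1 and node_count")
    nodes = [{} for _ in range(node_count)]
    manifest = []
    for index, piece in enumerate(shards):
        digest = hashlib.sha256(piece).hexdigest()
        manifest.append(digest)
        for offset in range(replicas):
            nodes[(index + offset) % node_count][digest] = piece
    return nodes, manifest


def reassemble(nodes, manifest):
    """Fetch each shard from any node that still has it, verify it, and join."""
    parts = []
    for digest in manifest:
        piece = next((node[digest] for node in nodes if digest in node), None)
        if piece is None or hashlib.sha256(piece).hexdigest() != digest:
            raise ValueError(f"shard {digest[:8]} is missing or corrupted")
        parts.append(piece)
    return b"".join(parts)


original = b"on-chain storage networks " * 4
nodes, manifest = distribute(shard(original, 8), node_count=4, replicas=2)
nodes[0].clear()  # simulate one node going offline
restored = reassemble(nodes, manifest)
```

Because every shard lives on two nodes in this example, losing a single node still leaves a full set of pieces to rebuild the file from.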
Let’s take a look at the possible benefits of on-chain storage, along with the trade-offs that come with them.
Storage Networks are incredibly scalable, enabling them to handle the ever-increasing demand for data storage by simply adding more nodes or servers to the network.
But we must tread carefully. Storage networks may face performance and scalability limitations due to the inherent constraints of the underlying blockchain technology. These constraints can take the form of transaction throughput and block size limitations, which may hinder their ability to accommodate the growing storage demands of the modern era.
Distributed storage solutions can often prove more cost-effective than traditional centralized storage systems, as they bypass the need for pricey data center infrastructure and can leverage the untapped storage capacity of participating nodes.
However, it would be remiss of us not to mention the potential costs associated with participating in a blockchain-based storage network. These costs can come in the form of transaction fees and energy consumption, and they can be significant enough to outweigh the benefits for some users. It is important to carefully weigh the potential costs and benefits before diving headfirst into the world of blockchain storage.
In this digital epoch, safeguarding data is paramount. Inadequate data storage schemes can result in breaches, identity theft, and other quandaries that can weigh heavily on both companies and users.
Centralized storage services promised a solution to data storage challenges but have repeatedly fallen short. A notorious breach of Dropbox, one of the largest cloud storage providers globally, resulted in the credentials of roughly 68 million users leaking onto the dark web.
On-chain, peer-to-peer networks are theoretically more secure than their centralized counterparts. This is what makes them the perfect choice for safeguarding sensitive data from malevolent actors.
To compromise a decentralized storage service, cybercriminals would have to gain access to a large number of independent nodes rather than a single server. The costs involved in such an exploit are often sufficient to dissuade hackers from attempting to pilfer your data.
If it wasn’t crystal clear by now, entrusting confidential data to a company’s server is a recipe for privacy debacles. Even if the data is encrypted, the encryption key is often still held on the provider’s server. Astute hackers can steal these keys and access your confidential information.
On-chain storage comes to the rescue, solving this quandary. The data is sliced and diced into various bits to shield it from prying eyes. To retrieve the original file, these bits must be pieced together — a task that’s insurmountable without the correct permissions or private key.
At first glance, one might think that centralized cloud storage is the way to go. After all, you can easily access your files on Google Drive or Dropbox just by logging in. But there are potential inefficiencies lurking beneath the surface of such centralized data storage and management.
Your valuable information is kept in just a handful of data centers scattered across the globe. Any disruption to these systems, and accessing your data becomes impossible. A simple distributed denial-of-service (DDoS) attack can take down seemingly robust networks and cut off access to centralized servers. Furthermore, the location of data centers can also be problematic for users located in far-flung areas. They may have to expend more bandwidth just to retrieve files stored on a cloud storage platform.
This is where on-chain storage services shine. They operate on a robust peer-to-peer (p2p) architecture, with multiple nodes located across various regions holding copies of a file. Even if some nodes go offline, your information remains accessible. These systems are fault-tolerant, so a few malfunctioning nodes cannot disrupt their operation. Blockchain-powered storage also has the potential to reduce bandwidth usage: because the nodes holding your files are distributed worldwide, a copy can often be found closer to your region, which reduces the effort required to download files.
Another important property is data integrity. In other words, files should remain accessible and in their original form for years to come.
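A back-of-the-envelope calculation shows why replication helps. The sketch below assumes, as a simplification, that every node is online independently with the same probability; real networks have correlated failures, so treat the result as an optimistic estimate. The function name is made up for illustration.

```python
def availability(node_uptime: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent nodes is online."""
    if not 0.0 <= node_uptime <= 1.0:
        raise ValueError("node_uptime must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The file is unreachable only if every replica is offline at once.
    return 1.0 - (1.0 - node_uptime) ** replicas


single_copy = availability(0.9, 1)
three_copies = availability(0.9, 3)
```

With nodes that are each online 90% of the time, one copy is reachable 90% of the time, while three copies push that to roughly 99.9% under the independence assumption.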
However, centralized storage systems struggle with implementing data integrity. Why, you ask? It’s because traditional storage methods rely on a location-specific approach to storing information.
Let’s say you need to access a webpage on a site, like moonrockcapital.io. You’d typically access it by entering the link to the file path in your browser. This link points to where the webpage (i.e., the file) is hosted. If the file is in its original location, you should get the webpage. But what happens if the server holding the file experiences an issue or if the webpage gets moved to another location? The data simply becomes unavailable, resulting in a frustrating “dead link” error.
To address this issue, on-chain storage systems use a content-specific approach. This approach identifies data by its content, not its location. Every piece of content has a unique alphanumeric string called a hash, which serves as a unique identifier for data. In contrast to centralized storage, where you access a specific location to retrieve data, with on-chain storage, you ask anyone on the network who has a version of the webpage to make it available. This approach ensures that data remains accessible and intact, no matter where it’s stored.
Because hashes are effectively unique to content, it is computationally infeasible for anyone to pass off a fake file as genuine. Any alteration to the content would result in a different hash, and therefore a link that no longer matches the original. As long as at least one node keeps a copy, the data remains accessible and verifiably intact, making on-chain storage a great solution for ensuring data integrity.
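Content addressing is easy to demonstrate with a toy example. The sketch below is a minimal in-memory store that uses a SHA-256 hex digest as the identifier; the class and method names are made up for illustration, and real storage networks use their own identifier formats and retrieval protocols.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: data is looked up by its hash, not its location."""

    def __init__(self):
        self._blobs = {}

    def put(self, content: bytes) -> str:
        """Store content and return its identifier (the SHA-256 hex digest)."""
        content_id = hashlib.sha256(content).hexdigest()
        self._blobs[content_id] = content
        return content_id

    def get(self, content_id: str) -> bytes:
        """Return the content, refusing anything that no longer matches its hash."""
        content = self._blobs[content_id]
        if hashlib.sha256(content).hexdigest() != content_id:
            raise ValueError("content does not match its identifier")
        return content


store = ContentStore()
page_id = store.put(b"<html>Moonrock Capital</html>")
altered_id = store.put(b"<html>Moonrock Capital!</html>")
```

The two identifiers differ even though the pages differ by a single character, and a requester can verify any copy it receives by hashing it again, no matter which node served it.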
As we have seen by now, the use cases for Storage Networks are nearly endless. You can store anything from personal data to medical records, financial information, and media files. And if that wasn’t enough, Storage Networks also provide a new source of income for users. You can rent out your unused storage capacity to others on the network, creating a whole new revenue stream.
Centralized storage has been a go-to option for a long time, but its drawbacks are increasingly exposed. Luckily, on-chain storage networks offer a superior, cost-effective, and more secure alternative for data storage.
Although decentralized storage networks are in their nascent stage, their adoption is on the rise. As the need for efficient data storage skyrockets, decentralized storage models will soon become a necessity for individuals and businesses alike.
Who We Are
Moonrock Capital is a Blockchain Advisory and Investment Firm, incubating and accelerating early stage startups since 2019.
Disclaimer: None of the information contained here constitutes an offer (or solicitation of an offer) to buy or sell any currency, product or financial instrument, to make any investment, or to participate in any particular trading strategy.