Web 3.0 Infrastructure at a Glance

Dave Hafford
May 10, 2022


Welcome to Neuron Activation!

Welcome to the first post of Neuron Activation! When I first decided to write this blog, I concluded that while the world may not be in desperate need of another VC blog from an MBA, there is precious little content that breaks complex concepts down into simple parts.

The purpose of my blog is to share my research, musings, and thoughts in what I hope is an easy-to-digest, entertaining fashion. My goal is to address technical topics such that they are easy to comprehend, as well as provide tangible insights for those looking to educate themselves on what I cover.

A bit of background on myself: I’ve worked in venture capital, investment banking, and as the CFO of a small company in NY before attending business school here in Pennsylvania. In terms of useless knowledge, I’ve also completed all four sections of the CPA, was a sponsored skateboarder, and have spent god knows how much time as a competitive gamer in various FPS titles.

A note before we dive in

This piece seeks to provide an overview of the blockchain/web 3.0 infrastructure stack. The blockchain ecosystem is vast, and products and businesses that provide resources and infrastructure solutions to emerging protocols consist of only a small part of that. However, this small part is critical to the widespread adoption of blockchain technology and the accompanying benefits of providing trustless solutions and disintermediation across different industries and use cases.

The below thoughts and conclusions come from consulting various subject matter experts in the space, including startup founders, product experts, developers, classmates at Wharton, and academic researchers. Some of this came from my participation in the Wharton program Venture Foragers, along with contributions from my own personal network.

It is worth noting that this topic is by definition somewhat technical, as it delves into the underlying infrastructure behind blockchains, an inherently complex topic in and of itself. While my goal is for this not to require any technical knowledge or specialized expertise, it does assume some very basic, high-level understanding of programming and blockchain terminology (i.e. what an L1 is, what an NFT is, etc.). If you don’t have this understanding, are starting from square zero, and still would like to read, I’d highly recommend the following primers. First, CBInsights does a good job of giving a high-level overview of blockchain. Second, one of the professors at Wharton (consulted for this piece as well) did a fantastic job of providing a DeFi primer. Last but not least, Gemini offers a “cryptopedia” that I’ve found to be a well-organized overview of definitions relating to blockchain, crypto, and so on. With all of that said, my hope is that this provides a solid base of information without requiring any significant amount of expertise.

Blockchain state of the union & why we should care

If you are interested in blockchain investing, or just in the state of the technology, it’s important to put some numbers around the meteoric rise of blockchain in recent history. Per Blockdata, as of the end of 2021, the total crypto market cap increased from $760B to $2.3T, and the total volume of crypto payments increased 100% from 2020. ~$250B (that can be measured) was secured by the top 8 crypto custody providers in 2021. The Bitcoin network surpassed PayPal in terms of quarterly volume processed by 61%, and Bitcoin’s Lightning Network capacity increased 210% from 1,058 BTC to 3,300 BTC. In 2021, users sent at least $44.2B worth of cryptocurrency to ERC-721 and ERC-1155 contracts, the two types of Ethereum smart contracts associated with DeFi/NFT marketplaces and collections.

On the consumer-facing side of things, TVL (total value locked, a metric commonly used to estimate market value) in the DeFi industry increased 1,100% during 2021 from $21.1B to $260B, and stablecoin supply increased 2,651% from $5.4B to $148B. NFT sales increased 5,100% from $340M to $17.7B in 2021 (although it is worth noting that some volume there is artificially inflated). DAOs have also emerged as a dominant market force. Per analytics tool DeepDAO, there are 4,832 DAOs with a total of $12.4B in digital assets as of May 10, 2022. Per CoinMarketCap, DAOs represent $17.4B of market cap as of May 10, 2022.

In the world of institutional capital (i.e. venture funds, growth equity funds, and so on), money has continued to pour into blockchain and crypto companies. 2021 saw $30.5B of funding into blockchain businesses, which is more than the total amount raised from 2017 to 2020. We even saw a $1B capital raise in December for NYDIG, and FTX raised more than $1B across two funding rounds. More recently, Yuga Labs (the company behind the Bored Ape Yacht Club) raised $450M in seed funding, marking one of the largest seed rounds in history. The first quarter of 2022 saw capital inflows from VCs of over $14.6B, around 48% of all the capital invested during the prior year (source: Cointelegraph).

From the development side of things, the number of web 3.0 developers is at an all-time high and has been growing faster than ever in recent months. Per Electric Capital, over 18,000 monthly active developers commit code in open-source blockchain projects, with 34,391 new developers committing code in 2021. The growth of web 3.0 developers has been incredibly strong, but they still represent a very small percentage of software engineers across the world, and we are very much in the earliest innings of web 3.0’s evolution. With all of that said, it is worth noting that the development landscape remains fragmented amongst different protocols and platforms.

Source: Electric Capital

The developer ecosystem has also seen some interesting shifts in protocol and project focus. Ethereum and Bitcoin added developers in 2021, growing by 42% and 9% respectively. Ethereum continues to have the largest ecosystem of tools, apps, and protocols, with a developer community 2.8x the size of the 2nd largest ecosystem. Ethereum draws 20% of total new crypto developers and has the best retention of developers who stayed beyond year 4.

Source: Outlier Ventures

On monthly active developers (MAD), per Outlier Ventures, Cardano edged out Solana and Ethereum, with a 29.7% increase to 131 MAD in 2021 compared to 101 MAD in 2020. Just below Cardano, Solana also grew materially, with 131 MAD reported in 2021 compared to 22 MAD in 2020. Ethereum saw significant growth as well, going up 25.6% from 103 MAD to 130 MAD last year. Polkadot also showed an impressive jump, rising from 88 MAD to 126 MAD, an average increase of 44.1% per year across a 2-year period. So-called “Ethereum killers” Tron, EOS, Komodo, and Qtum have seen a decrease in core development metrics.

Multi-chain protocols like Polkadot and Cosmos saw a rise in core development and developer contribution, maintaining the growth achieved over the past year. Avalanche, the latest big competitor in this space, saw tremendous growth, increasing weekly commits and monthly active developers by 4x over the year, arriving at around 50% of the level of Cosmos and Polkadot.

Decentralized storage protocols like Filecoin and Siacoin also experienced a rise in development, developer contribution, and adoption. In the year of its public launch, Filecoin joined the top 5 most actively developed blockchain projects.

Given the above, it’s clear that blockchain technology is here to stay. However, unlike web 2.0, web 3.0 still lacks an established tool-set for developers. Reading and analyzing transactions on Ethereum and other blockchains is usually done in a manual and time-consuming way. There are few established tools to help developers identify bugs or to give them the visibility and level of detail available in web 2.0, and limited infrastructure providers from which to choose. Whilst quite a few companies and protocols are actively enabling the rise and adoption of blockchain, the competitive landscape is still in its very early days, and as such, there is ample opportunity for new winners to emerge.

Furthermore, there is a lack of widely adopted consumer and enterprise dApps (decentralized applications, i.e. applications built on top of blockchains) that have truly gotten to the point where I can say a traditional industry has been thoroughly disrupted. For a fundamentally disruptive technology with use cases in myriad massive sectors like financial services, insurance, infrastructure, healthcare, travel and mobility, retail, agriculture and mining, education, and entertainment, very few “killer dApps” have emerged. I don’t believe it is a controversial statement to say that this can be largely attributed to the nascent stage of the technology and broader adoption, as opposed to a lack of actual value-add blockchain can provide. That broader adoption is fundamentally hindered by the fact that developing and deploying blockchain technology is complicated, opaque, and requires significant blockchain-specific technical expertise. To get to the point of broader adoption, it is clear that there is a strong need for projects, platforms, tools, and underlying infrastructure that enable the creation of dApps.

Defining web 3.0 and its architecture

Web 3.0 has been defined in quite a few ways; the definition popularized by investor Packy McCormick is “the internet owned by the builders and users, orchestrated with tokens.” I personally view web 3.0 as the stack of protocols, applications, and projects that enable a fully decentralized internet. As one might imagine, this shift to true decentralization involves a paradigm shift in the underlying architecture, which I’ll run through briefly below.

At an extremely high level, today’s web 2.0 architecture includes client software (browser/application) and servers providing content and logic, typically controlled by one entity, demonstrated below:

Source: Geshan’s Blog

Web 3.0 architecture is significantly more complex, leveraging what’s called a universal state layer (a data set that acts as a trusted source for internet-based settlement in web 3.0) to allow applications to place some or all of their content and logic onto a blockchain. In contrast to web 2.0, this content and logic are both public and accessible by anyone (although accessing them does require some technical expertise at the time of writing). Additionally, users can exert direct control over this content and logic, and unlike in the past, they don’t necessarily need accounts or privileged API keys to interact with the blockchain.

While the frontend has many similarities, backend programming for a dApp is entirely different from that of a web 2.0 app. In web 3.0, one writes smart contracts that define the logic of applications and transactions, and deploys those contracts onto a distributed ledger that uses a state machine, such as the EVM, to power new block validation and consensus. Web servers and traditional databases take a smaller role in this paradigm, as a significant amount of activity happens on, or around, the blockchain.

This interaction is enabled by wallets (devices or programs that store the public and/or private keys for blockchain transactions) and nodes (connected devices that comprise the decentralized / distributed system making up the component parts of a blockchain). There are two types of agents that constantly monitor and engage with a blockchain — miners and nodes. Miners directly maintain and run the blockchain, whereas nodes monitor and submit transactions to the blockchain. One can think of them as analogous to ISPs versus cloud service providers (e.g. AWS). Similar to how most applications today use AWS’s services to run their application backends, blockchain infrastructure providers, such as Infura or Alchemy, provide a similar level of backend support. When a wallet wants to submit a transaction to the blockchain, or query information from it, it makes a call to the node provider. Applications’ app servers can also interact with node providers themselves, to keep the app’s logic up to date, by making similar RPC calls (RPC will be elaborated on further below, don’t worry!). A brief overview of the key architectural differences can be seen below:

Source: Emre Tekisalp

It’s worth noting that the above architecture overview is extremely high level, and by and large informed by a piece authored by Preethi Kasireddy, which I recommend taking the time to read. In that piece, she outlines and details web 2.0 vs. web 3.0 architecture in significantly more detail. Additionally, Emre Tekisalp authored another highly relevant piece on the topic (and is the creator of the fantastic chart above!)

The Web 3.0 Infrastructure Stack

During the creation of this piece, I asked quite a few friends some variation of the following question: what do you need to build a web 2.0 application? Generally speaking, the answers included an API / application server, authentication layer, database, client-side frameworks, platforms, libraries, and storage. There are myriad resources available today providing access to those component parts.

Using the above core components, one can build out (or at least build a large portion of) a massive number of web 2.0 applications. However, what does this look like in the world of web 3.0?

Perhaps unsurprisingly, as of today, building any web 3.0 app requires an in-depth understanding of blockchain networks, web 3.0 infrastructure, and web 3.0 development environments. That infrastructure is summarized in the below market map:

Note: There is an element of inter-relation between almost all of these layers; as such, this is more of a general breakout of the component parts of the web 3.0 stack, and the section order isn’t in any way set in stone.

Layer 0

Node Infrastructure

At its core, a blockchain is a distributed database in which information is collected together in groups, known as blocks, that hold sets of information. Blocks have storage capacities and, when filled, are closed and linked to the previous block, forming a chain of data known as the blockchain. These blocks are stored on nodes. A node is a computer that carries out the key functions of the network, such as validating transactions, storing records of the blockchain, or submitting votes on network governance; any device connected to the network that performs these functions is considered a node. All nodes on a blockchain are connected and constantly exchange the latest data so that every node stays up to date. In short, node networks form the infrastructure of blockchain technology. While a deep dive into the various types of nodes is beyond the scope of this piece, a brief overview can be seen below:

Source: Bybit Learn
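To make the “chain of blocks” idea concrete, here is a minimal Python sketch of hash-linked blocks. This is a toy illustration only — real chains add block headers, timestamps, Merkle trees, and consensus data — but it shows why tampering with history is detectable:

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    # Hash the block's contents deterministically
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def make_block(prev_hash: str, transactions: list) -> dict:
    return {"prev_hash": prev_hash, "transactions": transactions}

# The genesis block links to nothing; every later block links to its predecessor
genesis = make_block("0" * 64, ["alice pays bob 5"])
block1 = make_block(block_hash(genesis), ["bob pays carol 2"])
assert block1["prev_hash"] == block_hash(genesis)

# Tampering with an earlier block breaks every later link
genesis["transactions"][0] = "alice pays mallory 500"
assert block1["prev_hash"] != block_hash(genesis)
```

Because each block commits to the hash of the one before it, rewriting any historical block forces an attacker to rewrite every block after it as well.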

A full node is a device (i.e. a computer) that contains a full copy of the transaction history of the blockchain. Full nodes provide support and security for the network, downloading the blockchain’s entire history of transactions to monitor and enforce its rules on an ongoing basis. A brief overview of full node subcategories can be seen below (Source: The Block Research).

Pruned Full Nodes

  • Have a defined memory limit, meaning there is a certain amount of blocks that can be stored from the blockchain on the node
  • The oldest blocks are deleted to allow for new blocks to be added when the memory limit is reached

Archival Full Nodes: Keep the entire record of the blockchain history, can be subdivided into:

  • Authority Nodes
    – Authorize other nodes to join a blockchain network
  • Miner Nodes
    – Assist with transaction validation in Proof-of-Work (POW) blockchain networks
  • Masternodes
    – Don’t add new blocks
    – Maintain a blockchain’s ledger
    – Validate transactions
  • Staking Nodes
    – Assist with staking rewards and transaction validation in Proof-of-Stake (POS) blockchain networks

Light nodes (typically downloaded wallets) are connected to full nodes to further validate the information that is stored on the blockchain. The difference here is that light nodes are smaller in size and only hold information about partial blockchain histories. Additionally, there are other types of nodes that carry out special tasks, however, a discussion of those is out of the scope of this piece.

Node infrastructure providers allow developers of new or existing decentralized applications to access the requisite infrastructure without the headache of manually building, maintaining, and troubleshooting a node network.

Remote Procedure Calls (RPC): RPCs represent an integral part of blockchain technology. In distributed computing, a remote procedure call is the process by which a program causes a subroutine or procedure to execute in a different address space — typically another computer in a network — just as if it were a local call, with the network communication details hidden from the user. RPCs, being a form of inter-process communication (IPC), utilize mechanisms that operating systems provide so that processes can manage shared data across networks. Essentially, they allow one program to interact with or access a program on another computer. This is particularly useful for blockchains, which have to serve a plethora of incoming requests from various machines. At a very high level, RPC protocols allow users to query blockchain-related information (such as block numbers, blocks, node connections, etc.) and send transaction requests through an RPC interface.
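To make this concrete, the sketch below builds the JSON-RPC 2.0 envelope that Ethereum nodes accept and decodes a sample response. `eth_blockNumber` is a real Ethereum JSON-RPC method; the response shown is a made-up example, and in practice the payload would be POSTed to a provider’s HTTPS endpoint:

```python
import json

def make_rpc_request(method: str, params: list, request_id: int = 1) -> str:
    # Build a JSON-RPC 2.0 envelope, the format Ethereum nodes expect
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    })

# Ask a node for the latest block number
payload = make_rpc_request("eth_blockNumber", [])

# A provider would respond with a hex-encoded quantity, e.g.:
response = '{"jsonrpc": "2.0", "id": 1, "result": "0xe5a7d3"}'
block_number = int(json.loads(response)["result"], 16)
print(block_number)  # 15050707
```

From the app developer’s perspective, that is the whole interface: serialize a request, send it to whichever node provider you use, and parse the result.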

Companies like Alchemy, Syndica, and Infura provide this infrastructure-as-a-service, allowing developers to focus on high-level app development rather than spending a material amount of time learning the nuts and bolts of establishing and communicating with nodes. It’s worth noting that many RPC providers own and operate their own nodes, and are themselves centralized companies. In the blockchain community, the dangers of this centralization are raised somewhat often (shocking, I know). This centralization introduces a single point of failure that can endanger the liveness of a blockchain. As the thinking goes, should Alchemy / Syndica / Infura experience problems or become the subject of hostile regulation, applications may not be able to retrieve or access data on-chain. In response, decentralized RPC protocols like Pocket and Ankr have been developed; however, they have yet to gain meaningful market share from existing incumbents. Additionally, providers such as Quicknode have plans to decentralize at some point in the future.

A breakdown of different providers, their business models, and feature sets can be found below:

Source: The Block Research

Staking / Validators: The benefits of blockchain (i.e. immutability, distribution, tamper-resistance) rely on a set of distributed nodes validating transactions on-chain (i.e. achieving consensus). However, individuals or entities need to run the actual nodes themselves, incentivized via the tokenomics of the underlying blockchain. In proof-of-stake (PoS) networks, token holders (validators) need to “stake”, or put at risk, their tokens to validate transactions. Validators who verify legitimate transactions are rewarded with newly “minted” tokens, but are penalized (“slashed”) for attempting to push false or incorrect transactions through. This is in contrast to proof-of-work (PoW) networks, where miners compete with one another to solve complex mathematical problems as fast as possible to validate a block.
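The stake-weighting and slashing mechanics can be sketched with a toy example. The validator names, stake amounts, and selection rule below are all made up for illustration; real protocols use far more elaborate selection and penalty logic:

```python
import random

# Hypothetical validators and their staked token amounts
stakes = {"val_a": 320, "val_b": 96, "val_c": 64}

def pick_proposer(stakes: dict, rng: random.Random) -> str:
    # Selection probability is proportional to stake
    validators = list(stakes)
    weights = [stakes[v] for v in validators]
    return rng.choices(validators, weights=weights, k=1)[0]

def slash(stakes: dict, validator: str, fraction: float) -> None:
    # Penalize a misbehaving validator by burning part of its stake
    stakes[validator] -= int(stakes[validator] * fraction)

rng = random.Random(42)
picks = [pick_proposer(stakes, rng) for _ in range(10_000)]
# val_a holds 2/3 of the total stake, so it wins roughly 2/3 of the slots
print(picks.count("val_a") / len(picks))

slash(stakes, "val_b", 0.5)
print(stakes["val_b"])  # 48
```

The point of the toy: influence over block production scales with capital at risk, and misbehavior directly destroys that capital.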

Staking infrastructure requires the same core components as the node infrastructure discussed previously. Nodes, specialized software, cloud storage, hardware, etc. are all necessary components for managing proof-of-stake (PoS) networks. Staking-as-a-service (SaaS) providers build infrastructure, maintain nodes and offer tools that make PoS blockchains more widely accessible to less sophisticated users. It’s worth noting that proof-of-stake blockchains have their own unique types of nodes, detailed below (source: Bison Trails and The Block Research):

  1. Participation Nodes: Participation nodes are the basic building block of proof-of-stake networks. They validate transactions and create blocks, and, in return for executing this work, earn block rewards. This is done by locking a set amount of value that is “staked” in order for the node to become an active participant, or validator, on the network. Essentially, they produce useful work on-chain in exchange for rewards once active.
  2. Read/Write Nodes: Read/write nodes verify transactions, obtain information about transactions (query), and write data such as transfers or smart contract interactions (transactions) to the chain.
  3. Sentry (Proxy) Nodes: Sentry nodes stand between a participation node and the blockchain, allowing the participation node to complete its function while staying private and hidden from the public internet. They protect the participation node from attacks by creating an extra barrier between it and the public internet.
  4. Relay Nodes: Relay nodes serve as hubs for the network’s peer-to-peer (or node-to-node) communication layer. They connect to a participation node and maintain connections to many other nodes in order to reduce transmission time by maintaining open, efficient communication paths.

Services like P2P and Blockdaemon allow less savvy or less well-capitalized users to participate in consensus, usually by pooling funds together. Per Jump Crypto, there is an argument that staking providers introduce an unnecessary degree of centralization; however, it is worth noting that in the absence of such providers, the barrier to entry for running nodes would be too high for the average user, likely leading to an even higher degree of centralization across node networks. While a detailed discussion of staking is out of scope here, a great overview can be found in this Messari piece. Additionally, a breakdown of different staking providers and related companies can be seen below:

Source: The Block Research

Whether it be for implementing a blockchain solution or for staking, the time, cost, and energy required to run nodes is prohibitive for the average crypto enthusiast, leading to reliance on others to handle on-chain validation. Some of the drawbacks are listed below, courtesy of The Block Research and Alchemy.

Expenses

  • $80–100k per year in costs
  • $2–5k per month in Amazon Web Services (AWS) bills
  • $4–5k per month of engineering time

Time-Intensity

  • 25% of engineering resources can be spent managing nodes
  • 3–6 months on average to develop blockchain infrastructure
  • Up to 3 weeks recovery from network failures

Inconsistency: Nodes on average have issues once every 5 days

  • CPU spikes, memory leaks, disk issues
  • Inconsistent peering
  • Corrupted internal databases
  • Transaction broadcasting issues
  • Frequent bugs + regressions
  • 1 in 6 “stable releases” are broken

Decentralized Cloud: Simply put, cloud computing consists of the delivery of computing services — including servers, storage, databases, networking, software, analytics, and intelligence — over the internet (“the cloud”) to offer faster innovation, flexible resources, and economies of scale. In web 2.0, these services have been highly centralized. While web 3.0 applications have similar needs to those filled by providers such as AWS or Azure, there is a hesitance to rely on any single service provider, and as such, decentralized alternatives have arisen. The three core components of the decentralized cloud are storage, indexing, and computation.

Storage: Per the Ethereum Foundation, decentralized storage systems consist of a peer-to-peer network of user-operators who each hold a portion of the overall data, creating a resilient file storage and sharing system. These can be part of a blockchain-based application or any peer-to-peer network. In contrast to centralized cloud providers, decentralized cloud storage providers utilize distributed infrastructure designed to mitigate undue control or influence. These providers typically also utilize a permissionless structure that enables developers to employ their services with reduced restrictions and typically prevents any centralized control over storage or computation. Conceptually similar to a blockchain, decentralized storage models draw their security from a widely distributed structure. This architecture helps make these systems more resistant to the hackers, attacks, and outages that have plagued large, centralized data centers. Some examples are detailed below (Source: Ethereum Foundation, Gemini, and others).

  • IPFS: The InterPlanetary File System is a protocol and peer-to-peer network for storing and sharing data in a distributed file system. IPFS uses content-addressing to uniquely identify each file in a global namespace connecting all computing devices.
  • Filecoin: Created by the same team responsible for the Interplanetary File System (IPFS), Filecoin is an open-source, cloud-based storage network that aims to improve upon some perceived shortcomings of traditional cloud storage providers. These pain points include insufficient trust, unreliable security, limited connectivity, poor scalability, and heightened dependency on centralized systems.
  • Siacoin: Like Filecoin, Sia is a blockchain network that facilitates decentralized data storage by leveraging users’ excess hard drive space and renting it to those who need it. Sia storage providers are able to monetize their storage space using Sia’s data storage marketplace and can earn rewards in the form of siacoin (SC) — the network’s native utility token.
  • Arweave: Arweave is a protocol that allows users to store data permanently and sustainably with a single upfront fee. The protocol matches people who have hard drive space to spare with individuals and organizations that need to store data or host content permanently, similar to how Uber connects drivers with people who need transport.
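The common thread across these systems is content addressing: a file’s identifier is derived from the file itself. The sketch below shows the core idea in Python — real IPFS CIDs use multihash and base encodings, whereas this toy version just uses a raw SHA-256 hex digest:

```python
import hashlib

def content_address(data: bytes) -> str:
    # The identifier is derived from the content itself, so identical
    # bytes always map to the same address, wherever they are stored
    return hashlib.sha256(data).hexdigest()

# A toy content-addressed store: address -> bytes
store: dict[str, bytes] = {}

def put(data: bytes) -> str:
    addr = content_address(data)
    store[addr] = data
    return addr

def get(addr: str) -> bytes:
    data = store[addr]
    # The retriever can verify integrity without trusting the host
    assert content_address(data) == addr
    return data

addr = put(b"hello, decentralized web")
assert get(addr) == b"hello, decentralized web"
```

Because the address doubles as a checksum, it doesn’t matter which untrusted peer serves the bytes — the requester can always verify them.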

Indexing: Ordinarily, when data is stored on one server or with a single service provider, querying the data is relatively straightforward. Because blockchain nodes are distributed across L1s and different protocols, data can be frustratingly difficult to access and analyze. Many apps need features like relational data, sorting, filtering, full-text search, pagination, and other querying capabilities, which means data must be indexed and organized for efficient retrieval. The inherent properties of blockchains, such as finality, chain reorganizations, and uncle blocks, make retrieving accurate query results difficult. With that said, solutions like The Graph address this pain point. The Graph implements an indexing protocol that promotes accessibility of dApps through public and open APIs, which it calls “subgraphs”. This indexing tool, just like any indexing tool for traditional databases, is capable of locating and retrieving data on Ethereum.
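The underlying idea of indexing is simple: scan the chain once, organize what you find, and answer queries from the organized copy instead of re-scanning every block. A toy Python sketch with entirely made-up event data:

```python
from collections import defaultdict

# A few raw "transfer" events as an indexer might read them,
# block by block, from a chain (all data here is made up)
events = [
    {"block": 100, "sender": "0xaa", "to": "0xbb", "value": 5},
    {"block": 101, "sender": "0xbb", "to": "0xcc", "value": 2},
    {"block": 102, "sender": "0xaa", "to": "0xcc", "value": 7},
]

# Index by sender once, so per-address queries no longer require
# scanning every block
by_sender = defaultdict(list)
for ev in events:
    by_sender[ev["sender"]].append(ev)

sent_by_aa = by_sender["0xaa"]
print(len(sent_by_aa), sum(ev["value"] for ev in sent_by_aa))  # 2 12
```

Services like The Graph do this at scale, with the added complication of handling reorgs by rolling the index back when blocks are replaced.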

Computation: Protection and integrity of data are not guaranteed by any of the major cloud service providers. Traditional cloud infrastructure has several points of failure and trust, including the centralized nature of the Domain Name System (DNS), dependency on one or more centralized cloud providers for service availability (i.e. AWS, Azure), and centralized storage of user data. Those providers don’t put an emphasis on privacy, protection, and integrity of user data, and those concerns notwithstanding, distributing computation across many nodes theoretically achieves a higher degree of fault tolerance (if one or a set of nodes goes down, the network can still service requests with minimal disruption to performance). Decentralized computing involves critical application services being carried out by individual devices or nodes on a distributed network, with no central location; generally, one should not be able to point to a single service address and disable it to shut down core application functionality for any or all users. For example, Dfinity provides a platform that enables completely decentralized smart contract execution.

One final note: a few projects provide all three of these services (e.g. Aleph or Akash).

Middleware

The next layer is the middleware/interaction layer, i.e. where developers and users read and write data to the blockchain, or where outside programs transmit data to the blockchain and between blockchains (i.e. oracles, etc.). In web 2.0, applications are heavy consumers of data, and acquiring that data is straightforward compared to acquiring data in the distributed, opaque world of web 3.0.

Data access: Current blockchain designs are highly inefficient (partly by design). They are bound by the trilemma whereby higher throughput necessarily means lower security and/or decentralization (see below):

Source: Definitely not me in MS paint

Modular architectures that specialize and split into discrete execution layers (e.g. rollups, volitions), a security/consensus layer, and data availability layers (e.g. data shards) bring a massive increase in efficiency to the blockchain industry. For example, Celestia provides a pluggable consensus layer, allowing developers to deploy their own execution layers to run on top. This enables more customizability and sovereignty for applications built on the platform.

Oracles: Oracles are entities that provide an interface between blockchains and external data sources, allowing smart contracts to utilize real-world inputs and outputs. Oracles provide a way for blockchain systems to access existing data sources, legacy systems, and advanced computations. Furthermore, decentralized oracle networks (DONs) enable the creation of hybrid smart contracts, where on-chain code and off-chain infrastructure are combined to support advanced dApps that react to real-world events and interoperate with traditional systems. Oracles provide financial data to power DeFi applications (for example, MakerDAO utilizes an oracle module to determine the real-time price of assets). As with node infrastructure providers, centralized oracles can become just as compromised and susceptible to manipulation as any other third-party service provider. For this reason, many blockchain projects — including Chainlink (LINK), Band Protocol (BAND), Augur (REP), and MakerDAO (builders of DAI) — are developing (or have developed) decentralized oracles, or even alternatives to the traditional model of access entirely, such as API3.
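One common defense against a bad feed in a decentralized oracle network is to aggregate many independent reports and take the median, so a minority of faulty or manipulated reporters cannot move the final answer. A toy sketch with made-up price data:

```python
from statistics import median

# Prices reported by several independent (entirely hypothetical) feeds
reports = {
    "feed_1": 1802.10,
    "feed_2": 1801.95,
    "feed_3": 1802.30,
    "feed_4": 9999.00,  # a faulty or manipulated feed
}

def aggregate_price(reports: dict) -> float:
    # The median ignores a minority of outliers, so one bad feed
    # cannot drag the final answer arbitrarily far
    return median(reports.values())

print(aggregate_price(reports))  # close to 1802.2, despite the outlier
```

Real DONs layer reputation, staking, and cryptographic attestation on top of this basic aggregation idea.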

Block explorers: Block explorers enable you to search for information on a particular blockchain. Block explorers provide an online interface for searching a blockchain and enable users to retrieve data about transactions, addresses, blocks, fees, and more. Different block explorers provide data about different blockchains, and the type of information included will vary depending on the architecture of said blockchain. Some examples of popular block explorers are blockchain.com, Etherscan, and Blockscout.

Off-chain data: In addition to decentralized storage and on-chain storage, there may be a need to store data off-chain in a decentralized fashion. Storing data on-chain can be highly inefficient (i.e. replicating data thousands of times across the globe on a blockchain network isn’t needed for all data and would be prohibitively expensive in most cases). Given that, a common practice is to move data “off-chain” when possible and only reference it on-chain. A few key players are Ceramic Network, Textile ThreadDb, and Pinata (focused mostly on NFTs).
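The usual pattern is to keep the bulk of the data off-chain and store only a cryptographic fingerprint on-chain, which lets anyone verify the off-chain copy later. A minimal sketch (the metadata here is invented for illustration):

```python
import hashlib
import json

# A large payload kept off-chain (e.g. NFT metadata)
metadata = {"name": "Example NFT", "description": "stored off-chain"}
off_chain_blob = json.dumps(metadata, sort_keys=True).encode()

# Only the fingerprint goes on-chain
on_chain_reference = hashlib.sha256(off_chain_blob).hexdigest()

# Later, anyone who fetches the blob can verify it against the reference
assert hashlib.sha256(off_chain_blob).hexdigest() == on_chain_reference

# A tampered blob fails verification
tampered = off_chain_blob + b"!"
assert hashlib.sha256(tampered).hexdigest() != on_chain_reference
```

This way the chain only pays to replicate 32 bytes instead of the full payload, while keeping the payload tamper-evident.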

Messaging / Cross-Chain Interoperability: As the number and usage of L1 and L2 protocols grow, the blockchain ecosystem becomes increasingly siloed. Across L1 protocols, developers are building cross-chain bridges to break down these silos and connect users across web 3.0, providing connectivity, low transaction fees, and low latency. Bridges allow data, information, and even tokens to be shared across disparate blockchains. Bridges can generally be separated into three categories.

First, L1 bi-directional bridges, such as the Rainbow Bridge from Near, Wormhole from Solana, or the Gravity Bridge from Cosmos. These bridges connect native assets across disparate chains, with the goal of bringing increased liquidity onto the network by enabling token holders to transfer assets across chains to be used in various circumstances and applications.

Second, we have wrapped bridges (for example, deBridge or BiFrost). In contrast to bi-directional bridges, wrapped bridges connect to other chains, wrap the original asset's value in a native mechanism, and broadcast it onto the other chain (effectively holding assets on one side as collateral for equal value minted on the other). Unlike bi-directional bridges, they can connect with any chain, and these types of bridges use oracles and their own security model, separate from the blockchains themselves.
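The lock-and-mint mechanic behind wrapped bridges can be sketched in a few lines. This is a deliberately simplified toy (real bridges involve validators, oracles, and signed proofs; the class and method names here are my own invention), but it shows the core invariant: wrapped supply on the destination chain always equals collateral locked on the source chain.

```python
class WrappedBridge:
    """Toy lock-and-mint bridge: assets locked on chain A back
    wrapped tokens minted on chain B; burning on B releases on A."""

    def __init__(self):
        self.locked_on_a = 0      # collateral held by the bridge on chain A
        self.wrapped_on_b = {}    # wrapped-token balances on chain B

    def deposit(self, user, amount):
        # Lock the native asset on chain A...
        self.locked_on_a += amount
        # ...and mint an equal amount of the wrapped asset on chain B.
        self.wrapped_on_b[user] = self.wrapped_on_b.get(user, 0) + amount

    def withdraw(self, user, amount):
        # Burn wrapped tokens on chain B and release collateral on chain A.
        assert self.wrapped_on_b.get(user, 0) >= amount, "insufficient balance"
        self.wrapped_on_b[user] -= amount
        self.locked_on_a -= amount

bridge = WrappedBridge()
bridge.deposit("alice", 10)
assert bridge.locked_on_a == bridge.wrapped_on_b["alice"] == 10  # fully backed
bridge.withdraw("alice", 4)
print(bridge.locked_on_a, bridge.wrapped_on_b["alice"])  # 6 6
```

When that invariant breaks (e.g., an attacker mints wrapped tokens without locking collateral), you get exactly the class of bridge exploits discussed in the security section below.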

Third and last are cross-chain liquidity bridges, where there is no pre-existing liquidity on either side of the bridge, and it is the bridge's responsibility to provide that liquidity. They can bridge any protocol and provide both incentives and utility through supporting applications. Bridges like Axelar and Connext offer different levels of application and use cases across different networks.

Development

To reiterate a previous point, the blockchain ecosystem is vast, and products and businesses that provide resources to developers consist of only a small part of that. However, this small part is absolutely critical to the widespread adoption of blockchain technology. Generally speaking, developer tooling can be broken into the following categories. While this could likely be simplified somewhat, the below categories give a fairly robust overview of the current tooling resources available to most developers.

Smart contract development: Smart contract development tools consist of programming languages (for example Solidity and Vyper for EVM-compatible chains, or Rust for chains like Solana and Terra), development frameworks (a solid summary and ranking can be found here), and IDEs.

Programming languages: Likely the most fundamental tool for any smart contract developer. For example, Solidity and Vyper are used for EVM-compatible chains, while Rust is used for chains like Solana, Polkadot, and Terra. Additionally, the usual suspects such as C++ and Java are heavily used.

Development frameworks: Smart contract development frameworks make developers’ lives easier by allowing them to deploy and test their smart contracts. Some popular frameworks are described below:

Hardhat: Hardhat is a development environment to compile, deploy, test, and debug Ethereum software. It helps developers manage and automate the recurring tasks that are inherent to the process of building smart contracts and dApps, along with providing more functionality around this workflow (i.e. compiling, running, and testing smart contracts).

Truffle: Truffle is a development environment (providing a command-line tool to compile, deploy, test, and build), framework (providing various packages to make it easy to write tests, deployment code, build clients, and so on), and asset pipeline (publishing packages and using packages published by others) to build Ethereum-based dApps.

Brownie: Brownie is a Python-based development and testing framework for smart contracts targeting the Ethereum Virtual Machine. It includes full support for Solidity and Vyper; contract testing via pytest, including trace-based coverage evaluation; property-based and stateful testing via hypothesis; powerful debugging tools, including Python-style tracebacks and custom error strings; a built-in console for quick project interaction; and support for ethPM packages.

Embark: Embark is a fast and easy-to-use developer environment used to build and deploy dApps. It integrates with EVM blockchains, decentralized storage protocols like IPFS and Swarm, and decentralized communication platforms like Whisper.

Other alternative (and perhaps less popular) frameworks include Waffle (JS), Dapp.Tools (Haskell/CLI), SBT (Scala/CLI), and Epirus (Java).

IDEs: An integrated development environment (IDE) is software for building applications that combines common developer tools into a single graphical user interface (GUI). An IDE typically consists of three components. 1) Source code editor: a text editor that assists in writing software code with features such as syntax highlighting with visual cues, language-specific auto-completion, and checking for bugs as code is being written. 2) Local build automation: utilities that automate simple, repeatable tasks as part of creating a local build of the software for use by the developer, like compiling source code into binary code, packaging binary code, and running automated tests. 3) Debugger: a program for testing other programs that can graphically display the location of a bug in the original code. A brief list of EVM IDEs can be seen here. Non-EVM IDEs include Alon (Solana), Substrate (Polkadot), and Cosmwasm (Cosmos).

Blockchain network testing: Blockchain testing is the systematic evaluation of the blockchain’s various functional components (e.g., smart contracts). Unlike traditional software testing, blockchain testing involves several components such as blocks, mining, transactions, wallets, and so on, all of which require special tools to test, some of which are listed here.

Frontend / Backend Tooling: Various tools that make it easier to develop applications and interact with local or remote nodes. For example, the web3 JavaScript library interacts with the Ethereum blockchain: it can retrieve user accounts, send transactions, interact with smart contracts, and more. Some tools for Ethereum are listed here; other (non-Ethereum) tools include ecosystem-specific tools such as Anchor IDL for Solana smart contracts or ink! for Parity contracts.

Other tools: Myriad other tools exist, including no-code tools like Atra, compilers, gas price watchers, other smart contract libraries and upgradeability tools (such as OpenZeppelin), and DAO tools like Aragon.

Smart Contract Security: While web 3.0 security is superior to classic apps in theory (due to its inherent resistance to malware, injection, DoS attacks, etc.), in practice, blockchain has seen a massive amount of value leave its system as a result of security lapses. Security in blockchain is relatively underdeveloped, yet it is critical to the broader adoption of blockchain going forward.

On the DeFi side, 2021 saw more than $610M stolen through exploits (up from $77M in 2020). Furthermore, $704M in funds were stolen and then later returned by white-hat hackers, like those behind the $600M Poly Network exploit. While some consider these events inevitable growing pains, they do highlight several major vulnerabilities in the technical infrastructure powering DeFi, which may limit DeFi's potential to capture more users and use-cases. Going forward, it's hard to imagine a world where security is not a massive focus for new projects, and white space exists around smart contract auditing, precise runtime monitoring, and consumer protections. While blockchain security is a massive topic worthy of textbooks in and of itself, the below breaks secure development into some high-level areas.

Development & code review/auditing: Myriad vetted smart contract libraries and other resources for developers exist, and companies like Tenderly provide security-first development platforms. Existing frameworks like Waffle and Truffle provide unit testing and other services. Additional resources exist as well, including simulation tools (Chaos Labs and Gauntlet leverage scenario-based simulations to secure blockchains and protocols) and test networks (representations of mainnets for testing in a production-like environment, such as Rinkeby, Kovan, and Ropsten for Ethereum). On the auditing side, many projects will engage third-party auditors to check and validate each line of code. Some of the major players here are Trail of Bits, Certik, OpenZeppelin, and Quantstamp. It is worth noting that demand for audits is incredibly high, and wait times run into months.

Runtime analytics/verification: Runtime analytics involves extracting information from a running system and using it to detect, and possibly react to, observed behaviors satisfying or violating certain properties. Companies in the space include Tenderly, Certora, and Runtime Verification.

Bug bounty/penetration testing: Bug bounty platforms such as Immunefi, HackenProof, or HackerOne offer monetary rewards to community members and white-hat hackers who report and help resolve security issues.

Additionally, the following resources provide deeper detail:

  • SlowMist provides a detailed 2021 blockchain security industry review
  • I_am Prime provides a good overview of security threats categorized into three major areas
  • Natalie Marie provides an awesome exploration of the current state of blockchain security

Layer 1/2

It’s my hope that if you’ve made it this far, you are aware of what layer 1 refers to, but if not: a layer 1 protocol is a network that acts as infrastructure for other applications, protocols, and networks to build on top of. A public, decentralized layer 1 network’s primary characteristic is its consensus mechanism, and different consensus mechanisms provide different levels of speed, security, and throughput. The Block released a detailed report which goes into the L1 network ecosystem in depth; generally speaking, however, one can look at L1s in two major groups: EVM-compatible and non-EVM-compatible networks.

Historically, Ethereum has been the primary platform for web 3.0 development, where its virtual environment (i.e. its Ethereum Virtual Machine, or EVM) stores key information like accounts and balances. The EVM also stores a machine state, which can change with each new block according to a set of predefined rules laid out by the EVM. Most importantly for developers, the EVM provides a framework for the storage and execution of smart contracts, which allows developers to program on-chain logic. Some examples of EVM-compatible blockchains include Ethereum (obviously), Avalanche, and Tron. Non-EVM-compatible chains include Solana, Cosmos, Flow, and NEAR.

Layer 2 / Scaling Rollups

While from an infrastructure stack perspective there is a high degree of inter-relation between L1s and L2s/scaling solutions, I’ve broken them apart to provide a bit more granularity into my own view of the ecosystem. Scalability is a material concern in Ethereum development: a scalable blockchain can handle thousands of transactions simultaneously, ideally without charging prohibitively high transaction fees.

Ethereum’s L2 Scaling techniques can be seen graphically below (coming from a fantastic primer by Token Terminal)

Source: Token Terminal

Due to the high number of dApps and use-cases built on top of the Ethereum network, we often experience a high level of network activity, causing congestion and the extremely high gas fees we’ve unfortunately become used to. Ergo, the Ethereum blockchain needs scaling solutions. It’s worth noting that ETH 2.0 (i.e. the coming shift to PoS) is an L1 scaling solution. (Also, I know that we aren’t supposed to call it 2.0 anymore, but sue me.)

Source: Blockchain-Comparison

Layer 2 solutions seek to improve scalability on top of the Ethereum blockchain, not internally. L2s can generally be grouped into sidechains, channels, and rollups.

Sidechains: A sidechain is a separate blockchain linked to another blockchain, referred to as the main chain, via a two-way peg. The two-way peg enables the interchangeability of assets at a predetermined rate between the main chain and the sidechain.

The first benefit of sidechains is a faster main chain, as transactions can take place on the sidechains instead. If developers are dissatisfied with the costs and transaction speed of the main chain, they can deploy their dApp on one of the sidechains. Second, sidechains allow new and potentially unstable software to be deployed and tested; if the software has functional or security issues, the damage is contained within the sidechain. Third, sidechains can lessen the burden on the main chain by storing data and processing transactions, maintaining the integrity of the main chain while keeping it smaller and faster.

The below diagram details this relationship at a high level:

Source: Moralis

For example, Polygon is an L2 scaling solution and framework for building EVM-compatible blockchains. Its native token, MATIC, is used for governance, staking, and paying gas fees. Polygon uses a proof-of-stake (PoS) consensus mechanism.

Channels: Channels are a technique for performing transactions and other state transitions in a second layer built on top of a blockchain. State channels make blockchains more efficient by moving many processes off-chain while still retaining a blockchain’s characteristic trustworthiness. Channels reduce the load and transaction cost on L1 by allowing participants to transact many times off-chain (L2) while submitting only two transactions to the network on-chain (L1). This works via a two-transaction workflow: 1) the first transaction opens the channel, with participants locking a portion of Ethereum’s state into a multisig contract; 2) the second transaction closes the channel: when the participants are finished, a final on-chain transaction is submitted and unlocks the state.
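The economics of that two-transaction workflow are easy to see in a toy model. The sketch below (my own illustration; real channels like Raiden use signed balance proofs and dispute windows rather than a shared Python object) shows fifty payments settling with exactly two on-chain transactions.

```python
class PaymentChannel:
    """Toy payment channel: two on-chain transactions bracket
    any number of free off-chain balance updates."""

    def __init__(self, deposit_a, deposit_b):
        # On-chain tx #1: both parties lock funds in a multisig contract.
        self.balances = {"a": deposit_a, "b": deposit_b}
        self.on_chain_txs = 1
        self.open = True

    def pay(self, sender, receiver, amount):
        # Off-chain update: both parties co-sign new balances;
        # nothing touches the L1 chain, so no gas is paid.
        assert self.open and self.balances[sender] >= amount
        self.balances[sender] -= amount
        self.balances[receiver] += amount

    def close(self):
        # On-chain tx #2: the final state is submitted and funds unlock.
        self.open = False
        self.on_chain_txs += 1
        return self.balances

channel = PaymentChannel(deposit_a=100, deposit_b=100)
for _ in range(50):            # fifty payments, zero on-chain transactions
    channel.pay("a", "b", 1)
print(channel.close(), channel.on_chain_txs)  # {'a': 50, 'b': 150} 2
```

The cost per payment therefore shrinks as the channel stays open longer, which is exactly why channels suit high-frequency, low-value interactions between a fixed set of parties.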

The channels workflow can deal with payments (payment channels) or general state updates and computations (state channels).

Channels are a great solution for building instant withdrawal/settling on the mainnet. Moreover, it leads to high throughput and extremely low costs. On the other hand, time and cost to set up and settle channels can be high, and long exit times are expected if members do not reach a valid exit state.

For example, Raiden Network is an off-chain transfer network for Ethereum ERC20 tokens. It provides a fast, scalable, and cheap alternative to on-chain token transfers. At the same time, the network transfers provide users with guarantees of finality, security, and decentralization similar to those known from blockchains.

Rollups: A simple definition of rollups: solutions where transaction data is put on-chain, but other things like transaction processing and state storage may happen off-chain. The name references the fact that they “roll up” transactions and fit them into a single block. In a rollup, calls to smart contracts and their arguments are written on-chain as calldata, but the actual computation and storage of the contract are done off-chain. Per Vitalik, “The result is a system where scalability is still limited by the data bandwidth of the underlying blockchain, but at a very favorable ratio”. The benefits are generally higher TPS, lower transaction fees, faster transaction confirmations, and better security (as rollups rely on the security of the Ethereum blockchain, not their own). With that said, rollups pose some challenges: composing transactions across multiple protocols, diminished liquidity, and higher fees compared to sidechains.

Rollups are typically broken into Optimistic Rollups and ZK-Rollups.

Optimistic rollups: Optimistic rollups assume that the data submitted to the Ethereum network is correct and valid, hence the name. The goal of optimistic rollups is to decrease latency, currently limited by Ethereum’s block time of ~13 seconds, and increase transaction throughput, thereby reducing gas fees. With optimistic rollups, the actual computation and storage of the contract are done off-chain. Because the rollup doesn’t actually verify each transaction’s computation up front, there needs to be a mechanism to prevent fraudulent or invalid transactions. Whenever there is an invalid transaction, there is a dispute resolution process. A party submits a batch of transaction data to Ethereum, and whenever someone detects an invalid submission, they can submit a “fraud proof” against that transaction. In that event, both parties, the one submitting the data and the one submitting the fraud proof, have ETH staked, so the party in the wrong loses their ETH. Whenever a fraud proof is submitted, the suspicious transaction is executed again, this time on the main Ethereum network. To ensure the transaction is replayed with the exact state from when it was originally performed on the rollup chain, a “manager” contract is created that replaces certain function calls with state from the rollup.
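The dispute game can be sketched concretely. In this toy model (my own simplification; real systems like Arbitrum use interactive, multi-round fraud proofs rather than full re-execution of a whole batch), the chain only re-executes a batch when someone challenges it, and the re-execution decides who gets slashed.

```python
def execute(state, txs):
    """Deterministically apply simple transfers to an account-balance state."""
    state = dict(state)
    for sender, receiver, amount in txs:
        state[sender] -= amount
        state[receiver] = state.get(receiver, 0) + amount
    return state

def challenge(pre_state, txs, claimed_post_state):
    """Fraud-proof game: re-execute the batch on-chain; the party
    in the wrong forfeits its staked ETH."""
    actual = execute(pre_state, txs)
    if actual != claimed_post_state:
        return "sequencer slashed", actual        # fraud proven: roll back
    return "challenger slashed", claimed_post_state

pre = {"alice": 10, "bob": 0}
txs = [("alice", "bob", 4)]
# A dishonest sequencer claims bob received 5 instead of 4:
result, state = challenge(pre, txs, {"alice": 6, "bob": 5})
print(result, state)  # sequencer slashed {'alice': 6, 'bob': 4}
```

Note the asymmetry this buys: the expensive on-chain re-execution only happens in the rare dispute case, so the happy path stays cheap, at the cost of a challenge window that delays finality for withdrawals.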

Examples: Optimism, Arbitrum.

zk-rollups: zk-rollups, or zero-knowledge rollups, unlike optimistic rollups, do not have a native dispute resolution mechanism. Instead, they utilize a piece of cryptography called zero-knowledge proofs, which enable one party to prove to another that they know something without having to convey the information itself.

Two of the most compelling zero-knowledge technologies in the market today are zk-STARKs and zk-SNARKs. Both are acronyms for the method by which the two parties prove their knowledge: zk-STARK stands for zero-knowledge scalable transparent argument of knowledge, and zk-SNARK stands for zero-knowledge succinct non-interactive argument of knowledge. While a detailed discussion of the differences is beyond the scope of this piece, a breakdown by Consensys can be found here. It’s worth noting that zk-STARKs were created as an alternative to zk-SNARK proofs, removing the trusted setup and scaling better for large computations, though zk-SNARKs’ smaller proof sizes keep them in wide use today.

In either model, every batch of transactions submitted to Ethereum includes a cryptographic proof verified by a contract deployed on the Ethereum main network. This contract maintains the state of all transfers on the rollup chain, and this state can be updated only with a validity proof. This means that only the validity proof needs to be stored on the main Ethereum network instead of bulky transaction data, making zk-rollups quicker and cheaper compared to executing on the main network.

Examples: Loopring, StarkWare, zkSync.

Sharding: While strictly speaking sharding isn’t a layer 2 scaling solution, it’s worth touching on briefly. On its own, every Ethereum transaction must be processed by all the nodes in the network. Because each node must independently verify every transaction, processing time can run extremely long. Sharding avoids this by relying on multiple networked machines that distribute the computational burden. By dividing the network state into smaller sections or partitions called “shards”, each running a smaller-scale consensus protocol, transactions can be processed in parallel. Rather than having all transactions computed on the base layer, a network of shards processes transactions simultaneously and communicates with the root chain (base layer). A contract on the main chain then coordinates validation for the separate shards. In short, this process of partitioning and parallel processing drastically reduces transaction times. Celestia provides modular blockchain infrastructure utilizing this concept.
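The partition-then-parallelize idea is general enough to demo outside a blockchain context. Here's a minimal sketch (pure illustration, no real sharding protocol; `process_shard` is a hypothetical stand-in for a shard executing its slice of transactions) showing work split across shards, processed in parallel, and aggregated by a "root chain":

```python
from concurrent.futures import ThreadPoolExecutor

def process_shard(txs):
    """Each shard runs its own consensus and executes only its own
    transactions; here a simple sum stands in for real execution."""
    return sum(txs)

# Partition ("shard") the pending transactions, e.g. by account prefix.
all_txs = list(range(1, 101))
num_shards = 4
shards = [all_txs[i::num_shards] for i in range(num_shards)]

# Shards execute in parallel; no shard sees the others' transactions.
with ThreadPoolExecutor(max_workers=num_shards) as pool:
    shard_results = list(pool.map(process_shard, shards))

# The root chain aggregates and validates the shard results.
root_chain_state = sum(shard_results)
print(root_chain_state)  # 5050, the same result, computed in parallel
```

The hard part in real sharded blockchains, which this sketch deliberately ignores, is cross-shard communication: a transaction touching accounts on two shards breaks the clean partition and needs extra coordination through the root chain.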

Access Layer

Phew, almost at the top of the stack! We aren’t going to spend a massive amount of time here, but the access layer consists of the applications and projects that end-users use to interact with blockchains, for example, wallets like Metamask, Coinbase Wallet, or Rainbow. Additionally, this layer includes use-case-specific applications like NFT marketplaces (Opensea, Rarify, etc.), DeFi projects (Uniswap, Aave, etc.), and identity services (ENS, and so on).

Fin

In closing, blockchain has come a long way since the days of cypherpunk programmers and use cases confined to purchasing illicit substances. From my parents asking me to explain Ethereum to billion-dollar businesses being built virtually overnight, it’s clear that blockchain is going to be a material part of the world of technology going forward. However, it has a long way to go still, with infrastructural needs that have yet to be met for financial institutions, traditional enterprises, programmers, and retail consumers. To shamelessly steal from The Block, “if data is the new oil, and blockchain is the foundation for the new internet, then the companies laying the infrastructure and establishing processes to procure and refine digital assets data are uniquely positioned to capitalize on the continued expansion of the next web: the value of which is priceless.”

On the note of sources, I’ve done my best to source proactively throughout the document, however, if you feel that there is anything I’ve gotten wrong or would like to find the source for any particular piece of information that isn’t linked directly, please feel free to reach out to me on one of the below channels. Also, if you want to shamelessly steal the market map source file to use in a presentation, etc., please let me know and I can send it to you directly.

And last but not least, thank you so much for reading! It’s my hope that this was to some extent informative and entertaining, and if you’d like to further double click on any one topic, my DMs are always open.

Twitter: @dhafford2

Email: dave@morpheus.com / davehafford@gmail.com

Telegram: @dhafford

Discord: Airless#2604

--

--

Dave Hafford

Dave is an investor at Thomvest Ventures, focused on opportunities within the fintech vertical. Prior to that, he was a full-time MBA at Wharton.