Validating Consensus: Ethereum vs. Bitcoin

Published in

Musings by Arvinda

10 min read · Sep 15, 2019

Understanding what makes a full node

There’s been a lot of back-and-forth recently on what exactly makes up a “full node” in Ethereum. Criticisms leveled come mostly from folks familiar with Bitcoin, where there is a very clear idea of how consensus gets built and the types of nodes that allow you to trustlessly participate in this consensus.

In Bitcoin, the only way to be guaranteed that you have followed the maxim of “don’t trust, verify” is to bootstrap your own node from scratch and to verify every single transaction that has ever been sent on the Bitcoin blockchain. These transactions are sourced from special nodes on the network called “full nodes” that maintain a copy of every single Bitcoin transaction and serve these up to peers seeking to validate Bitcoin’s single history and build their own fully validating node instance. No trust is required since the only goal of a node is to build and validate the valid chain with the most accumulated proof of work (usually referred to as the “longest chain”) that it can see on the network.

From this perspective, the minimum type of node required for rebuilding state and maintaining consensus is one that holds every single Bitcoin transaction in some database, namely a full node (or full archival node).

In Bitcoin, a full node is at the same time “the fullest” node a person is able to run and “the minimum node required” for maintaining trustless consensus. Its size is treated as synonymous with the size of the Bitcoin blockchain.

There are also other less intensive ways to run a node. These are seen as valid compromises that serve a certain purpose but that by themselves cannot guarantee trustless participation in the network’s consensus. Examples of these node types include:

Pruned nodes, where every transaction is still verified, but older block data is discarded after verification and only the block headers and the current state are kept. These nodes go through the entire verification process, but cannot help to bootstrap new participants to the network.
SPV nodes, where only block headers are verified, and partial blocks are received only if a relevant transaction is present in that block. These nodes rely entirely on other nodes in the network to maintain correct consensus.
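To make the SPV case a little more concrete, here is a minimal Python sketch of how a light client could check that a transaction belongs to a block using only the Merkle root from the block header. The function names and the toy four-transaction block are assumptions made for illustration, and the sketch ignores the byte-order conventions and network messages that real Bitcoin clients use.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def verify_merkle_branch(tx_hash, branch, merkle_root):
    """Fold the sibling hashes in `branch` up to the root.

    `branch` is a list of (sibling_hash, sibling_is_left) pairs,
    ordered from the leaf level up to the level just below the root.
    """
    current = tx_hash
    for sibling, sibling_is_left in branch:
        pair = sibling + current if sibling_is_left else current + sibling
        current = dsha256(pair)
    return current == merkle_root

# A toy block with four transactions (hypothetical data).
leaves = [dsha256(tx) for tx in (b"tx-a", b"tx-b", b"tx-c", b"tx-d")]
left = dsha256(leaves[0] + leaves[1])
right = dsha256(leaves[2] + leaves[3])
root = dsha256(left + right)  # this is what the block header would commit to

# Proof for tx-c: its sibling tx-d sits on the right, then `left` sits on the left.
proof_for_c = [(leaves[3], False), (left, True)]
print(verify_merkle_branch(leaves[2], proof_for_c, root))
```

The client never sees tx-a or tx-b here. It trusts that some other node supplied an honest header chain, which is exactly the compromise described above.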

The contention

It is fairly straightforward to understand what exactly makes a node a “full node” within Bitcoin, but this distinction becomes a lot less certain when looking at Ethereum.

*3 node types in Ethereum: Archive, Full/Fast, Light (2017, source)*

The heuristic usually used for determining what is a fully validating node is to simply look for the fullest node one can run on the Ethereum network. This node setup is assumed to be the one that’s the minimum required for trustlessly maintaining consensus.

Within Ethereum however, there is an intermediate level node type that also does the job of maintaining trustless consensus according to the standards expected by the Bitcoin blockchain.

To properly understand this though, we would first need to understand exactly how consensus works and how it is maintained in both chains.

Understanding Consensus

The uncertainty we’ve been hinting at so far is primarily related to maintaining consensus, but what exactly is it within the blockchain that makes up this consensus?

Before we move on it would be useful to loosely define what “consensus” means. For the purposes of this piece, it should be enough to think of consensus as “the state of things that the majority of people within the system choose to accept as correct”.

In Bitcoin, consensus can be independently verified solely by replaying every single Bitcoin transaction that ever was. This verification is tracked as a set of hashes that can be quickly checked at any block height (the number of blocks that precede a given block in the chain). Specifically, each block header stores two hashes: a hash representing all the transactions in the current block (the Merkle root), and a hash pointer to the previous block that serves as a digest of every transaction in all the blocks before the current one.
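The way these two hashes tie everything together can be sketched in a few lines of Python. This is a simplified model, not Bitcoin's actual header format: the `Header` class, `build_headers` and `verify_chain` are hypothetical names, and real headers also carry a version, timestamp, difficulty target and nonce that are left out here.

```python
import hashlib
from dataclasses import dataclass

GENESIS_PREV = b"\x00" * 32  # placeholder "previous hash" for the first block

def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def compute_merkle_root(tx_hashes):
    """Pair hashes level by level; an odd level duplicates its last hash."""
    if not tx_hashes:
        raise ValueError("a block needs at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

@dataclass(frozen=True)
class Header:
    prev_hash: bytes  # hash pointer to the previous block header
    tx_root: bytes    # Merkle root of this block's transactions

    def block_hash(self) -> bytes:
        return dsha256(self.prev_hash + self.tx_root)

def build_headers(blocks):
    headers, prev = [], GENESIS_PREV
    for txs in blocks:
        header = Header(prev, compute_merkle_root([dsha256(tx) for tx in txs]))
        headers.append(header)
        prev = header.block_hash()
    return headers

def verify_chain(headers, blocks):
    """Recompute every link and every Merkle root from the raw transactions."""
    if len(headers) != len(blocks):
        return False
    expected_prev = GENESIS_PREV
    for header, txs in zip(headers, blocks):
        if header.prev_hash != expected_prev:
            return False
        if header.tx_root != compute_merkle_root([dsha256(tx) for tx in txs]):
            return False
        expected_prev = header.block_hash()
    return True

blocks = [
    [b"coinbase-0"],
    [b"coinbase-1", b"alice->bob"],
    [b"coinbase-2", b"bob->carol", b"carol->dave"],
]
headers = build_headers(blocks)

tampered = [list(txs) for txs in blocks]
tampered[1][1] = b"alice->mallory"

print(verify_chain(headers, blocks))    # untouched history
print(verify_chain(headers, tampered))  # one altered transaction
```

Changing a single transaction changes that block's Merkle root, so the check against the stored header fails. That is why the last header can act as a digest for the whole history.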

To trustlessly build Bitcoin’s consensus, the only things you’d need are a copy of all Bitcoin transactions and block headers for all blocks created, and these are exactly the things that are stored in a Bitcoin Full Node.

State

Transactions by themselves aren’t useful for a node. Rather, to accurately know which transactions are valid and what the current balances in various addresses are, a node must also track “the state” of each transaction. Specifically, the node must track whether a transaction has been consumed by a later transaction, which in effect invalidates the consumed transaction and transfers its balance to the latest unspent transaction in its chain.

“The only transactions that can be validly spent are transactions that have not already been spent by later transactions”.

This in a nutshell is how “state” works in the Bitcoin blockchain.

The state of a transaction is represented as a binary condition of whether it has been spent or not.
The state of any given address is represented by all the unspent transactions under its control. Adding the balance represented by each unspent transaction gives the total balance under control at any particular address.
The state of the entire system is represented by all the unspent transactions that exist at any given block height (any given time). This is what is referred to as the UTXO set in Bitcoin, where UTXO stands for “(U)nspent (T)ransaction (O)utput”.
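The three levels of state described above can be sketched in Python as follows. This is a toy model under stated assumptions: the `Tx` class, `apply_tx` and `balance` are hypothetical names, signatures and scripts are ignored entirely, and a transaction with no inputs is treated as a coinbase that creates new coins.

```python
from dataclasses import dataclass

@dataclass
class Tx:
    txid: str
    inputs: list   # (txid, output_index) pairs being spent
    outputs: list  # (address, amount) pairs being created

def apply_tx(utxos: dict, tx: Tx) -> None:
    """Update the UTXO set in place, rejecting double spends and overspends."""
    if len(set(tx.inputs)) != len(tx.inputs):
        raise ValueError(f"{tx.txid} lists the same output twice")
    for outpoint in tx.inputs:
        if outpoint not in utxos:
            raise ValueError(f"{tx.txid} spends a missing or already spent output {outpoint}")
    spent = sum(utxos[outpoint][1] for outpoint in tx.inputs)
    created = sum(amount for _, amount in tx.outputs)
    if tx.inputs and created > spent:
        raise ValueError(f"{tx.txid} creates more value than it spends")
    for outpoint in tx.inputs:
        del utxos[outpoint]                       # spent: no longer part of the state
    for index, (address, amount) in enumerate(tx.outputs):
        utxos[(tx.txid, index)] = (address, amount)  # unspent: part of the state

def balance(utxos: dict, address: str) -> int:
    """An address balance is the sum of the unspent outputs it controls."""
    return sum(amount for owner, amount in utxos.values() if owner == address)

utxos = {}
apply_tx(utxos, Tx("cb1", [], [("alice", 50)]))
apply_tx(utxos, Tx("tx1", [("cb1", 0)], [("bob", 30), ("alice", 20)]))

print(balance(utxos, "alice"), balance(utxos, "bob"))
```

After both transactions are applied, the original 50-coin output is gone from the set. Any later transaction that tries to spend `("cb1", 0)` again is rejected, which is the “binary condition” in action.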

This state is not explicitly tracked within the blockchain’s database structure. Rather, it is built up separately by each node as the node goes through its initial blockchain re-verification process. It is then stored in a separate `chainstate.db` database locally by the node and serves as a pre-processed reference of the current state of the system.

It’s important to note too that in Bitcoin this state isn’t referenced in any way as a part of the consensus process. Rather, it is generated by a parallel process that runs off-chain and uses only the data included in blocks.

This is an important distinction that we will come across again when looking at how consensus and state are handled within the Ethereum blockchain.

Consensus in Ethereum

Using what we now know about consensus and the Bitcoin blockchain, we can draw some interesting parallels to how consensus is maintained within Ethereum’s block architecture.

Ethereum uses the same basic concepts that exist in Bitcoin for achieving decentralized consensus on an open network. Namely, there are transactions that can update the state of things in the chain, transactions are collected and placed into blocks at certain intervals, and these blocks are then chained together so that all transactions at any point in time can be represented by a simple hash just as in Bitcoin.

Where things diverge is in the extra things that have been added in Ethereum to facilitate its primary goal of being “a more flexible” system that allows for the definition and execution of more complex Turing-complete operations within the system. Naturally, these things come with certain trade-offs that are outside the scope of this piece (an interesting take here).

Block Architecture: Ethereum vs. Bitcoin

To figure out where exactly the differences lie between these two systems, the easiest place to look is to the actual things that get stored within blocks that a node must maintain. Specifically, we can look to the different data structures that come with Ethereum’s blockchain and the digested hash representations of these structures that end up in Ethereum’s block headers.

The most glaring difference that jumps out when looking at these is the way in which Ethereum handles “state” when compared to Bitcoin.

State in Ethereum

Much like in Bitcoin, the state in Ethereum is a representation of the results of processing all prior transactions up to that point in the blockchain. Again, very simplistically, the only valid transactions are ones that have not been consumed by later transactions in the chain. In Bitcoin, the only property of an unspent transaction that matters for “state” is the number of bitcoins available for future spending within it. These can be easily tracked and worked with in the `chainstate.db` structure that Bitcoin uses, but in Ethereum things get a bit more complicated.

For Ethereum’s state, it isn’t only the token balances that are tracked from each transaction. Given the nature of how Ethereum works, transactions can also carry much larger payloads that could include things like contract code and associated memory pointers for this code. These additional elements are also important for the state of the system. To facilitate this more complex representation of state, Ethereum’s design makes use of a different data structure that can be easily hashed, tracked and served up on the network when needed as opposed to Bitcoin’s simpler local databases that carry none of these requirements.

Instead of a simple `chainstate.db` file, Ethereum uses a modified Merkle tree (a Merkle Patricia trie) to store the various elements of its state, known as the State Trie. This entire tree can be processed down to a single root hash that gets added to Ethereum’s block headers and becomes a part of the consensus process.
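As a rough illustration of how an entire state can collapse into one hash, here is a Python sketch. It is deliberately not Ethereum's real structure: it assumes a plain binary Merkle tree over sorted accounts with SHA-256 and an ad hoc string encoding, whereas Ethereum uses a Merkle Patricia trie with Keccak-256 and RLP encoding. The `account_leaf` and `state_root` names and the sample accounts are made up for the example.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def account_leaf(address: str, account: dict) -> bytes:
    """Hash one account's nonce, balance and (optional) contract code."""
    encoded = f"{address}|{account['nonce']}|{account['balance']}|{account.get('code', '')}"
    return h(encoded.encode())

def state_root(state: dict) -> bytes:
    """Reduce the whole state to one 32-byte digest."""
    leaves = [account_leaf(address, state[address]) for address in sorted(state)]
    if not leaves:
        return h(b"")
    level = leaves
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # duplicate the last hash on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

state_a = {
    "0xaa": {"nonce": 1, "balance": 50},
    "0xbb": {"nonce": 0, "balance": 25, "code": "6001"},
}
# Same accounts, inserted in a different order by a different node.
state_b = {
    "0xbb": {"nonce": 0, "balance": 25, "code": "6001"},
    "0xaa": {"nonce": 1, "balance": 50},
}
# One balance differs by a single unit.
state_c = {
    "0xaa": {"nonce": 1, "balance": 51},
    "0xbb": {"nonce": 0, "balance": 25, "code": "6001"},
}

print(state_root(state_a) == state_root(state_b))
print(state_root(state_a) == state_root(state_c))
```

Two nodes that arrive at the same state independently get the same root, and any disagreement about a single balance shows up as a different root. That property is what lets a header commit to the full state.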

It is this State Trie that is the key to understanding the difference in minimum requirements needed to maintain trustless consensus in Ethereum vs. in Bitcoin.

Ethereum’s Full Nodes

And now we can finally talk about all those different node types within Ethereum.

In Bitcoin, there is only one type of full node that is also sometimes referred to as an “archival full node”.

In Ethereum however, there is both an “Archive Node” and a “Full/Fast Node” and this terminology is where things can sometimes get confusing.

The same image again: green rows are “fully validating” (2017 chain sizes)

Much as with Bitcoin, in Ethereum the only things that matter for consensus are the individual transactions that update the chain state and the block headers that represent a digest of the specific way the chain has been updated up until that specific point in time.

Full/Fast Node

A node where the operator chooses to only maintain that transaction record and the block headers, just as is maintained in Bitcoin, is known as a “Full/Fast Node”. This node type is the minimum node configuration needed to maintain trustless consensus on the network and as of this piece stands at ~254 GB (26 GB in the image above on row 5).

Nothing further than this node configuration is needed to maintain the same level of consensus guarantees that a Bitcoin full node will provide.

Archive Node

There is, however, a second configuration that node operators can choose to run, and it seems to be a sticking point when it comes to validating Ethereum’s consensus rules.

Operators may also choose to save the State Tries generated at each block instead of simply discarding them after processing. Each block’s State Trie is the result of replaying all the transactions up to that block, and it can be generated entirely locally using the information at hand in the transactions database.

Stored historical State Tries are not needed for maintaining consensus. They are wholly derived from the consensus-critical transactions database contained within blocks.

Much like in Bitcoin, this state is simply a way of storing pre-computed data for quick reference. It serves as a cache of sorts to avoid having to re-compute the state every time it is needed.
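The difference between the two configurations can be sketched in Python with a toy, balance-only account model. Everything here is an assumption made for illustration: real Ethereum state also includes nonces, contract code and storage, and `apply_block`, `sync_full` and `sync_archive` are hypothetical names rather than client APIs.

```python
def apply_block(state: dict, block: list) -> dict:
    """Return the state that results from applying one block of transfers."""
    new_state = dict(state)
    for sender, recipient, value in block:
        if new_state.get(sender, 0) < value:
            raise ValueError(f"{sender} cannot afford to send {value}")
        new_state[sender] = new_state.get(sender, 0) - value
        new_state[recipient] = new_state.get(recipient, 0) + value
    return new_state

def sync_full(genesis: dict, blocks: list) -> dict:
    """Replay every block but keep only the latest state."""
    state = dict(genesis)
    for block in blocks:
        state = apply_block(state, block)  # the previous state is discarded
    return state

def sync_archive(genesis: dict, blocks: list) -> list:
    """Replay every block and keep the state after each one."""
    history = [dict(genesis)]
    for block in blocks:
        history.append(apply_block(history[-1], block))
    return history

genesis = {"alice": 100}
blocks = [
    [("alice", "bob", 40)],
    [("bob", "carol", 15), ("alice", "carol", 10)],
]

latest = sync_full(genesis, blocks)
history = sync_archive(genesis, blocks)

print(latest)
print(history[1])             # the historical state after the first block
print(history[-1] == latest)  # both end up at the same current state
```

Both functions take exactly the same inputs, the genesis state and the blocks. The archive version simply refrains from throwing away the intermediate results, which is why it is so much larger on disk without adding anything to consensus.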

Archive nodes are rarely run, and are only needed where an operator needs quick access to all historical states, e.g. a block explorer that needs a state snapshot from 2016. Since Ethereum’s state carries much more information than simply its native token balances, the actual states at any given time can be quite large.

This is why an Archive node is so much bigger than a simple Full Node. For context, Archive Node size as of this piece stands at ~2,711 GB (385 GB in the image above on row 1).

This is also why there is so much confusion around what an Archive node is in Ethereum and what its role is in maintaining consensus. Because the distinction is highlighted in the sync options on a node and not abstracted away as in Bitcoin, it is far more visible to anyone trying to figure out what makes up each system’s full blockchain size.

Bringing it all home: an experiment

The beautiful thing about the arguments provided here is that because we are dealing with numbers and cryptography, we can explicitly test these different assumptions. Don’t trust, verify. If what was said here is true, then someone should be able to sync an Archive Node using only a Full Node as the source peer. Fortunately, Marc-André Dumas has already done this, as described in his piece here.

Are Ethereum Full Nodes Really Full? An Experiment. (medium.com)

In his experiment Marc was able to sync an Ethereum node from the main network, completely isolate that Full Node from the network after it had fully synced, and then sync an Archive node in isolation solely off of his Full Node within a private network. The Full Node was able to generate a complete Archive Node without any additional inputs or data.

This is interesting because it gives us at least some shared understanding of what is required for a full validation check in Bitcoin vs. in Ethereum. In a nutshell, for consensus purposes it looks like:

Bitcoin Full Archival Node == Ethereum Full/Fast Node

This piece clears up what is required to maintain consensus from a local-storage perspective for each node type. It highlights what specific data is needed locally to bootstrap other nodes in the network in a trustless manner.

An entirely different topic, however, would be to explore how this validation takes place and what the different sync modes are for each chain. A useful future piece would compare the validation process for each of these configurations and determine how suitable each one is, depending on what each community wants out of a decentralized network of fully validating nodes.