Death by a thousand cuts
In the previous chapter, we outlined a just two configurable variables can be used to prevent a network node from taking too long to validate a block or a transaction chain. This seems very much to be a trivial thing, however, let us remember that Bitcoin version 0.1 was unbounded in its limits.
OP_RETURN for example which is simply some metadata attached to a transaction had no limit other than the limit of the FAT32 file system. In 2009 that limit was 4GB. This is not a limit of the Bitcoin protocol or implementations but a limit of the file architecture used at the time. An additional limit to the size of OP_RETURN was introduced at some point in Bitcoins history. It doesn’t matter who put the limit in or when it was added. It is a limit that must be removed completely. If it is not removed the discussion of not increasing it will undoubtedly be had again. While this limit does not break anything and is simple to get around it makes using Bitcoin more inconvenient than it needs to be. There should never be a limitation of the Bitcoin protocol. It should be as it was intended in version 0.1.
This is just one example of the many changes that were introduced into the Bitcoin implementation as a “softfork” that will have to be removed for Bitcoin to fulfil the potential it holds. If we don’t permeate an unbounded kind of thinking throughout the Bitcoin Cash population the arguments over what the protocol can do will never stop. People will always say, “the nodes might not be able to process it in time.” To this, the only acceptable answer is “So what?” Someone had to be able to produce it. If the validation calculation completes and it is found to be invalid then it is ignored this is a given and everyone understands this. However, we can take this further so that if a node cannot process something they simply ignore it.
With this in mind let’s talk about the present.
The White Rabbit
The white rabbit in Alice in Wonderland is obsessed with time. Always running late and always nervous about the repercussions of his tardiness, the rabbit must always know the time.
Miners are much like the white rabbit. They must always be aware of the time at which transactions are seen on the network. If they get this timing wrong their block of transactions may be seen as invalid. To prevent a transaction from becoming invalid, miners enforce but one primary rule:
Transactions only spend unspent transactions.
To know which transactions are unspent they must first know which are spent. Thus they must know the order.
To ensure that orders are somewhat similar to their competitors they propagate each transaction to all their peers as soon as they receive it. They do this because if they receive and process a transaction that others do not have when they find a valid solution to the proof of work and propagate it they must also send additional transaction data. Their aim is to reduce the amount of transaction data that they communicate along with a block so as not to waste time to the lag inherent within the network and between nodes.
We must remember that any amount of time wasted during the transition from one block to the next is potentially detrimental to each individual as it causes a loss in revenue, but not the system as a whole. If a single or a multitude of miners drop from the network the network continues, this is the nature of Bitcoin being a distributed system.
A blockchain is a time-ordered system; the white paper describes Bitcoin as a timestamp server for good reason. For any system dealing with financial transactions, we must know the order in which they occurred. The reason for proper ordering is illustrated below where a simple system with two accounts swaps balances just three times.
Even in our extremely simple example, the order of the actions is of critical importance. There is a valid order but also invalid ones. This becomes quite intuitive if we simply swap the order of the last two transactions. This is illustrated below and shown in topological order.
Fortunately, the fundamental nature of bitcoin will allow transactions within a block to be in any order as it does not invalidate the actual transaction chains. If the transaction chain does not try and spend the same coins twice, an empty output or more coins than an output contains, a valid transaction chain can be found. The issue may come when reading the transaction chain as some user may appear to be spending an what could be seen as an empty output. If the node were to just continue reading the full block they would eventually find the connecting transaction and thus the chain links together.
The table under the tree
The architecture of Bitcoin locks is designed such that the Proof of Work mechanism covers the full contents of the block so that nothing within the block can be changed without invalidating the Proof of Work. It does this via a structure called a Merkle tree. The Merkle tree is a series of hashes that produces a 256-bit string. The end of each hash chain contains the transaction. This is outlined in Understanding Concepts which is one of the first posts in this series. These structures will also be talked about more in a later chapter. For now, all one must know is that the is data structure is deterministic in nature. That is if the same input is given the same output will always result and thus does not need to ever be communicated between miners.
Order of the table
Understanding the above we now know that if a node knows the order of transactions that the everyone else has the amount of data can be vastly streamlined. But what are the implications of such a concept? If we get nodes to agree on an order of transactions for the block before the block is found how are we affecting the system as a whole? To understand this let’s take a look at two proposals.
Topological ordering describes an order of transactions that only has one constraint, a child transaction cannot come before a parent. Due to the nature of constructing a transaction that has value to its participants a transaction must be know so that it is used in the next transaction. This means that the vast majority of the time a miner is able to add a transaction to their mempool in the order they are first seen.
There are very few instances where a child would be seen before a parent, however, even if this is the case a miner should not have been working on the transaction since it will be utilizing invalid inputs without the parent transaction. The method of rectifying this potential risk is to simply store the invalid transactions until there is a parent transaction that makes its child valid.
Principally this requires little to no processing time as when receiving and processing a transaction to put into the Merkle tree there is nothing to do other than put the transactions into memory and begin working on them. As discussed in the previous post, the decision to begin working on a transaction is as simple as: does the transaction spend valid inputs, contain a valid script and signature.
In another proposal, transactions are ordered by the transaction ID also known as the transaction hash. This pseudo-random number (looks random but is deterministic in nature since it is a hash) decides the order of the transactions within the block. Using canonical ordering, transactions are essentially ordered in ascending order much like a library or filing cabinet that will be easier to look up items later. This means that transactions must be sorted into an order before they are added to the block in progress.
A note of consequence is that for a set of transactions under this rule set produces an order that is always the same and thus deterministic. In other words, if you receive a set of transactions you know the order in which they must be. That also means that Merkle tree is deterministic when given a set of transactions in any order.
The critical time for any miner is when a block is found. For the miner that finds a block due to their arduous work effort, they must ensure that it is propagated to as many peers as possible as quickly as possible and in a format that is fast to validate. Thus they must also be prepared to give any information requested of them at any time as fast as possible. If they do not do this they are at risk of having their work wasted due to an orphan. So it is this mechanism that requires our focus.
We can now look at how these two different approaches change the economic incentives.
With topological ordering, miners do not need to order transactions when they are incoming but need to know the order and reorder the transactions when validating a competitors block. Miners can work extremely hard validating transactions and fitting in as many transactions as possible into a block whilst mining without worrying about their order. The work they do to propagate all the transactions they have as fast as possible will reduce the work they need to do when validating their competitors block as the differences between the order will be less. Once work begins on the next block they work extremely hard to communicate all information they have with their peers for fear that the block they are currently working will be wasted due to it being orphaned due to a competing block.
In the canonical ordering scheme, miners need to order transactions during the block assembly process. This appears to reduce the amount of work that needs to be done when propagating or receiving a block. However, it may also reduce the number of transactions they are able to process over this period due to the reorganization process taking place as transactions are coming in due to the additional CPU cycles needed. It also means that miners do not need to know the order of the transactions when receiving a block, only the transactions in the set.
All this to say that the approaches only really differ when miners agree on a strict external rule set of the transaction order. If the miners have an external rule that determines the order of the transactions, the work to produce that order is shifted to when the transactions are incoming as opposed to when the block is completed and needs to be propagated and validated. The exact mechanisms of how nodes accomplish this validation once a block comes in will be covered in a future post.
We march onward
Although it may seem trivial and insignificant, the rule set of transaction ordering does have an economic effect on the overall system and not just for individual miners. This effect is reducing the work that miners have to do during the receipt of a block. It also reduces a miners ability to make blocks as large as they possibly can as it moves the bottleneck from the validation of the block to the creation of the block. In other words, the throughput of the whole blockchain is being hindered due to the additional CPU cycles required when constructing the block. It is this work effort to both validate and produce blocks as quickly as possible that currently forces miners to connect directly with other miners; any intentional reduction in this work effort may be a net negative to the economic system that is Bitcoin as a whole. In other words, the economic forces weaken causing miners to become more loosely connected. It is a miners job to ensure that they can keep up with 51% of the rest of the hash rate.
The idea of removing an element of time from a time-stamp-server is a curious thing to want to do. It is this question that encouraged me to look into this topic in more detail as it should encourage the reader as well.
Alex Fauvel, follow me on Twitter to get notified about my next post and for the latest news and developments happening in Bitcoin cash land.
All posts are never intended to be financial advice and are for informational purposes only.
If you have any questions relating to Bitcoin Cash please get in touch at firstname.lastname@example.org and I will do my best to get back to you.