Towards Massive On-Chain Scaling: Block Propagation Results With Xthin

Part 4 of 5: Fewer bytes are required to communicate an Xthin block

By Andrew Clifford, Peter R. Rizun, Andrea Suisani (@sickpig), Andrew Stone and Peter Tschipper. With special thanks to Jihan Wu from AntPool for the block source and to @cypherdoc and our other generous donors for the funds to pay for our nodes in Mainland China.

Note to readers: If you missed Part 3, you can read about the effect of the Great Firewall of China on block propagation times here.

In this post, we complete our analysis of the experimental data by investigating the number of bytes required to propagate Xthin blocks compared to standard blocks.

Fig. 1. Using the Xthin propagation technique cuts the number of bytes required by a factor of 24.

The main result is simple: on average, 42 kB are required to communicate a 1 MB block using Xthin, thereby cutting the bandwidth required for block propagation by a factor of 24 (1,000 kB / 42 kB).

The remainder of this post explores the relevant data in more detail. We begin by showing that while the Xthin technique considerably reduced the number of bytes required, the presence of the Great Firewall of China (GFC) did not change the average compression in a measurable way.

Statistics and Analysis of Variance

The table below shows the mean, median and 95th-percentile for the number of bytes used to propagate blocks for each bin. All blocks had an uncompressed size between 900 kB and 1 MB and the mean size of the uncompressed blocks in each bin was 0.99 MB.
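The summary statistics in Table 1 are straightforward to reproduce from per-block byte counts. A minimal sketch in Python (the 95th percentile here uses a nearest-rank convention, which may differ slightly from the method used to produce the table):

```python
import statistics

def size_stats(sizes_kb):
    """Mean, median and 95th percentile of per-block byte counts (in kB)."""
    s = sorted(sizes_kb)
    # Nearest-rank 95th percentile (one of several common conventions)
    p95 = s[min(len(s) - 1, int(round(0.95 * (len(s) - 1))))]
    return statistics.mean(s), statistics.median(s), p95
```

Running this over the byte counts in each bin yields one row of Table 1.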

Table 1. Statistics on the number of kilobytes used to communicate a block.

The use of Xthin resulted in significant bandwidth savings (41.3 and 42.6 kB compared to 0.99 MB), and it appears the GFC may have produced a small effect as well. Xthin blocks transmitted over the normal P2P network used 3% fewer bytes than Xthin blocks transmitted through the GFC. To determine whether this was statistically significant, we again performed a 2x2 full-factorial ANOVA, this time on the bandwidth data.

The p-value for the effect of Xthin was significant (p = 3 × 10^-8796); however, the p-value for the effect of the GFC was not (p = 0.4). We can reject the null hypothesis with respect to Xthin’s effect (Xthin does reduce bandwidth requirements [obviously]); however, we cannot do the same with respect to the GFC’s effect (there is insufficient data to determine whether or not the GFC affects the average compression).
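For readers who want to replicate the analysis, the main effects of a balanced 2x2 full-factorial ANOVA reduce to a few sums of squares. The sketch below is pure Python on synthetic data (the raw experimental data is not reproduced in this post), computes main effects only (no interaction term), and returns F statistics rather than p-values:

```python
import random

def two_way_anova_F(cells):
    """F statistics for the two main effects in a balanced 2x2 factorial
    design. cells maps (a, b), with a, b in {0, 1}, to a list of
    observations; every cell must contain the same number of samples."""
    all_vals = [v for vs in cells.values() for v in vs]
    N = len(all_vals)
    grand = sum(all_vals) / N

    def main_effect_ss(axis):
        # Between-level sum of squares for one factor
        ss = 0.0
        for level in (0, 1):
            vals = [v for key, vs in cells.items()
                    if key[axis] == level for v in vs]
            mean = vals and sum(vals) / len(vals)
            ss += len(vals) * (mean - grand) ** 2
        return ss

    # Within-cell (error) sum of squares
    ss_err = 0.0
    for vs in cells.values():
        m = sum(vs) / len(vs)
        ss_err += sum((v - m) ** 2 for v in vs)
    ms_err = ss_err / (N - 4)  # 4 cells in a 2x2 design, so N - 4 error df
    return main_effect_ss(0) / ms_err, main_effect_ss(1) / ms_err
```

With synthetic data in which Xthin shrinks blocks to ~42 kB and the GFC has no effect on size, the first F statistic comes out enormous and the second stays near 1, mirroring the conclusion above. The p-values then follow from the F(1, N-4) distribution.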

Because there was no statistically significant difference between bins 2 and 4 with respect to compression, the remainder of this post analyzes all of the Xthin data (i.e., bins 2 and 4) together.

Bloom filter and thin block histograms

A smoothed histogram for the number of bytes required to propagate an Xthin block is shown below, along with histograms for its two constituent components: the thin block (including the missing transactions) and the Bloom filter. Note that the horizontal axis is shown on a logarithmic scale in order to capture the full domain of data.

Fig. 2. Histogram of Bloom filter size, thin block size (including missing transactions) and total size (Bloom filter + thin block). Bins 2 and 4 combined; N=6685.

Box-and-Whisker plots

Box-and-whisker plots reveal the occasional outlier that required significantly more than 42 kB to communicate. In all cases, this was due to a “thick” thin block and never due to a large Bloom filter.

Fig. 3. Box-and-whisker charts of Bloom filter size, thin block size (including missing transactions) and total size (Bloom filter + thin block).

Do the Bloom filters make a difference?

The purpose of the Bloom filters is to make the transmitting node aware of the contents of the receiving node’s mempool. This allows the transmitting node to send the transactions the receiving node knows about using a short hash, and send the transactions the receiving node does not know about in full. As was shown in Part 2, this permits blocks to be propagated in 1.5 round trips, 98.5% of the time.
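The sender-side logic can be sketched in a few lines. The following is a simplified illustration, not Bitcoin Unlimited’s actual implementation (which reuses the node’s existing C++ Bloom filter machinery); the filter parameters and the 8-byte short hash are illustrative assumptions:

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter over transaction IDs (illustrative only)."""
    def __init__(self, size_bits=8192, n_hashes=4):
        self.size = size_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        for i in range(self.n_hashes):
            h = hashlib.sha256(i.to_bytes(1, "big") + item).digest()
            yield int.from_bytes(h[:4], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

def build_thin_block(block_txids, peer_filter, full_txs):
    """Sender side: a short hash for each transaction the peer's Bloom
    filter says it already has; the full transaction otherwise. The
    coinbase (index 0) is always sent in full."""
    short_hashes, missing = [], []
    for i, txid in enumerate(block_txids):
        if i > 0 and txid in peer_filter:
            short_hashes.append(txid[:8])  # 64-bit short hash
        else:
            missing.append(full_txs[txid])
    return short_hashes, missing
```

Because Bloom filters have no false negatives, a transaction the receiver already holds is never re-sent in full; the rare false positive simply means a transaction the receiver lacks is sent as a short hash, triggering the extra round trip discussed in Part 2.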

It is interesting to examine how often the mempools are sufficiently homogeneous that the Bloom filter has no effect (i.e., the receiving node is already aware of every transaction in the block except the coinbase TX [which is always transmitted as part of the thin block]).

The following figure tells the story. Over half of the time (53%), the receiving node was aware of all of the transactions in the block; the entire block was communicated by transaction hashes and the Bloom filter served no purpose. Seventeen percent (17%) of the time the receiving node was missing a single transaction, 9% of the time the receiving node was missing 2 transactions, and so on as shown by the probability density function (PDF) below. In total, the Bloom filter was required to prevent a second round trip 47% of the time (in all cases except for those denoted by the green bar).

Fig. 4. Chart showing the fraction of time 0, 1, 2, …, 10 full transactions were included in the thin block sent by the transmitting node (the coinbase TX is always sent and is not counted in the chart above).

Part 5 of 5: Towards Massive On-chain Scaling

This concludes our examination of the experimental data. Our next and final post in this five-part series will make recommendations for how this technology could be put to better use (note that it is already deployed and successfully running in Bitcoin Unlimited 0.12), as well as provide a “sneak peek” inside the Bitcoin Unlimited Laboratory.

Download Bitcoin Unlimited

You too can help improve network block propagation by downloading and running Bitcoin Unlimited today [link].


This document and its images are placed in the public domain.