Towards Massive On-Chain Scaling: Block Propagation Results With Xthin

Part 2 of 5: Xthin blocks are faster than standard blocks

By Andrew Clifford, Peter R. Rizun, Andrea Suisani (@sickpig), Andrew Stone and Peter Tschipper. With special thanks to Jihan Wu from AntPool for the block source and to @cypherdoc and our other generous donors for the funds to pay for our nodes in Mainland China.

In Part 1 of this five-part series, we discussed the motivation behind Xthin and the methodology we used to compare its performance to that of standard block propagation. This post delves into the results: it compares the time required to propagate standard blocks with the time required to propagate Xthin blocks, for nodes connected across the normal P2P network.

Fig. 1. This post analyzes the block propagation times for data points that fell into Bin 1 and Bin 2.

Blocks transmitted through the Great Firewall of China (GFC) were not considered (Part 3 explores the effect of the GFC).

Removing 7 outliers caused by two bad peers connected to the Shenzhen node (affecting only standard blocks), and removing all blocks relayed from AntPool to the Shanghai node (we learned that AntPool’s node was located in the same data center), left 1481 data points in Bin 1 and 4464 data points in Bin 2. This is the data set analyzed in this post.

All blocks had an uncompressed size between 900 kB and 1 MB.

Log-normal propagation time distributions

A histogram of the block propagation times for bins 1 and 2 (combined) shows what superficially appears to be an exponential distribution with a fat tail. Most blocks are received within a few seconds, with some blocks taking most of the 10-minute block interval.

Fig. 2. Histogram of block propagation times (Xthin and standard blocks together) on a linear time scale. The bulk of the data is squeezed on the left and the data for block times greater than a few seconds is barely visible.

With respect to on-chain scaling, the concern is not the large population of blocks that propagate quickly — taking a few seconds to download a block every 10 minutes is hardly a worry for a non-mining node. The concern is rather the smaller fraction of blocks that propagate slowly, which are not visible if propagation time is plotted on a linear scale.

By plotting propagation time on a logarithmic scale, a different picture emerges. Separating out Bin 1 from Bin 2 reveals two distinct populations, each with its own approximately log-normal distribution. The advantage of the log scale is that it captures on a single chart both the fast blocks that propagate in less than a second and the very slow blocks, including some that take several minutes to propagate.

Fig. 3. Smooth histograms for block propagation times (single hop, log scale) over the normal P2P network for Xthin (top) and standard blocks (bottom). The box-and-whisker plots show the median (green line), the interquartile range (box body), the extremes (whiskers), and near and far outliers (black/gray). In this context, outliers are defined simply as points more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. These outliers were NOT removed when calculating mean, median and the 95th percentile.
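The outlier rule used for the box-and-whisker plots can be sketched in a few lines of Python. This is only an illustration of the 1.5 × IQR rule described in the caption: the sample times below are made up, the function name is hypothetical, and the quantile interpolation method (and whether the rule was applied to raw or log-transformed times) is an assumption, since the caption does not say.

```python
import statistics

def tukey_outliers(times, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the k * IQR rule."""
    # Quartiles; the "inclusive" interpolation method is an assumption.
    q1, _, q3 = statistics.quantiles(times, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [t for t in times if t < lower or t > upper]
    return lower, upper, outliers

# Made-up propagation times in seconds (NOT data from the study).
sample_times = [0.3, 0.4, 0.5, 0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 45.0]
lower, upper, outliers = tukey_outliers(sample_times)
print(f"fences: [{lower:.4f}, {upper:.4f}]  outliers: {outliers}")
```

In this toy sample only the 45-second block falls outside the fences. As in the study, flagging a point this way is a plotting convention; the point still counts towards the mean, median and 95th percentile.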

There is significant area under the Xthin distribution for propagation times less than 0.2 seconds, while the area under the standard distribution in the same region is minimal. This shows that a non-negligible number of Xthin blocks propagated at speeds that were not attainable by standard blocks. The Xthin distribution peaks between 0.5 s and 0.7 s while the standard distribution peaks at a slower 1.5 s to 2.2 s — these values represent the most common propagation times for Xthin and standard blocks, respectively. The significant area in the tail of the standard distribution above 3 s reveals the tendency of standard block propagation towards a small but still-significant number of slow blocks; comparatively, the Xthin distribution has a much “thinner tail.” The figure above also includes box-and-whisker charts for both bins, which highlight the median propagation times (marked by the light-green line).

Statistical summary

Xthin blocks were 12 times as fast as standard blocks with respect to the mean (0.7 s vs 8.4 s), just shy of 3 times as fast with respect to median (0.5 s vs 1.4 s), and 8 times as fast with respect to the 95th percentile (1.3 s vs 10.9 s).
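The speed-up factors quoted above are simply the ratio of the standard figure to the Xthin figure for each statistic. A short check, using only the rounded values given in the text (the variable names are ours):

```python
# (xthin_seconds, standard_seconds), rounded values as quoted in the text.
stats = {
    "mean": (0.7, 8.4),
    "median": (0.5, 1.4),
    "95th percentile": (1.3, 10.9),
}

# Speed-up = standard time / Xthin time.
speedups = {name: standard / xthin for name, (xthin, standard) in stats.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x as fast")
```

Because the inputs are rounded to one decimal place, the ratios are approximate: about 12 for the mean, 2.8 for the median and 8.4 for the 95th percentile.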

Fig. 4. Propagation time statistics for 900 kB to 1 MB blocks.

None of these numbers tells the complete story. However, if you wanted to estimate how long it would take to download 1000 blocks, the best number to use would be the mean (multiplied by 1000 of course). In fact, it is the mean propagation time that determines whether a node will be able to keep up with the blockchain, or whether it will fall further and further behind. The mean is thus the most relevant number in the context of the Cornell study discussed in Part 1. However, the mean is sensitive to outliers (e.g., very slow blocks may be under- or over-represented in any of the bins) and thus finicky with respect to the experimental setup.

The median describes the time where half the blocks were received faster and half the blocks were received slower. The median is useful because it is robust to outliers; however, it significantly understates the contribution of blocks with long propagation times.

The 95th percentile is a useful statistic because most blocks were received faster than it (only 5% of blocks took longer), yet it remains reasonably robust to outliers. It captures part of the effect of very slow blocks. However, it is not a true average.
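To see how the three statistics behave on a heavy-tailed distribution, here is a minimal sketch using synthetic log-normal "propagation times". The parameters (mu = 0, sigma = 1.5), the sample size and the seed are arbitrary choices for illustration and are not fitted to the study's data.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic log-normal propagation times in seconds (made-up parameters).
times = [rng.lognormvariate(0.0, 1.5) for _ in range(5000)]

mean = statistics.fmean(times)
median = statistics.median(times)
# quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
p95 = statistics.quantiles(times, n=20)[-1]

# Only the mean reproduces the total time needed to receive every block.
total = sum(times)

print(f"mean={mean:.2f} s  median={median:.2f} s  p95={p95:.2f} s")
print(f"total for {len(times)} blocks: {total:.0f} s "
      f"(mean * n = {mean * len(times):.0f} s)")
```

For a sample like this the median sits well below the mean, because a small number of very slow blocks pulls the mean up. Multiplying the mean by the number of blocks recovers the total download time, which neither the median nor the 95th percentile can do.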

Are the results statistically significant?

The null hypothesis, when applied to this experiment, asserts that Xthin has no effect on block propagation time. It is possible that the difference in the average propagation times between bins 1 and 2 was a result of luck (e.g., due to an insufficient number of data points). To determine how probable this was, ANOVA was used, yielding a p-value of 3 × 10^-329 (with respect to the geometric means). This implies that obtaining these results without a real effect being present was highly improbable. We thus reject the null hypothesis in favour of the hypothesis that Xthin blocks are faster than standard blocks.
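For readers who want to see the shape of such a test, the sketch below computes a one-way ANOVA F statistic on log-transformed times, which is one way to compare geometric means. It is not the analysis script used for this study. The data are synthetic: the sample sizes mirror the two bins and the log-normal medians are set near the reported medians, but the spread parameters are invented, and treating the test as ANOVA on the logarithms is our reading of "with respect to the geometric means". The sketch stops at the F statistic and does not compute a p-value.

```python
import math
import random
import statistics

def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    means = [statistics.fmean(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(len(g) * m for g, m in zip(groups, means)) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

rng = random.Random(1)  # fixed seed for reproducibility

# Synthetic samples (NOT the study's data); sigma values are invented.
standard = [rng.lognormvariate(math.log(1.4), 1.0) for _ in range(1481)]
xthin = [rng.lognormvariate(math.log(0.5), 0.6) for _ in range(4464)]

# Work on log-times so that group means correspond to geometric means.
log_standard = [math.log(t) for t in standard]
log_xthin = [math.log(t) for t in xthin]

f_stat = anova_f([log_standard, log_xthin])
geo_standard = math.exp(statistics.fmean(log_standard))
geo_xthin = math.exp(statistics.fmean(log_xthin))

print(f"geometric means: standard={geo_standard:.2f} s, xthin={geo_xthin:.2f} s")
print(f"F statistic: {f_stat:.0f}")
```

A very large F statistic with thousands of data points corresponds to a vanishingly small p-value, which is the situation described above. Two groups with identical values would give F = 0.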

How often is a second round trip required?

A second round trip is occasionally required in order for the receiving node to fetch transactions it needs that were not included in the thin block. These events typically correspond to Bloom filter false positives. In this study, a second round trip was required 1.5% of the time (66 occurrences in Bin 2). The mean, median and 95th percentile of propagation time when a second round trip was required were 1.3 s, 1.0 s, and 2.0 s, respectively. This compares to 0.7 s, 0.5 s, and 1.3 s, when only a single round trip was required.
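A quick back-of-the-envelope check shows why these rare second round trips barely move the overall Xthin numbers. The sketch uses only the rounded figures quoted in the text; the blended mean is our own arithmetic, not a measured value.

```python
second_trip_count = 66   # Xthin blocks that needed a second round trip
xthin_blocks = 4464      # data points in Bin 2

rate = second_trip_count / xthin_blocks

mean_single = 0.7  # seconds, single round trip (rounded, from the text)
mean_double = 1.3  # seconds, second round trip required (rounded, from the text)

# Weighted average of the two cases.
blended_mean = (1 - rate) * mean_single + rate * mean_double

print(f"second round trip rate: {rate:.1%}")
print(f"blended mean: {blended_mean:.3f} s")
```

With roughly 1.5% of blocks paying about 0.6 s extra, the blended mean rises by less than a hundredth of a second, which is below the rounding of the reported figures.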

Part 3 of 5: Xthin blocks are less affected by the Great Firewall of China

In our next post, we examine the propagation times for both Xthin and standard blocks passing through the Great Firewall of China.

Download Bitcoin Unlimited

You too can help improve network block propagation by downloading and running Bitcoin Unlimited today [link].

Copyright

This document and its images are placed in the public domain.