New Maximum EOSIO TPS demonstrated in Jungle Testnet: 9179 (edit 1)


The pre-release of EOS Dawn v2.0 was published one year ago on December 5th, 2017.

https://github.com/EOSIO/eos/releases/tag/dawn-v2.0.0

Recently, a group of Block Producers tested the limits of EOSIO on the new Jungle Testnet 2.0. A series of tests was run on Wednesday Nov 28, Thursday Nov 29, and Monday Dec 3.

Teams involved (in alphabetical order): Attic Lab, BlockMatrix, CryptoLions, EOS Barcelona, EOS Rio, EOS Metal, eosDAC, SW/Eden.

Why?

  • To try a new methodology for more intensive stress testing.
  • To test adjustments to different global variables.
  • To push the limits of EOSIO software.

How?

The new testing methodology involved strict clock synchronization of the “spamming” nodes and separating the preparation and packaging of transactions from their broadcast over the network. Testers referred to this as “loading the gun.”
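The split between preparing and broadcasting can be sketched in Python as two phases. This is only an illustration, not the testers' actual scripts: `load_gun`, `fire`, and the `send` callback are hypothetical names, and the batch size of 1000 is borrowed from the package size mentioned in the testing journal.

```python
from typing import Callable, List, Sequence


def load_gun(transactions: Sequence[dict], batch_size: int = 1000) -> List[List[dict]]:
    """Phase 1: group already-prepared transactions into fixed-size packages."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [list(transactions[i:i + batch_size])
            for i in range(0, len(transactions), batch_size)]


def fire(packages: List[List[dict]], send: Callable[[List[dict]], None]) -> int:
    """Phase 2: broadcast every prepared package; returns how many were sent."""
    sent = 0
    for package in packages:
        send(package)  # hypothetical transport supplied by the caller
        sent += 1
    return sent


# Hypothetical placeholder transactions; real ones would be signed and packed.
prepared = load_gun([{"id": n} for n in range(10_000)])
```

Keeping the expensive work in `load_gun` means that nothing but network sends happens inside the timed window.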

These global variables were adjusted during the testing:

  • min_transaction_cpu_usage
  • max_block_cpu_usage

as were these nodeos parameters:

  • chain-threads
  • last-block-time-offset-us

We also attempted to test the impact of deferred transactions.

Lastly, we adjusted our voting so that only the strongest servers would be producing blocks in Jungle Testnet during this testing.

Some of the infrastructure for the stress party

EOSMetal

One BP node with 64 GB of RAM and an Intel i7 processor

One full node with 64 vCPUs and 480 GB of RAM, running 10 nodeos processes

One “gun” server with 64 vCPUs and 480 GB of RAM, for shooting

Attic Lab

One BP node with 64GB and Intel Core i9 9900 processor

Two full nodes, each with 64 GB of RAM and an Intel Core i7 8700 processor

CryptoLions

Intel Xeon W-2145 Octa-Core, 128 GB DDR4 ECC RAM, SSD storage, 1 Gbit/s Internet, and a second such server with an NVMe drive

Testing Journal

Day 1 (Wednesday, Nov 28th)

New Ideas:

  • Synchronization of “spammers” and of the scripts that formed and packed transactions in preparation for sending them. (We had each spammer send 10 packed packages of 1000.) The packages were then sent with cleos push transactions.

  • Used the NTP daemon to synchronize time. The spam came simultaneously from all spammers as pulses at 5-minute intervals, which, to the great delight of the BPs, allowed dramatic countdowns and hilarious gifs.
  • Created scripts to send massive transfers with blank actions.
  • We staked 200k EOS to all the spammers, then, when that proved insufficient, 400k EOS.
  • On the first day of testing, we could not break the transaction limit of 1998 per block, or 3996 TPS. This number has long been the maximum TPS detected by the EOS Network Monitor (https://eosnetworkmonitor.io/). We even speculated that there might be some hard-coded restriction limiting this.
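One way to make independent spammers fire at the same instant is to have each compute the next shared wall-clock boundary from its NTP-synchronized clock. The Python sketch below assumes pulses fall on multiples of 300 seconds since the Unix epoch; the function names are hypothetical and the post does not say how the real scripts picked their firing time.

```python
import math
import time


def next_pulse(now: float, interval_s: int = 300) -> int:
    """Return the next epoch timestamp that is a multiple of interval_s."""
    return (math.floor(now / interval_s) + 1) * interval_s


def wait_for_pulse(interval_s: int = 300) -> int:
    """Sleep until the next shared boundary, then return its timestamp."""
    target = next_pulse(time.time(), interval_s)
    # Every machine computes the same target as long as clocks agree.
    time.sleep(max(0.0, target - time.time()))
    return target
```

Because the target depends only on the clock, no coordination messages between spammers are needed. The accuracy of the pulse is bounded by how well the clocks agree.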

[Screenshots: EOS Sweden, EOS Barcelona, EOS Rio, Attic Lab]

EOS Barcelona — TPS = 3992

Attic Lab — TPS = 3996

  • We were disappointed to only match, and not exceed, the historical max TPS of 3996.

Toward the end of the first day, we upgraded to v1.5.0-rc1.

Ihor from EOS Rio also suggested adding chain-threads = 8.

Day 2

We experimented with these global variables:

min_transaction_cpu_usage

max_block_cpu_usage

Bohdan CryptoLions, [03.12.18 22:57]

1. min_transaction_cpu_usage: 100 -> 80 -> 70 -> 60

2. max_block_cpu_usage: 200000 -> 300000 -> 350000 -> 400000

3. both 1 and 2

The default global values are these:

{
  "max_block_net_usage": 1048576,
  "target_block_net_usage_pct": 1000,
  "max_transaction_net_usage": 524288,
  "base_per_transaction_net_usage": 12,
  "net_usage_leeway": 500,
  "context_free_discount_net_usage_num": 20,
  "context_free_discount_net_usage_den": 100,
  "max_block_cpu_usage": 200000,
  "target_block_cpu_usage_pct": 2000,
  "max_transaction_cpu_usage": 150000,
  "min_transaction_cpu_usage": 100,
  "max_transaction_lifetime": 3600,
  "deferred_trx_expiration_window": 600,
  "max_transaction_delay": 3888000,
  "max_inline_action_size": 4096,
  "max_inline_action_depth": 4,
  "max_authority_depth": 6
}

This graphic appears to show about 15% less execution time after Attic Lab updated to 1.5-rc1. Or maybe it’s a coincidence…

And then the fun began again:

Big latencies reported by some BPs:

We tweaked this nodeos parameter to close the last block faster, thinking it would help the next producer synchronize faster:

last-block-time-offset-us = -300000

Blocks 1025381 and 1025382 contained our maximum TPS — 9179 transactions.

Though there were forks which occurred around these blocks. We also achieved a completely clean 6977 TPS.

The record we set for transactions in a single block was 6101 (block number 1025825).
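Since EOSIO produces a block every half second, a TPS figure like the one above is the sum of the transactions in two consecutive blocks. A minimal Python sketch of finding the peak over a run of blocks follows. The per-block counts are made-up illustration data, and the post does not state how the monitor itself computes TPS.

```python
from typing import List, Tuple

BLOCKS_PER_SECOND = 2  # one block every 0.5 s


def peak_tps(tx_counts: List[int]) -> Tuple[int, int]:
    """Return (best_sum, start_index) over windows of consecutive blocks
    that together span one second."""
    if len(tx_counts) < BLOCKS_PER_SECOND:
        raise ValueError("need at least one full second of blocks")
    best_sum, best_start = -1, 0
    for start in range(len(tx_counts) - BLOCKS_PER_SECOND + 1):
        window = sum(tx_counts[start:start + BLOCKS_PER_SECOND])
        if window > best_sum:
            best_sum, best_start = window, start
    return best_sum, best_start


# Hypothetical per-block transaction counts, not the real Jungle data.
counts = [1200, 1998, 1998, 0, 3500, 4100, 900]
print(peak_tps(counts))  # (7600, 4)
```

This also shows why the single-block record and the TPS record can come from different blocks: one very full block next to a light one can sum to less than two moderately full neighbors.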

Replaying worked like a charm too. (Replaying difficulties were why we launched Jungle 2 https://monitor.jungletestnet.io/ and deprecated what we now call Jungle Classic http://jungle.cryptolions.io/ .)

Day 3

Setting the Agenda:

Eric — sw/eden, [29.11.18 20:26]

1. Block Twitter spam txs max?

2. One or two free spaces between each producer

3. Bnet

4. Deferred tx

5. Different params

What else?

Michael Yeates, [29.11.18 20:26]

deferred transactions

Bohdan CryptoLions, [29.11.18 20:26]

repeat tests for different global params

Bohdan CryptoLions, [29.11.18 20:27]

to test deferred tx we need to create special contract..

And it seemed right away that we had set a new TPS record that was half again as high as the 9179 achieved on Day 2:

So the maximum number of transactions in a single block appeared to be 13,965.

But something was strange: the action count was low even though the transaction count was high.

(This was eerily similar to the poisoned blocks in the “Jungle Classic” Testnet which contained transaction information without transaction bodies.)

And after a little investigation we found that blocks were filled with expired transactions:

So what do we call this? Two blocks produced in one second did indeed contain 16,992 transactions, but they were expired transactions.

Is this a new TPS record? Or maybe an “eTPS” record?
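One way to keep the two kinds of record apart is to count transaction receipts by status rather than counting every entry in a block. The Python sketch below assumes each block is a dict with a "transactions" list whose entries carry a "status" string such as "executed" or "expired". The sample blocks are hypothetical and the function names are ours.

```python
from collections import Counter
from typing import Dict, List


def status_counts(block: Dict) -> Counter:
    """Count the transaction receipts in one block by their status field."""
    return Counter(receipt.get("status", "unknown")
                   for receipt in block.get("transactions", []))


def executed_tps(blocks: List[Dict]) -> int:
    """Sum only 'executed' receipts across the blocks of a single second."""
    return sum(status_counts(block)["executed"] for block in blocks)


# Hypothetical blocks, dominated by expired receipts as on Day 3.
block_a = {"transactions": [{"status": "expired"}] * 5 + [{"status": "executed"}] * 2}
block_b = {"transactions": [{"status": "expired"}] * 4 + [{"status": "executed"}]}
```

With these sample blocks a raw count would report 12 transactions in the second, while `executed_tps([block_a, block_b])` reports 3.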

Incidentally, the Jungle Testnet monitor (https://monitor.jungletestnet.io/) automatically caught this number as a new record. CryptoLions will manually change it to 9179, and move the eTPS record into the Legend for future reference.

Lessons & opportunities:

  • Increasing the max_block_cpu_usage parameter from 200k to 300k to 400k increases forks but doesn’t break the chain.
  • Opportunity: Capture and study the number of lost transactions and try to determine the optimum setting under various conditions.
  • Lowering min_transaction_cpu_usage showed good results, increasing throughput to 5k or 6k TPS.
  • The perceived TPS limit of 3996, which was demonstrated in the EOS Mainnet and encountered on Day 1, seemed to be a result of throughput bumping against min_transaction_cpu_usage.
  • Hopefully, the data from these tests, as well as the information captured in the Jungle 2 blockchain, will be useful to core developers.
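The arithmetic behind the 3996 observation can be sketched in a few lines of Python. The sketch assumes that every transaction is billed at least min_transaction_cpu_usage microseconds, that a block cannot exceed max_block_cpu_usage, and that there are two blocks per second. The function names are ours, and pairing the most aggressive values from both lists is an assumption about how test 3 was run.

```python
BLOCKS_PER_SECOND = 2  # EOSIO block interval is 0.5 s


def tx_ceiling_per_block(max_block_cpu_usage: int, min_transaction_cpu_usage: int) -> int:
    """Most transactions that fit in a block if each is billed the minimum CPU."""
    return max_block_cpu_usage // min_transaction_cpu_usage


def tps_ceiling(max_block_cpu_usage: int, min_transaction_cpu_usage: int) -> int:
    """Per-second ceiling implied by the per-block ceiling."""
    per_block = tx_ceiling_per_block(max_block_cpu_usage, min_transaction_cpu_usage)
    return per_block * BLOCKS_PER_SECOND


print(tps_ceiling(200_000, 100))  # default values: 4000
print(tps_ceiling(400_000, 60))   # most aggressive values tested: 13332
```

The default ceiling of 2000 transactions per block, or 4000 TPS, sits just above the 1998 per block and 3996 TPS seen on Day 1, which is consistent with the explanation above. The post does not account for the small gap.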

Other Screenshots and Gifs:

Screenshot from EOSMetal preparing to push 2k × 10,000 transactions using a server with 64 vCPUs and 480 GB of RAM. This did not work, because a push is limited to 1k actions (the maximum number of elements in the array).

Some discussion of the testing in this interview:

Thank you to all the teams who supported these tests with their servers, time, good humor, enthusiasm, curiosity, and ideas.

Say what you will about the governance; EOSIO, the technology, is working brilliantly and has lots of unrealized potential.

Edit 1: Qualified the largest block info — “Though there were forks which occurred around these blocks. We also achieved a completely clean 6977 TPS.”