Turbo-Geth: beta, Constantinople, tests

It has been a while since I posted any updates. “No news is good news”, I would say. You might have seen the article on Turbo-Geth in Coindesk. I am very pleased with our collaboration with the author on this one. Hopefully it made the idea more accessible.

What is the private beta?

As mentioned in the article, I chose to concentrate on a single user to test-drive Turbo-Geth, especially given that this user runs archive nodes. It will be less stressful for me and more productive for the project to give my time and attention to this until we decide that the level of quality is good enough to recommend it to the general public.

Having said that, the software is open-source, so anyone can clone, build and run it at their leisure. If someone finds a bug, I would be grateful for a report. I make no promises at all about looking at other requests. Current branch (rebased just today onto the master branch of go-ethereum): https://github.com/AlexeyAkhunov/go-ethereum/tree/turbo-geth-9.

The default sync mode is “full”, which actually means archive mode, and this is the only mode that works. One flag that will help you sync faster is “trie-cache-gens”. I changed its meaning relative to geth (I was too lazy to introduce a new flag): it now specifies how many trie nodes to keep in memory. The more, the better. If you have 16 GB, you can set it to 10000000 (10 million); if you have 26 GB (like I have on my test machines), you can set it to 20000000 (20 million). You can experiment and watch log lines like this one:

INFO [09-25|11:55:01.273] Memory nodes=2476484 alloc=2807801 sys=8866782 numGC=502

In the line above, there are about 2.5 million nodes in memory, and the process takes 8.45 GB from the operating system (the sys value, which appears to be reported in KB).
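If you want to track these numbers while tuning the flag, the log line can be parsed mechanically. The following is a minimal sketch, not part of Turbo-Geth: the names MemoryStats, parseMemoryLine and kbToGB are made up for illustration, and it assumes that alloc and sys are reported in KB, which is what the 8.45 GB figure above implies.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// MemoryStats holds the four fields of a "Memory" log line.
// Alloc and Sys are ASSUMED to be in KB (inferred from the text, not confirmed).
type MemoryStats struct {
	Nodes uint64 // trie nodes currently kept in memory
	Alloc uint64
	Sys   uint64
	NumGC uint64
}

// parseMemoryLine picks the known key=value pairs out of a log line and
// ignores every other token (level, timestamp, message).
func parseMemoryLine(line string) (MemoryStats, error) {
	var s MemoryStats
	seen := 0
	for _, field := range strings.Fields(line) {
		key, value, ok := strings.Cut(field, "=")
		if !ok {
			continue
		}
		var dst *uint64
		switch key {
		case "nodes":
			dst = &s.Nodes
		case "alloc":
			dst = &s.Alloc
		case "sys":
			dst = &s.Sys
		case "numGC":
			dst = &s.NumGC
		default:
			continue
		}
		n, err := strconv.ParseUint(value, 10, 64)
		if err != nil {
			return s, fmt.Errorf("field %q: %w", field, err)
		}
		*dst = n
		seen++
	}
	if seen != 4 {
		return s, fmt.Errorf("expected 4 memory fields, found %d", seen)
	}
	return s, nil
}

// kbToGB converts KB to GB using 1 GB = 1024*1024 KB.
func kbToGB(kb uint64) float64 { return float64(kb) / (1024 * 1024) }

func main() {
	line := "INFO [09-25|11:55:01.273] Memory nodes=2476484 alloc=2807801 sys=8866782 numGC=502"
	s, err := parseMemoryLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("nodes=%.2f million, sys=%.2f GB\n", float64(s.Nodes)/1e6, kbToGB(s.Sys))
}
```

Comparing the nodes value against the number you passed to the flag shows how full the cache is, and sys shows what that costs in memory.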

All RPC APIs should work, though I only tested some of them against geth (debug_traceTransaction, debug_getModifiedAccounts, debug_storageRangeAt, eth_getTransactionReceipt, eth_getLogs). They were generally faster, except those that retrieve transaction receipts. This is because Turbo-Geth does not keep receipts in the database anymore, but recomputes them on the fly from the historical state by re-executing the transactions in the block. Those calls are sometimes slower, but within a factor of 2. I will keep optimising them, of course.
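To make the receipt trade-off concrete, here is a toy sketch of recomputing receipts by replaying a block against the state just before it. This is not Turbo-Geth code: the types Tx and Receipt and the function recomputeReceipts are hypothetical, balances stand in for the full historical state, and each transaction carries a fixed gas cost instead of being run through the EVM.

```go
package main

import "fmt"

// Tx is a toy value transfer. Gas is the gas the transaction consumes (fixed here).
type Tx struct {
	From, To string
	Value    uint64
	Gas      uint64
}

// Receipt is a toy receipt: nothing is stored, everything is derived by replay.
type Receipt struct {
	TxIndex           int
	Success           bool
	CumulativeGasUsed uint64
}

// recomputeReceipts replays all transactions of a block on a copy of the
// pre-block state and returns the receipts produced along the way.
func recomputeReceipts(preState map[string]uint64, txs []Tx) []Receipt {
	// Work on a copy so the historical state is left untouched.
	state := make(map[string]uint64, len(preState))
	for account, balance := range preState {
		state[account] = balance
	}
	receipts := make([]Receipt, 0, len(txs))
	var cumulative uint64
	for i, tx := range txs {
		ok := state[tx.From] >= tx.Value
		if ok {
			state[tx.From] -= tx.Value
			state[tx.To] += tx.Value
		}
		cumulative += tx.Gas
		receipts = append(receipts, Receipt{TxIndex: i, Success: ok, CumulativeGasUsed: cumulative})
	}
	return receipts
}

func main() {
	pre := map[string]uint64{"a": 10}
	txs := []Tx{
		{From: "a", To: "b", Value: 7, Gas: 21000},
		{From: "a", To: "c", Value: 5, Gas: 21000}, // fails: "a" has only 3 left
		{From: "b", To: "c", Value: 7, Gas: 21000},
	}
	for _, r := range recomputeReceipts(pre, txs) {
		fmt.Printf("%+v\n", r)
	}
}
```

The sketch shows why the cost depends on the block: a receipt for the last transaction needs every earlier transaction in the block to be replayed first, since both the state and the cumulative gas depend on them.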

There is a great deal of optimisation Turbo-Geth can bring to log (event) filtering, because it can very efficiently figure out when a certain contract started to exist (and therefore there is no point in looking into the blocks prior to that). I have not done this yet, and it is part of the private beta exercise.
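The idea can be sketched in a few lines. This is only an illustration of the range narrowing, not an implementation from Turbo-Geth: the function clampFromBlock is hypothetical, and it assumes the block in which the contract was first created (createdAt) has already been looked up.

```go
package main

import "fmt"

// clampFromBlock narrows a log filter range [fromBlock, toBlock] so that the
// scan never starts before the block in which the contract was first created.
// It returns the first block worth scanning, and false if nothing is left to scan.
func clampFromBlock(fromBlock, toBlock, createdAt uint64) (uint64, bool) {
	start := fromBlock
	if createdAt > start {
		start = createdAt
	}
	if start > toBlock {
		return 0, false
	}
	return start, true
}

func main() {
	// A filter over blocks 0..6,000,000 for a contract created at block 4,500,000
	// only needs to scan the last 1.5 million blocks.
	start, ok := clampFromBlock(0, 6000000, 4500000)
	fmt.Println(start, ok)
}
```

For a filter that starts at the genesis block, everything before the creation block is skipped without being read, which is where the saving would come from.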

Constantinople

After the last rebase, Turbo-Geth should in theory have all the code for Constantinople, but it does not. There is a quirk in the new opcode, CREATE2: it allows the creator of a contract to revive the contract at the same address after it has self-destructed. This is known and has been discussed by the core devs, so do not worry. Because of the way Turbo-Geth persists contract storage, it needs special handling for this case, effectively nullifying all of the contract's storage after self-destruction.
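Here is a toy model of the problem and of the "nullify on self-destruct" handling. It is a sketch under a stated assumption, namely that storage items live in one flat table keyed by (address, slot), which is used here only for illustration and is not taken from the Turbo-Geth code. All names (ToyState, Create, SelfDestruct and so on) are hypothetical.

```go
package main

import "fmt"

type Address [20]byte
type Hash [32]byte

// storageKey identifies one storage item in a flat table (ASSUMED layout).
type storageKey struct {
	addr Address
	slot Hash
}

// ToyState keeps accounts and storage in separate flat maps.
type ToyState struct {
	exists  map[Address]bool
	storage map[storageKey]Hash
}

func NewToyState() *ToyState {
	return &ToyState{exists: map[Address]bool{}, storage: map[storageKey]Hash{}}
}

// Create marks a contract as existing; with CREATE2 this can happen again
// at the same address after a self-destruct.
func (s *ToyState) Create(addr Address) { s.exists[addr] = true }

func (s *ToyState) SetStorage(addr Address, slot, value Hash) {
	s.storage[storageKey{addr, slot}] = value
}

// GetStorage returns the zero hash for slots that were never set or were wiped.
func (s *ToyState) GetStorage(addr Address, slot Hash) Hash {
	return s.storage[storageKey{addr, slot}]
}

// SelfDestruct removes the account AND every storage item under its address.
// Without the wipe, a revived contract would see its old storage.
func (s *ToyState) SelfDestruct(addr Address) {
	delete(s.exists, addr)
	for k := range s.storage {
		if k.addr == addr {
			delete(s.storage, k) // deleting during range is safe in Go
		}
	}
}

func main() {
	st := NewToyState()
	var addr Address
	addr[0] = 0xaa
	var slot, value Hash
	slot[31], value[31] = 1, 42

	st.Create(addr)
	st.SetStorage(addr, slot, value)
	st.SelfDestruct(addr)
	st.Create(addr) // revived at the same address, as CREATE2 permits

	fmt.Println("slot is empty after revival:", st.GetStorage(addr, slot) == Hash{})
}
```

In a layout where storage hangs off the account record, deleting the account is enough. In a flat layout like the one assumed here, the storage items outlive the account unless they are removed explicitly, which is the kind of special handling described above.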

Next steps

I initially thought I would pursue some features that would open Turbo-Geth up to more users: pruning (which is now almost trivial to implement) and warp sync (since fast sync is too hard to support with the data layout).

However, I can see that I will get overwhelmed by this and further work unless I start utilising community contribution and opening up the development to other people. Two prerequisites to that are good test harnesses and good documentation.

Therefore, I decided that I will keep Turbo-Geth archive-node-only for the near future (which will still be useful for the early adopters), and concentrate on test harnesses. It is also a good time to be doing testing, since Constantinople is coming soon.