Exploring the Ethereum Blockchain with QuickBlocks Part 2

Exploring the Capabilities of QuickBlocks

Stephen Jude
Sep 1, 2018 · 7 min read

Now that I could compile and run my programs (see Part 1), it was time to learn what the various classes and their methods could do. This article will not go into depth on every class or method. To get an understanding of QuickBlocks' structure and its various classes, see the documentation (some of the examples are outdated since the upgrade to v0.5.5alpha). There are simply too many to discuss here, and frankly my knowledge of them is still very limited.

Before I dive into writing a quick program, I must highlight that the command line tools that come with QuickBlocks are very useful. They allowed me to take a quick look at the data. For example, if you want to know how many transactions are in a certain block, you can run a CLI command before you ever write a program. Take some time to explore the tutorials hidden in the QuickBlocks directories on how to use the CLI commands. They can be quite powerful. The screenshot below shows three such examples. Note the ethName command: this is very useful.

CLI examples

When it came time to write a simple program, I found myself with almost too many options. QuickBlocks offers a ton of them, which can be a bit overwhelming, especially with limited programming skills. Since the software is in alpha, I expected the documentation to be lacking, and it was. Overall, this was probably my biggest frustration with the whole experience, next to how long some programs took to run. But the Slack channel was open and very helpful for a beginner such as myself.

Step 1: I knew that in order to get the speed and efficiencies that QuickBlocks is supposed to provide, I needed to create a cache of the blockchain. QuickBlocks has the ability to take each block and save it in a binary format that makes searching the blocks easier. The code to cache each block in binary format is shown below.

Code to cache each block in binary format

Note that this is a brute force method of caching each block in its entirety. It does not do any optimization such as storing the blocks as Enhanced Bloom Filters. Interpretation: it takes a long time to run this program if you plan to cache a large number of blocks. See Conclusion 1 below.
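To make the idea of a per-block binary cache concrete, here is a minimal sketch in Python. It is not the QuickBlocks code from the screenshot and it does not use QuickBlocks' real on-disk format: the record layout, the file naming, and the function names `write_block` and `read_block` are all assumptions made for illustration. Each block is stored as a small fixed-size header followed by its raw 32-byte transaction hashes.

```python
import struct
from pathlib import Path

# Hypothetical record layout (NOT QuickBlocks' format):
# block number (uint64), timestamp (uint64), transaction count (uint32).
HEADER = struct.Struct("<QQI")


def write_block(cache_dir, block):
    """Serialize one block (a plain dict) to <cache_dir>/<number>.bin."""
    path = Path(cache_dir) / f"{block['number']:09d}.bin"
    with open(path, "wb") as f:
        f.write(HEADER.pack(block["number"], block["timestamp"],
                            len(block["transactions"])))
        for tx_hash in block["transactions"]:
            # Hashes are given as 64 hex characters without a 0x prefix.
            f.write(bytes.fromhex(tx_hash))
    return path


def read_block(path):
    """Read a block back from the binary file written by write_block."""
    with open(path, "rb") as f:
        number, timestamp, count = HEADER.unpack(f.read(HEADER.size))
        transactions = [f.read(32).hex() for _ in range(count)]
    return {"number": number, "timestamp": timestamp,
            "transactions": transactions}
```

Like the brute force approach described above, this writes every block in full, so the time to build the cache grows with the number of blocks you ask for.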

Step 2: After figuring out how to cache the blockchain data, I turned my attention to the bounty that Thomas Jay Rush posted on Gitcoin. I used this as a homework assignment to learn how to use the software. In essence, the task asked you to figure out which transactions involving the Ethereum Tip Jar address were missing from a given list. I figured this would be a good test of my capabilities as well as my understanding of the software.

I determined that the Account Scraper example found on Medium would solve the problem. I had to tweak the example slightly since some of the method names and function calls had changed, but in the end it was a fairly straightforward fix. Here is the updated account scraper code written for the Ethereum Tip Jar address:

Ethereum Tip Jar Scraper Program
Tip Jar Scraper Output Example
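For readers who cannot see the screenshots, the logic of an account scraper can be sketched in a few lines of Python. This is not the QuickBlocks program itself: the block and transaction dictionaries, the function names, and the `TIP_JAR` placeholder address are assumptions for illustration. The sketch walks every transaction in every block, keeps those sent to or from the address of interest, and then reports which of those are absent from a known list, which is what the bounty asked for.

```python
# Placeholder address for illustration only, not the real Tip Jar address.
TIP_JAR = "0x" + "11" * 20


def find_transactions(blocks, address):
    """Return (block number, tx hash) for each tx sent to or from `address`."""
    target = address.lower()
    hits = []
    for block in blocks:
        for tx in block["transactions"]:
            # 'to' is None for contract-creation transactions.
            sender = (tx.get("from") or "").lower()
            recipient = (tx.get("to") or "").lower()
            if target in (sender, recipient):
                hits.append((block["number"], tx["hash"]))
    return hits


def missing_from(known_hashes, hits):
    """Return the hashes in `hits` that do not appear in the known list."""
    known = set(known_hashes)
    return [tx_hash for _, tx_hash in hits if tx_hash not in known]
```

Note that comparing only the top-level sender and recipient will not catch an address that appears solely inside internal calls, so a real scraper would also need trace data from the node.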

This quick overview hopefully demonstrates how to quickly generate blockchain explorer tools with QuickBlocks. However, QuickBlocks has much more powerful tools than are described here. These more advanced tools are worth their own separate post, which I may write in the future when I have a better grasp of them. But as a quick example, QuickBlocks has the ability to create smart contract monitors that use the contract’s ABI to return articulated data about a contract’s transactions. Simply put, instead of reading raw hexadecimal inputs to smart contracts, you can use QuickBlocks to generate a program that will decipher the ugly hexadecimal input data into human-readable function calls! Look for another post in the future detailing this process.
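To give a feel for what "articulated" data means, here is a small Python sketch that turns raw transaction input into a readable call. It is an illustration only, not how QuickBlocks does it: the `SELECTORS` table and the `articulate` function are hypothetical, the table holds a single hard-coded entry (the well-known selector `a9059cbb` for `transfer(address,uint256)`) rather than being generated from an ABI, and only the fixed-size types `address` and `uint256` are handled.

```python
# 4-byte function selector -> (function name, argument types).
# A real tool would build this table from the contract's ABI.
SELECTORS = {
    "a9059cbb": ("transfer", ["address", "uint256"]),
}


def articulate(input_hex):
    """Decode transaction input into 'name(arg, ...)', or None if unknown."""
    data = input_hex[2:] if input_hex.startswith("0x") else input_hex
    selector, body = data[:8], data[8:]
    if selector not in SELECTORS:
        return None
    name, types = SELECTORS[selector]
    args = []
    for i, typ in enumerate(types):
        word = body[i * 64:(i + 1) * 64]  # each argument is one 32-byte word
        if typ == "address":
            args.append("0x" + word[24:])  # an address is the last 20 bytes
        elif typ == "uint256":
            args.append(int(word, 16))
    return f"{name}({', '.join(str(a) for a in args)})"
```

Given the input of a token transfer, `articulate` returns a string of the form `transfer(0x..., 1000)` instead of a wall of hexadecimal.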

Conclusions and Lessons Learned

Lesson Learned #1: Running a full archive node and searching its data is extremely computationally and hardware intensive! Through a lack of understanding on my part, I only ran a “tracing” Parity node, not realizing that if I wanted the full blockchain history I would need to run a tracing and archiving node. This was somewhat fortuitous because a tracing node only requires about 140 GB of disk space while a full archive node takes up a whopping 1.4 TB at the time of this writing. That would have overwhelmed the 1 TB SSD in my laptop.

For those who want a full history of the blockchain, you are going to need a 2 TB SSD at a minimum (normal HDDs are too slow and you will never sync the blockchain). More likely you will need a larger SSD as the blockchain continues to grow in size. There are external SSDs this size that you can purchase, but they aren’t cheap. It raises the question: what do you do when Ethereum decides to shard and there are over a thousand chains of this size?

Also, caching every block took a long time. My laptop has a 2.8 GHz Intel Core i7 processor with 16 GB of 2133 MHz DDR3 memory. It took multiple days to cache the entire blockchain. Running the Tip Jar Scraper program took multiple days as well. My guess is that my program executes a brute force search of each block for the address of interest, and this severely lengthens the time to run the program. This leads to Lesson Learned #2.

Lesson Learned #2: The open source features of QuickBlocks hint at where the real efficiencies and speedups are to be found. The documentation talks about an application called blockScrape which optimizes how each block is stored on the local device through algorithms called Enhanced Bloom Filters (plus other optimizations; see their White Paper). This should drastically cut the time it takes to run a program like the Tip Jar Scraper. These features are not part of the open source release yet. Hopefully they become open source in the future, and I am curious to see how much faster these programs can run!
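To show why a Bloom filter could help here, below is a minimal Python sketch of an ordinary Bloom filter used to decide which blocks are worth opening. This is a textbook filter, not the Enhanced Bloom Filter that blockScrape uses, and the class, the function names, and the size parameters are assumptions chosen for illustration. The filter can say "definitely not in this block" or "possibly in this block", so a scraper only has to fully search the blocks that pass the check.

```python
import hashlib


class BloomFilter:
    """A plain Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=2048, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive several bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def build_filter(block):
    """Record every sender and recipient address that appears in a block."""
    bloom = BloomFilter()
    for tx in block["transactions"]:
        for addr in (tx.get("from"), tx.get("to")):
            if addr:
                bloom.add(addr.lower())
    return bloom


def candidate_blocks(blocks, filters, address):
    """Return only the blocks whose filter says the address may be present."""
    target = address.lower()
    return [b for b, f in zip(blocks, filters) if f.might_contain(target)]
```

The filters are built once, when the blocks are cached, and are far smaller than the blocks themselves. A search for one address then touches only the candidate blocks, at the cost of occasionally opening a block that turns out not to contain the address.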

Lesson Learned #3: Running your own node and doing very simple data exploration is extremely valuable. Sometimes it is tough to figure out which questions to even ask when starting out with a data analysis project. Spending time exploring the raw data in different ways may reveal something previously unknown and show you which questions you should be asking.

Take for example the Tip Jar Scraper. I decided to put a breakpoint in the for loop to see what caused the program to take so long. By doing this, I noticed that some blocks were processed extremely fast while other blocks took much longer. I had no sense of what could be causing this discrepancy, so I decided to look at the data of each block before it was searched for the Tip Jar address. I noticed a couple of things immediately.

The first pattern was that long search times went with low transaction counts. Blocks with over 60 transactions would process extremely fast while blocks with a measly 4 transactions would take forever (ok, maybe not forever, but dozens of seconds at least). The second correlation was that these blocks were typically near the gas limit. Third, the block numbers corresponded to the time when the blockchain was being spam attacked. A quick check on a couple of the addresses associated with the transactions confirmed that some of those transactions were indeed associated with the spam attacker. Interesting! In a roundabout way I managed to learn a little about how the attacker was able to spam the network! While the slow blocks were a detriment to my ultimate goal, I can now modify my program to skip searching transactions that are associated with the spam attacker and hopefully gain some efficiency and speed.
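The two ideas above, measuring how long each block takes and skipping transactions tied to known spam addresses, can be sketched in Python as follows. This is not a change to the actual QuickBlocks program: the function name `scan_with_timing`, its parameters, and the dictionary layout of blocks and transactions are assumptions for illustration, and the skip list has to be supplied by whoever runs it.

```python
import time


def scan_with_timing(blocks, process_tx, skip_addresses=frozenset()):
    """Process each block's transactions, skipping listed addresses.

    Returns (seconds, block number, tx count) tuples, slowest block first.
    """
    skip = {a.lower() for a in skip_addresses}
    timings = []
    for block in blocks:
        start = time.perf_counter()
        for tx in block["transactions"]:
            parties = {(tx.get("from") or "").lower(),
                       (tx.get("to") or "").lower()}
            if parties & skip:
                continue  # transaction involves an address on the skip list
            process_tx(block, tx)
        elapsed = time.perf_counter() - start
        timings.append((elapsed, block["number"], len(block["transactions"])))
    return sorted(timings, reverse=True)
```

Sorting by elapsed time puts the slowest blocks at the top, which is a quicker way to spot outliers like the 4-transaction blocks than stepping through a breakpoint. One caveat: skipping by address would also hide any skipped transaction that happens to involve the address you are searching for.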

Closing Remarks

I had a lot of fun learning a little about how a blockchain like Ethereum truly works. I have a greater appreciation for the engineering and design choices that go into something as complex as a “world computer.” QuickBlocks shows great promise in empowering those who need actionable data.

As a note, Thomas Jay Rush asked if I’d be willing to document my experiences via this post in exchange for a Gitcoin bounty. I agreed and took on the task of describing my experience with extreme candor. Throughout this journey, I often attributed any frustrations to a lack of experience on my own part. A more seasoned developer will have a much different experience and different opinions than mine.

Coinmonks

Coinmonks is a non-profit Crypto educational publication. Follow us on Twitter @coinmonks Our other project — https://coincodecap.com
