Misadventures in React with Ethereum
Oh GOD I hate JavaScript! The first language I started with, 25 years ago, was BASIC on the Commodore Plus/4, and then I upgraded to an Amstrad CPC 6128. At college I learnt Pascal and Assembly, and finally C and C++ at university.
I say this because I’m a child of the 80s and 90s, when it was all about procedural and OO languages, and that is how I intrinsically learnt to write software. Over the years I’ve picked up Python, Perl and Ruby (big fan of Ruby on Rails), to name a few.
I dipped my toe in the water of JavaScript (proper JavaScript, not some crappy JS in an HTML page as I did in the 90s/Noughties) when I started writing an ETL tool for my Machine Learning work. I pretty much made every one of The Top 10 Most Common Mistakes That Node.js Developers Make, and added a few of my own.
I suck as a programmer, but I love to make things, fix things and push machines as far as they go; so I hack at code, I copy and paste and I Get Shit Done. I am Mr Technical Debt and I’m ashamed of it. I have been for about 20 years, as deep down I wanted to be a leet coder, but my mind doesn’t work like that. My professional career has mostly been at the intersection of developers and business as an Architect, and now I’m a Technical Lead for Open Source at Microsoft. I’m a Systems guy: I look at the big picture when I design, but I understand how things work under the covers.
I highlight this as it is something that eats away at me. I see beautiful code (I know what beauty in code looks like), and I want to be able to think like that and do like that. I can talk with developers, and we can design great systems together but their skill has eluded me.
So I’ve taken something I hacked together a few weeks ago as inline JavaScript, plus some ETL, and I came up with a visualisation of the Ethereum blockchain.
I was impressed by my few hours of work and thought that this was the perfect side project to turn into a service: not just Get Shit Done, but do it right.
I first happened upon Angular and was crying internally and externally at how totally alien the whole thing was to me. I asked a few very competent developers whether I should be looking at Angular, Vue or React to crack this nut, and apart from a few “Vue is the future” comments, the resounding response was React. I liked this answer as it also meant I could use React Native if I wanted to code this up for mobile too (in a past life, I used Appcelerator for my iOS and Android apps).
At Microsoft we are given a subscription to Safari Books Online, and I started reading as much as I could on React. The O’Reilly book Learning React is the one that has helped the most so far, as it pushes you very early on to understand Functional Programming.
I have to admit that when I heard people talking about FP before, I thought they were talking about procedural programming and things. I now realise quite how wrong I was, and I am still totally confused about the “fat arrow” syntax, about why FP is better, and, well, about so many things in the whole situation. But I GET that it will help me be a better programmer, and I sense that if I can get my head around this, I will be a better software designer. No longer shall I copy and paste, no longer shall I be ok with Getting Shit Done.
I get that at a high level, much the same way that Angular does it, React allows me to componentise my App so that I have discrete elements to work on.
I appreciated that this was valuable, as my initial work on the Ethereum Visualiser turned into a mess of JavaScript function spaghetti very quickly and was so buggy that I was reluctant to push it to my GitHub repo, but I did. You can see the next iteration of it “live” at http://inkl.in. I say “live” as it is the same code, with a different Force Graph visualisation reading in JSON files for each Ethereum block. There is nothing dynamic, and it isn’t very useful apart from looking kind of pretty.
When I worked at Bloomberg, I saw first hand how turning Data into Information was key: Bloomberg has access to all of that Market Data from global exchanges, and makes it valuable to a huge audience of people.
And here we have Blockchain, where the entire chain is readable and Open Source. This piqued my interest. What Information can you get from the Data that is useful?
So I have now embarked on using React to make a visual and interactive Ethereum explorer. I’ve learnt how to programmatically parse Ethereum data and Contracts, and how to clean the data so that it becomes Reference data that I can query for some use (although we shall see how useful it will be later).
The (very rough) flow of the data is…
As Julie Andrews would say: let’s start at the very beginning.
To be able to query the Ethereum blockchain you need to download the database itself. It’s relatively simple: install either Geth or Parity onto your machine.
Parity by default will attempt to do a Fast sync of the blockchain (at the time of writing there are well over 5 million blocks in the database). You should be able to get a good chunk of the database locally in a few hours. It can do this because your machine does not need to verify all previously verified blocks. It does slow down after it has downloaded the DB snapshot, and it looks like it then starts verifying the blocks it receives. All in, it took about 24 hours to get up to date locally.
Both Geth and Parity have an RPC interface that can be accessed via a Web3 interface in various languages. I have to say the documentation is pretty good (with examples) to start querying the blockchain.
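Under the Web3 wrappers, both clients speak JSON-RPC over HTTP. As a minimal sketch of what that looks like, the snippet below builds the request body for the `eth_blockNumber` method and decodes a hex-encoded answer. The helper names `rpcRequest` and `hexToNumber` are hypothetical, the endpoint `http://localhost:8545` is an assumption about the local setup, and the sample response value is made up.

```javascript
// Build a JSON-RPC 2.0 request body for an Ethereum client.
// rpcRequest and hexToNumber are hypothetical helpers, not part of Web3.
function rpcRequest(method, params = [], id = 1) {
  return { jsonrpc: '2.0', method, params, id };
}

// Ethereum JSON-RPC encodes quantities as 0x-prefixed hex strings.
function hexToNumber(hex) {
  return parseInt(hex, 16);
}

// The body you would POST to the node (assumed here: http://localhost:8545).
const body = JSON.stringify(rpcRequest('eth_blockNumber'));

// A response has this shape; the result value is invented for illustration.
const sampleResponse = { jsonrpc: '2.0', id: 1, result: '0x4c4b40' };
const latestBlock = hexToNumber(sampleResponse.result);

console.log(body);
console.log(latestBlock);
```

A Web3 library does this encoding and decoding for you, which is why the wrapped calls read like plain properties.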
I started messing around with the JavaScript interface for the YouTube clip ^up there, but when I realised I needed to get the data into a format that could be queried, I started working with the Ruby Web3 Gem for bulk imports.
For my local developer instance, I’ll use Mongo installed on my Mac, and I’ll use Mongoid for the interface from Ruby.
I initially started the exercise by structuring the data within Mongo as:
block: {
  block_id: NUMBER,
  transactions: [tx1, tx2, tx3]
}
But I realised that the complexity and inefficiency of querying an array within a document was not going to let me search efficiently by Contract or Token on the Blockchain, which I saw as a key data point for gleaning something useful.
The format of the data now is a 1:1 mapping of an Ethereum Transaction to a Document within the collection.
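To illustrate the 1:1 mapping, here is a rough sketch of flattening one block into one document per transaction. The document field names (`_id`, `block_id` and so on) are assumptions for illustration, not the schema actually used, and the sample block is made up; only the input fields (`number`, `timestamp`, `transactions`, `hash`, `from`, `to`, `value`, `input`) follow the shape an Ethereum client returns.

```javascript
// Turn one block (with full transaction objects) into one document per
// transaction. blockToDocuments and the output field names are hypothetical.
function blockToDocuments(block) {
  return block.transactions.map((tx) => ({
    _id: tx.hash,               // the transaction hash is unique, so it can be the key
    block_id: block.number,     // keep a link back to the parent block
    timestamp: block.timestamp,
    from: tx.from,
    to: tx.to,                  // null for contract-creation transactions
    value: tx.value,
    input: tx.input,
  }));
}

// Made-up sample data for illustration only.
const sampleBlock = {
  number: 5000000,
  timestamp: 1517319693,
  transactions: [
    { hash: '0xaaa', from: '0x111', to: '0x222', value: '1000', input: '0x' },
    { hash: '0xbbb', from: '0x333', to: null, value: '0', input: '0x6060' },
  ],
};

const docs = blockToDocuments(sampleBlock);
console.log(docs.length);
```

With one document per transaction, a search by Contract becomes a plain query on a top-level field such as `to`, rather than a search inside an embedded array.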
The code should be relatively self-explanatory, maybe apart from the syncing check.
When the Ethereum client is attached to the network, it will attempt to sync the database locally. While this is happening, you really can only read and import data that is local; you can’t query the network itself. Because I didn’t want to wait, I would ask the local RPC endpoint for the latest block that has been synced, with:
web3.eth.syncing
When this is in Production, the assumption is that it will be fully synced, and only receiving new Blocks as they appear on the network.
This is when we need to call
web3.eth.blockNumber
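The two calls above can be combined into a single check. This is a minimal sketch, not the import script itself: `latestImportableBlock` is a hypothetical helper, and it assumes the syncing value is either `false` (fully synced) or an object carrying `currentBlock`, which is the shape `web3.eth.syncing` reports.

```javascript
// Pick the highest block that can be imported from the local node.
// syncing:     false when fully synced, otherwise an object with currentBlock
// blockNumber: the value of web3.eth.blockNumber
// latestImportableBlock is a hypothetical helper name.
function latestImportableBlock(syncing, blockNumber) {
  if (syncing && typeof syncing === 'object') {
    // Number() also accepts 0x-prefixed hex strings, as raw JSON-RPC returns.
    return Number(syncing.currentBlock);
  }
  return blockNumber;
}

// Still syncing: only import up to what is held locally.
console.log(latestImportableBlock({ currentBlock: 1200000, highestBlock: 5000000 }, 0));

// Fully synced: fall back to the plain block number.
console.log(latestImportableBlock(false, 5000000));
```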
In my design, the import script will actually be sitting within a Container on a Kubernetes cluster (as will Parity, running as a Service), and it will loop to ingest any new data. I get that I could have subscribed to events on new Blocks, but I’m not looking for realtime data; I’m ok with near-realtime and some resilience to make sure I have all the data. A one-minute delay is not going to cause any Data => Information quality issues.
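A polling loop of that kind could look roughly like the sketch below. Every name in it is hypothetical, and `getLatest` and `importBlock` are injected functions standing in for the client call and the database write, so this shows the shape of the loop rather than the real import script.

```javascript
// Work out which blocks still need importing, capped to a batch size so a
// single pass never runs for too long.
function blocksToImport(lastImported, latest, maxBatch = 100) {
  const end = Math.min(latest, lastImported + maxBatch);
  const blocks = [];
  for (let n = lastImported + 1; n <= end; n += 1) blocks.push(n);
  return blocks;
}

// One pass: ask for the latest block, import whatever is missing, and move
// the high-water mark forward only after each successful import.
async function pollOnce(state, getLatest, importBlock) {
  const latest = await getLatest();
  for (const n of blocksToImport(state.lastImported, latest)) {
    await importBlock(n);
    state.lastImported = n;
  }
}

// Repeat a pass every intervalMs (one minute by default). The next pass is
// scheduled only after the current one ends, so passes never overlap.
function startPolling(state, getLatest, importBlock, intervalMs = 60000) {
  const tick = async () => {
    try {
      await pollOnce(state, getLatest, importBlock);
    } catch (err) {
      console.error('import pass failed, retrying on the next tick', err);
    }
    setTimeout(tick, intervalMs);
  };
  tick();
}

console.log(blocksToImport(5, 8));
```

Because the mark only advances after a successful import, a failed pass picks up again from the first block that did not make it in, which is where the resilience comes from.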
I now have a lot of data in a queryable format — there is still a lot that has to be done to make sense of this, and that is the journey I’ll be documenting over the next few weeks.
Oh, and at last count (I still don’t have all data ingested), I’m looking at upwards of 150,000,000 documents beating the crap out of my poor Mac SSD. But that’s enough for a Dev environment for now.