It has been a long journey from the first episodes to here. We now have everything we need, and this time will be the last shot at making clear how Ethereum data is organized in practice.
In my opinion, practising with examples is the best way to approach and dig deep into any problem. I truly hope this example helps you clearly understand the data structures in Ethereum.
You can clone my git repo (develop branch) for convenience if you don’t want to get your hands dirty.
- Full node: Geth
- Network: Ropsten testnet
- Subject of the study: state trie (stateRoot)
Following this link, you will learn how to set up geth as a full node on your computer.
To start full sync mode on the Ropsten testnet and open RPC, you can use this command:
geth --testnet --datadir "~/Library/Ethereum/ropsten" --rpc --rpcapi "eth,net,personal,web3" --rpcaddr "0.0.0.0" --rpccorsdomain "*" --ws --wsapi "eth,net,personal,web3" --wsorigins "0.0.0.0"
Remember to change the path after the --datadir flag to one that is available on your machine.
Because we run geth in full sync mode, geth needs time to sync the entire blockchain data (it took pretty long 😂 in my case, around 3 days). When you see logs like that, I’m pretty sure it is done.
Web3 — Testing geth
Refer to this link for web3:
First of all, we need to create a Node.js project and then install the web3 package.
In the getStateRoot function, blockNumber is a block number close to the newest one on Ropsten. We avoid the newest block itself because our node's sync may lag behind it, so a slightly older block number is a wise choice. My choice is 2596315, and it may be different by the time you read this article. Be careful.
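As a rough illustration of what such a getStateRoot call does, here is a minimal sketch that talks to the node over raw JSON-RPC using only Node's built-in http module instead of web3. The helper name buildGetBlockRequest and the default URL http://localhost:8545 are my own assumptions for the example; adjust them to your setup.

```javascript
const http = require('http');

// Build the JSON-RPC body for eth_getBlockByNumber.
// The block number must be sent as a 0x-prefixed hex string;
// `false` asks for transaction hashes only, not full transactions.
function buildGetBlockRequest(blockNumber) {
  return {
    jsonrpc: '2.0',
    id: 1,
    method: 'eth_getBlockByNumber',
    params: ['0x' + blockNumber.toString(16), false],
  };
}

// Ask the node for a block and resolve with its stateRoot.
// rpcUrl is an assumption: geth's default HTTP-RPC port is 8545.
function getStateRoot(blockNumber, rpcUrl = 'http://localhost:8545') {
  const body = JSON.stringify(buildGetBlockRequest(blockNumber));
  const options = {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(body),
    },
  };
  return new Promise((resolve, reject) => {
    const req = http.request(rpcUrl, options, (res) => {
      let data = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { data += chunk; });
      res.on('end', () => {
        try {
          const reply = JSON.parse(data);
          if (reply.error) return reject(new Error(reply.error.message));
          // A null result usually means the node has not synced this block yet.
          if (!reply.result) return reject(new Error('block not found'));
          resolve(reply.result.stateRoot);
        } catch (err) {
          reject(err);
        }
      });
    });
    req.on('error', reject);
    req.end(body);
  });
}

// Usage (needs a running node with RPC enabled):
// getStateRoot(2596315).then(console.log).catch(console.error);
```

With web3 the same request is hidden behind a single call, but the request and the stateRoot field of the reply are the same.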
My result of running
At this point, we can be sure that our full node works correctly.
leveldb - LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string…
One thing to be careful about with LevelDB is that it allows only one connection at a time. Thus, we need to stop geth after the full sync before moving on to the next steps.
To create a connection from Node.js, we will use 2 packages, one of which is leveldown. Please install them first.
Create a connection:
Here, I connected to my LevelDB using a path that points to my chaindata folder (this path depends on your config when starting geth). I then made the connection global so it can be used afterward.
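Since only one connection can hold the database at a time, it helps to open it once and reuse the handle everywhere. The sketch below shows one way to do that. It assumes the second package is levelup and that its API is levelup(leveldown(path)), as in levelup 2.x and later. The names makeDbGetter and openChainData, and the path in the usage comment, are hypothetical.

```javascript
// Returns a getter that opens the database on first use
// and hands back the same handle on every later call.
function makeDbGetter(open) {
  let db = null;
  return function getDb() {
    if (db === null) {
      db = open();
    }
    return db;
  };
}

// Assumed opener: levelup wrapping leveldown (levelup >= 2.x style).
// The packages are required lazily so that nothing is loaded
// until a connection is actually needed.
function openChainData(chaindataPath) {
  const levelup = require('levelup');     // assumption: not named in the text above
  const leveldown = require('leveldown');
  return levelup(leveldown(chaindataPath));
}

// Usage (hypothetical path, geth must be stopped first):
// const getDb = makeDbGetter(() => openChainData('/path/to/geth/chaindata'));
// getDb().get(key, (err, value) => { /* value is a Buffer */ });
```

Keeping the handle behind a getter plays the same role as the global variable described above, without opening the database a second time by accident.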
First diving into database
In the Web3 part, I got the stateRoot of block number 2596315. Because it came from web3, we can trust that result.
Now, we warm up by reading the stateRoot from the block header of a specific block number, and then we compare it to the result from the Web3 part.
Install the ethereumjs-block module first; we need it to parse block data.
For the utils library, please take a look at my repo to get the source code. The path is
First, we convert 2596315 to hex and pad it with 0s on the left so that the total length is 16 hex characters (8 bytes). Note that everything we do here is in hex.
hexBlockNumber = 00 00 00 00 00 27 9d db
Geth uses h as the prefix and n as the suffix.
prefix = 68
suffix = 6e
And then, we concatenate all of them in sequence.
keyString = prefix + hexBlockNumber + suffix = 68 00 00 00 00 00 27 9d db 6e
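The key construction above can be sketched in a few lines of Node.js using only Buffer. The function names encodeBlockNumber and buildKey are mine, chosen for illustration.

```javascript
const HEADER_PREFIX = Buffer.from('h'); // 0x68
const HASH_SUFFIX = Buffer.from('n');   // 0x6e

// Encode a block number as 8 bytes, big-endian,
// by left-padding its hex form to 16 characters.
function encodeBlockNumber(blockNumber) {
  if (!Number.isSafeInteger(blockNumber) || blockNumber < 0) {
    throw new RangeError('blockNumber must be a non-negative safe integer');
  }
  const hex = blockNumber.toString(16).padStart(16, '0');
  return Buffer.from(hex, 'hex');
}

// prefix + hexBlockNumber + suffix, as raw bytes.
function buildKey(blockNumber) {
  return Buffer.concat([
    HEADER_PREFIX,
    encodeBlockNumber(blockNumber),
    HASH_SUFFIX,
  ]);
}

console.log(buildKey(2596315).toString('hex')); // 680000000000279ddb6e
```

The key is passed to LevelDB as a Buffer, not as a hex string. As far as I know, in geth's schema the value stored under this key is the canonical block hash for that number, and the header itself is stored under h + block number + block hash, so reading a header takes two lookups.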
Here is the result:
As we can see, the final result is the same as the result in the web3 part.
Congratulations!!! We have taken our first dive into a real 💩
We will use the rlp module, so let’s install it.
Now, we start creating a trie library that takes an Ethereum address and parses all the info saved for it in the state trie.
In the getInfoByAddress function, we use merkle-patricia-tree to create a trie from the given root, then we get the data of an address from this trie. Remember that all data was encoded with rlp before being saved, so to read it we need to decode it.
This is the complete example:
The data of an address contains 4 fields. In order, they are nonce, balance, storageRoot and codeHash.
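To make the decoding step concrete, here is a minimal sketch that turns the raw value read from the state trie into those four fields. To keep it self-contained, it uses a small hand-written RLP decoder in place of the rlp module; the names rlpDecode and decodeAccount are my own, and the decoder skips the strictness checks a production library would perform.

```javascript
// Slice `len` bytes starting at `start`, failing loudly on truncated input.
function take(buf, start, len) {
  if (buf.length < start + len) throw new Error('truncated RLP input');
  return buf.slice(start, start + len);
}

// Read a big-endian length stored in `lenOfLen` bytes after the first byte.
function readLength(buf, lenOfLen) {
  return parseInt(take(buf, 1, lenOfLen).toString('hex'), 16);
}

// Decode one RLP item from the front of `buf`; returns [item, remainingBytes].
function decodeItem(buf) {
  if (buf.length === 0) throw new Error('empty RLP input');
  const first = buf[0];
  if (first <= 0x7f) return [buf.slice(0, 1), buf.slice(1)]; // single byte
  if (first <= 0xb7) {                                       // short string
    const len = first - 0x80;
    return [take(buf, 1, len), buf.slice(1 + len)];
  }
  if (first <= 0xbf) {                                       // long string
    const ll = first - 0xb7;
    const len = readLength(buf, ll);
    return [take(buf, 1 + ll, len), buf.slice(1 + ll + len)];
  }
  // Lists: short form (0xc0..0xf7) or long form (0xf8..0xff).
  const ll = first <= 0xf7 ? 0 : first - 0xf7;
  const len = ll === 0 ? first - 0xc0 : readLength(buf, ll);
  let payload = take(buf, 1 + ll, len);
  const items = [];
  while (payload.length > 0) {
    const [item, rest] = decodeItem(payload);
    items.push(item);
    payload = rest;
  }
  return [items, buf.slice(1 + ll + len)];
}

function rlpDecode(input) {
  const [item, rest] = decodeItem(Buffer.from(input));
  if (rest.length !== 0) throw new Error('trailing bytes after RLP item');
  return item;
}

// An empty byte string encodes the integer zero.
function toBigInt(bytes) {
  return bytes.length === 0 ? 0n : BigInt('0x' + bytes.toString('hex'));
}

// Decode an account value into its four fields, in order.
function decodeAccount(raw) {
  const fields = rlpDecode(raw);
  if (!Array.isArray(fields) || fields.length !== 4) {
    throw new Error('expected an RLP list of 4 items');
  }
  const [nonce, balance, storageRoot, codeHash] = fields;
  return {
    nonce: toBigInt(nonce),
    balance: toBigInt(balance), // in wei
    storageRoot: '0x' + storageRoot.toString('hex'),
    codeHash: '0x' + codeHash.toString('hex'),
  };
}
```

Here decodeAccount(value) would be called on the Buffer returned by the trie lookup. BigInt is used because balances in wei easily exceed the safe integer range of a JavaScript number.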
This is not the end of this series; there will be something about tree pruning. But it may only be shared in the future, because my knowledge about it is still limited.
Maybe sad when you hear that :)))
LevelDB in Geth, key and values
When parsing through the levelDB or RocksDB (Depending on the client you are using) there are string values…