HP encoding in Golang.

Data structure in Ethereum | Episode 1+: Compact (Hex-prefix) encoding.

Coinmonks
Published in
4 min readJan 9, 2018

--

In the episode 1, we discussed RLP Encoding/Decoding, however Ethereum still has another encoding called Compact encoding or Hex-prefix (HP) encoding. So, what are the differences between RLP and HP encoding? and How does HP encoding work? This article will help us to get it.

Moreover, I am going to explain some confused terminologies that really helpful for understanding this article, also the next episode about the trie in Ethereum.

Some terminologies

Before we go to HP encoding, I will try to introduce some terminologies in advance. However, this part relates to Trie in Ethereum and perhaps you are not familiar with it. Don’t worry about that, I will try to skip that things and let us approach it by an intuitional way.

Let’s play a game. I will give you a map and a path, you have to find the animal which I am implying.

The path is: 3–2–3

The map of animals

In order to solve this game, we do step by step following:

  1. Start at Root and the first element in path is 3, it means we must get the element of Root which labeled by number 3 and we can see that the value of that element is A.
  2. Find node A, the next element in path is 2 so we get the value in node A is D.
  3. Find node D, the last element in path is 3, obviously the value we find out is ‘Whale’.

So, the animal of the game is whale. Let’s try another path: 3–1–3–2 (the result is ‘Chicken’).

By this game we get familiar with some terminologies:

  • Key: Root, A, B, C, D and E in game are keys
  • Node: the content corresponding with the key in the right part of each row. For example: {1: ‘Dog’, 2: B, 3: A}.
  • Path: path is like 2–2–3 in game.
  • Value: you can see it has some elements in all nodes, every element is a key-value pair. Value is the right part of element, value can be a key or a name of animal.
  • Nibble: hex form of 4 bits is a nibble. For example: 0x1, 0x4, 0xf

Differentiate RLD from HP encoding

It’s really easy to differentiate RLP from HP encoding, RLP is used for encoding/decoding Value and HP encoding is used for encoding/decoding Path.

HP encoding goals

We will meet 2 more terminologies which are leaf and extension in this part, but we can blur them and discuss the details in the episode 3.

Leaf and extension are 2 kinds of node, however path of leaf has terminator and extension does not. Terminator is the last byte of the path and has value of 16 in dec or 0x10 in hex.

As we can see, it’s possible to the path has odd length but odd length is not friendly to machines. So we have to convert all odd-length paths to even-length paths.

In summary, HP encoding goals are:

  1. Distinguish leaf and extension from each other without terminator.
  2. Convert the path to even length.

HP encoding specification

For now, we call the path inputed in HP encoding as input for convenience.

  • If input has terminator, remove terminator from input.
  • Create the prefix into input which has value as following table:
Hex-prefix table
  • If the prefix is 0x0 or 0x2 , add a padding nibble 0 follow the prefix, so the prefix is 0x00 and 0x20 . The main reason to do that is we are trying to maintain the even-length attribute of the path. Convince yourself :).
  • Add prefix to path.

For example:

Examples of HP encoding

Conclusion

Now, I believe that we got all of weapons to make a battle with Trie in Ethereum. In the next episode, we will get deep into Trie that is the most excited part I can wait to discuss with you.

References

Wiki Ethereum

HP encoding on original Golang code:

HP encoding on my github with some comments of explanation.

Get Best Software Deals Directly In Your Inbox

--

--

Phan Sơn Tự
Coinmonks

A lucky guy was born in the Age of Cryptocurrency Boom