Should I use X-buffers to Serialize Data?

X-Buffers? What do I mean? Easy: FlatBuffers and Protocol Buffers (this is not an official term, it is just the way I decided to group the two).

Andrés Pérez
CodeX
7 min read · Apr 13, 2021

Photo by Kevin Ku on Unsplash

Every time a new technology or framework appears, it is tempting to assume it is better in every respect than its predecessors simply because it is newer. However, it is important to understand which problem it aims to solve and whether it is good enough in all contexts. That is why I decided to write this post: to find out whether X-Buffers are really better than JSON.

To determine whether using X-Buffers is a good idea, I will run a Golang benchmark across different scenarios. This is the schema used for the comparisons (full schema here).

If you are new to Protocol Buffers or FlatBuffers, you can take a look at these resources before reading further.

Experiments

Now let's proceed with some benchmarking. I will use 3 levels of complexity so we can see how each serialization format behaves under different levels of stress:

  • Level 1, The customer object with 1 address.
  • Level 2, The customer object with 10 addresses, 10 friends and each friend contains 10 addresses.
  • Level 3, The customer object with 100 addresses, 200 friends and each friend contains 100 addresses.

All tests were run on a machine with an Intel(R) Core(TM) i5-6267U CPU @ 2.90GHz and Golang 1.16.2; the results might vary in other languages or on other machines.
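To give an idea of how such a measurement can be taken, here is a minimal sketch that benchmarks only `encoding/json` with the standard `testing` package. The `Customer` and `Address` types and the `newCustomer` helper are hypothetical stand-ins, not the schema used in this post, and the numbers it prints will not match the charts below.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Address and Customer are hypothetical stand-ins for the article's schema.
type Address struct {
	Street string `json:"street"`
	City   string `json:"city"`
}

type Customer struct {
	Name      string     `json:"name"`
	Addresses []Address  `json:"addresses"`
	Friends   []Customer `json:"friends"`
}

// makeAddresses returns n placeholder addresses.
func makeAddresses(n int) []Address {
	out := make([]Address, n)
	for i := range out {
		out[i] = Address{Street: fmt.Sprintf("%d Main St", i), City: "Springfield"}
	}
	return out
}

// newCustomer builds a customer with the given number of addresses and
// friends; every friend gets friendAddrs addresses.
func newCustomer(addrs, friends, friendAddrs int) Customer {
	c := Customer{Name: "customer", Addresses: makeAddresses(addrs)}
	for i := 0; i < friends; i++ {
		c.Friends = append(c.Friends, Customer{
			Name:      fmt.Sprintf("friend-%d", i),
			Addresses: makeAddresses(friendAddrs),
		})
	}
	return c
}

// benchMarshalJSON measures json.Marshal for one customer, including
// allocations per operation.
func benchMarshalJSON(c Customer) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := json.Marshal(c); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	levels := []struct {
		name                        string
		addrs, friends, friendAddrs int
	}{
		{"level 1", 1, 0, 0},
		{"level 2", 10, 10, 10},
		{"level 3", 100, 200, 100},
	}
	for _, l := range levels {
		r := benchMarshalJSON(newCustomer(l.addrs, l.friends, l.friendAddrs))
		fmt.Printf("%s: %d ns/op, %d B/op, %d allocs/op\n",
			l.name, r.NsPerOp(), r.AllocedBytesPerOp(), r.AllocsPerOp())
	}
}
```

The same loop could wrap a Protocol Buffers or FlatBuffers encoder to compare time, bytes and allocations per operation side by side.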

Level 1 Marshaling metrics

Marshal level 1

I have to admit that the first time I ran these tests I had to revisit the whole implementation, because I had the misconception that FlatBuffers was much better in every respect than JSON and Protocol Buffers. As you can see, that is not the case for our simplest scenario.

Level 2 Marshaling metrics

Marshal level 2

In this scenario JSON is no longer the winner in any category, and Protocol Buffers narrowed the gap with FlatBuffers in time spent. In conclusion, Protocol Buffers looks like our best option for this case. Now let's contrast this behavior with a more extreme scenario.

Level 3 Marshaling metrics

Marshal level 3

Remember that this time I used 100 addresses, 200 friends, and 100 addresses per friend, which is an uncommon use case but a possible one; I once came across objects this big in a project.

For this extreme case JSON is the worst in every respect, FlatBuffers' speed advantage over Protocol Buffers is even larger than in the previous level, and the trend in size is consistent with the other tests.

Level 1 Unmarshalling stats

Unmarshal Level 1

Yes, the picture is right: FlatBuffers makes no allocations when parsing binary data into the object, and the time is almost 0 (2 ns). Protocol Buffers takes 0.7 KB and JSON 1 KB in memory allocations; in terms of time, Protocol Buffers spent around 0.001 ms and JSON 0.01 ms. Let's see what we have next.

Level 2 Unmarshalling stats

Unmarshal level 2

FlatBuffers is affected by neither the size nor the complexity of the object and keeps the previous metrics. Protocol Buffers and JSON, on the other hand, are close to each other in memory allocation (31 KB and 32 KB respectively), but Protocol Buffers is faster, with 0.09 ms against 0.3 ms.

Level 3 Unmarshalling stats

Unmarshal level 3

Again FlatBuffers kept pace. Protocol Buffers took around 4.9 MB and JSON 5.8 MB in memory allocation (per operation), and Protocol Buffers took 14 ms to parse the object while JSON took 50 ms, quite an important difference.

But why is FlatBuffers so performant at unmarshalling the object? To be fair, you need to know that FlatBuffers does not unmarshal the bytes at all. During marshaling it arranges the bytes using offsets and vtables so that the data can be accessed later. Thus, instead of rebuilding the whole object, it keeps the byte array as is and looks up each field on demand, based on its data type and/or length (example here).
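The idea of reading fields straight from the byte array can be illustrated with a toy layout. The sketch below is an assumption made for illustration only: it is not the real FlatBuffers wire format or API, and `encode`, `CustomerView` and the offset table are hypothetical names. It only shows why "unmarshalling" can cost almost nothing when every accessor follows an offset into the original buffer.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Toy layout (hypothetical, NOT the real FlatBuffers wire format):
//   bytes 0..3 : offset of the age field  (uint32, little endian)
//   bytes 4..7 : offset of the name field (uint32, little endian)
//   age field  : uint32
//   name field : uint32 length followed by the UTF-8 bytes
const (
	slotAge    = 0
	slotName   = 4
	headerSize = 8
)

// encode does all the work up front: it writes the offset table and then
// the field data into a single byte slice.
func encode(name string, age uint32) []byte {
	buf := make([]byte, headerSize+4+4+len(name))
	ageOff := uint32(headerSize)
	nameOff := ageOff + 4
	binary.LittleEndian.PutUint32(buf[slotAge:], ageOff)
	binary.LittleEndian.PutUint32(buf[slotName:], nameOff)
	binary.LittleEndian.PutUint32(buf[ageOff:], age)
	binary.LittleEndian.PutUint32(buf[nameOff:], uint32(len(name)))
	copy(buf[nameOff+4:], name)
	return buf
}

// CustomerView wraps the raw bytes; creating it parses and copies nothing.
type CustomerView struct{ buf []byte }

// Age follows the offset stored in the table and decodes a single uint32.
func (v CustomerView) Age() uint32 {
	off := binary.LittleEndian.Uint32(v.buf[slotAge:])
	return binary.LittleEndian.Uint32(v.buf[off:])
}

// Name returns a sub-slice of the original buffer, so no copy is made.
func (v CustomerView) Name() []byte {
	off := binary.LittleEndian.Uint32(v.buf[slotName:])
	n := binary.LittleEndian.Uint32(v.buf[off:])
	return v.buf[off+4 : off+4+n]
}

func main() {
	view := CustomerView{buf: encode("Jane", 30)}
	fmt.Println(view.Age(), string(view.Name()))
}
```

Wrapping the buffer in `CustomerView` does no parsing, which is the same reason the FlatBuffers unmarshal numbers stay flat across levels: the cost is paid at build time and again, in small pieces, each time a field is read.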

If you are very happy with the FlatBuffers results shown above, wait a minute: first you need to know the downsides of using it. These are some that I have come across:

  • Debugging binary messages is very hard: you cannot just log the payload or use middleboxes (firewall, proxy, pub/sub, etc.) to analyze the payload and make decisions based on it. This also applies to Protocol Buffers.
  • The implementation is tedious: as you can see here, building the FlatBuffers object requires much more care and many more steps than Protocol Buffers or JSON.
  • Some features are missing in some languages; for instance, binary search (maps) is not yet available for Golang or Rust.

Before wrapping up let me share some final thoughts:

  • Memory allocation matters because of memory management. In Golang this is the job of the Garbage Collector (GC), which frees the memory our programs no longer use. To make a long story short, by default the GC is triggered when the heap has grown 100% over the live heap left by the previous collection (GOGC=100, with a minimum heap target of 4 MB), or when no collection has run in the last 2 minutes. If your GC has to work hard cleaning memory all the time, it will consume the CPU you need for running your program.
  • If after reading this you think using JSON is a bad idea, let me tell you the opposite: for regular scenarios it is my first choice. It is widely supported, easy to debug, and, as you saw, the difference in normal cases is not excessive.
  • Given the previous point, you might be wondering when X-Buffers are a good choice. I would personally use them for service-to-service communication, cache storage, cases where the business requires very low latency (like games), or very constrained environments (network, disk, memory) like IoT.
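To see how allocation-heavy decoding keeps the GC busy, here is a rough sketch that counts the GC cycles completed while a tiny JSON document is decoded many times. The payload and the helper names `churn` and `gcCyclesDuring` are assumptions made for this illustration, and the count it prints depends on the machine and on the GOGC setting.

```go
package main

import (
	"encoding/json"
	"fmt"
	"runtime"
)

// payload is a hypothetical, tiny JSON document used only to generate garbage.
var payload = []byte(`{"name":"customer","addresses":[{"street":"1 Main St","city":"Springfield"}]}`)

// churn decodes the payload n times and discards every result, so each
// iteration leaves short-lived heap objects behind for the collector.
func churn(n int) {
	for i := 0; i < n; i++ {
		var v map[string]interface{}
		if err := json.Unmarshal(payload, &v); err != nil {
			panic(err)
		}
	}
}

// gcCyclesDuring reports how many GC cycles completed while f ran.
func gcCyclesDuring(f func()) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	cycles := gcCyclesDuring(func() { churn(200000) })
	fmt.Println("GC cycles while decoding:", cycles)
}
```

A decoder that allocates less per message would trigger fewer cycles for the same number of messages, which is where the allocation figures in the charts above start to matter.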

Bonus:

  1. There is a fork of the Golang Protobuf project called GoGo Protobuf which promises to reduce the bytes allocated. Let's see some stats using the gogofaster approach:

Marshaling

The GoGo Protobuf implementation is a clear improvement over the standard Protocol Buffers implementation:

  • Level 1: Byte allocation is 44% less, and the speed is 56% better
  • Level 2: Byte allocation is 27% less, and the speed is 47% better
  • Level 3: Byte allocation is 22% less, and the speed is 47% better
  • The size is exactly the same in all scenarios
  • GoGo Protobuf has the best marshaling stats in all scenarios, beating both FlatBuffers and JSON.

Unmarshalling

  • Level 1: Byte allocation is 31% less, and the speed is 40% better
  • Level 2: Byte allocation is 29% less, and the speed is 66% better
  • Level 3: Byte allocation is 25% less, and the speed is 59% better

2. You can also use FlatBuffers with gRPC and leverage HTTP/2 for better performance; check here for more information and here for an example.
