Should I Use X-Buffers to Serialize Data?
X-Buffers? What do I mean? Simple: FlatBuffers and Protocol Buffers (this is not an official term, just a way I decided to group the two).
Every time a new technology or framework emerges, it is tempting to assume it is better than its predecessors in every way simply because it is newer. It is important, however, to understand what problem it aims to solve and whether it performs well in all contexts. That is why I decided to write this post: to find out whether the X-Buffers are really better than JSON.
To determine whether using X-Buffers is a good idea, I will walk through different scenarios using Go benchmarks. This is the schema used for the comparisons (full schema here).
type Customer struct {
    FirstName   string
    LastName    string
    Age         uint32
    Balance     float64
    Debt        float64
    Preferences *Preferences
    Friends     []*Customer
    Addresses   map[string]*Location
}
If you are new to Protocol Buffers or FlatBuffers, you can take a look at these resources before reading further.
- Protocol Buffer IDL guide and Intro to Protocol buffers
- Flat Buffer intro and Flat Buffer benchmarking
Experiments
Now let's proceed with the benchmarks. I will use three different levels of complexity so we can see how each serialization format behaves under increasing stress:
- Level 1, The customer object with 1 address.
- Level 2, The customer object with 10 addresses, 10 friends and each friend contains 10 addresses.
- Level 3, The customer object with 100 addresses, 200 friends and each friend contains 100 addresses.
All tests were run on a machine with an Intel(R) Core(TM) i5-6267U CPU @ 2.90GHz using Go 1.16.2; the results may vary on other machines or in other languages.
Level 1 Marshaling metrics
JSON  354464  3442 ns/op  1075 B/op  14 allocs/op
Proto 467773  2188 ns/op  1088 B/op  15 allocs/op
FBS   697542  1690 ns/op  1304 B/op  10 allocs/op
Size in bytes: JSON: 371, Proto: 143, FBS: 296
I have to admit that the first time I ran these tests I had to revisit the whole implementation, because I had the misconception that FlatBuffers was far better than JSON and Protocol Buffers in every way. As you can see, that is not the case in our simplest scenario.
Level 2 Marshaling metrics
JSON  6504   177418 ns/op  52443 B/op  619 allocs/op
Proto 9921   128793 ns/op  43157 B/op  717 allocs/op
FBS   10000  116492 ns/op  62473 B/op  326 allocs/op
Size in KB: JSON: 17, Proto: 8.9, FBS: 13
In this scenario JSON is no longer the winner in any category, and Protocol Buffers has closed the time gap with FlatBuffers. In conclusion, Protocol Buffers looks like our best option for this case. Now let's contrast that behavior with a more extreme scenario.
Level 3 Marshaling metrics
JSON  39  29045645 ns/op  9810394 B/op  84209 allocs/op
Proto 55  21001219 ns/op  6334196 B/op  103901 allocs/op
FBS   78  13158922 ns/op  8333920 B/op  42052 allocs/op
Size in MB: JSON: 2.5, Proto: 1.2, FBS: 2.0
Remember that this time I used 100 addresses and 200 friends, each friend with 100 addresses. That is an uncommon use case, but a possible one; I came across objects this big in a project once.
For this extreme case JSON is the worst in every sense. FlatBuffers is now even further ahead of Protocol Buffers in speed than in the previous level, and the trend in payload size is consistent with the other tests.
Level 1 Unmarshalling stats
JSON 134371 10778 ns/op 1000 B/op 26 allocs/op
Proto 728599 1592 ns/op 733 B/op 13 allocs/op
FBS 510885903 2.365 ns/op 0 B/op 0 allocs/op
Yes, the numbers are right: FlatBuffers performs zero allocations when parsing binary data into the object, and the time is almost zero (2 ns). Protocol Buffers allocates about 0.7 KB per operation and JSON about 1 KB, and in time spent Protocol Buffers takes around 0.001 ms against 0.01 ms for JSON. Let's see what comes next.
Level 2 Unmarshalling stats
JSON 2886 353497 ns/op 32025 B/op 1000 allocs/op
Proto 13471 91999 ns/op 31446 B/op 705 allocs/op
FBS 444298354 2.490 ns/op 0 B/op 0 allocs/op
FlatBuffers is unaffected by the size or complexity of the object and keeps its previous numbers. Protocol Buffers and JSON, on the other hand, are close to each other in memory allocation (31 KB against 32 KB), but in time spent Protocol Buffers is better, with 0.09 ms against 0.3 ms.
Level 3 Unmarshalling stats
JSON 22 50498996 ns/op 5849893 B/op 148547 allocs/op
Proto 78 14708673 ns/op 4989886 B/op 105029 allocs/op
FBS 527564692 2.248 ns/op 0 B/op 0 allocs/op
Again FlatBuffers kept the pace. Protocol Buffers allocated around 4.9 MB and JSON 5.8 MB per operation, and in time spent Protocol Buffers took 14 ms while JSON needed 50 ms to parse the object, quite an important difference.
But why is FlatBuffers so performant at unmarshalling the object? To be fair, FlatBuffers does not even unmarshal the bytes. What it does during marshaling is lay the bytes out with offsets and vtables so the data can be accessed later. Instead of rebuilding the whole object, it keeps the byte array as-is and looks up fields on demand based on their data type and/or length (example here).
If you are very happy with the FlatBuffers results shown above, wait a minute: first you need to know the downsides of using it. These are some I have come across:
- Debugging binary messages is very hard. You cannot just log the payload, or use middleboxes (firewalls, proxies, pub/sub, etc.) to analyze the payload and make decisions based on it. This also applies to Protocol Buffers.
- The implementation is tedious. As you can see here, building the FlatBuffers object requires many more steps, and much more care, than Protocol Buffers or JSON.
- Some features are missing in some languages; for instance, binary search over maps is not yet available for Go or Rust.
Before wrapping up let me share some final thoughts:
- The reason memory allocation matters is memory management. In Go this is handled by the Garbage Collector (GC), which is in charge of freeing memory no longer used by our programs. To make a long story short, the GC is triggered (by default) when the heap grows beyond 4 MB or when it has not run in the last 2 minutes, so if your GC has to work hard cleaning up memory all the time, it consumes CPU your program needs.
- If after reading this you think using JSON is a bad idea, let me tell you the opposite: for regular scenarios it is my first choice. It is widely supported, easy to debug, and, as you saw, the difference in normal cases is not excessive.
- Given the previous point, you might be wondering when it is a good idea to use X-Buffers. I would personally use them for service-to-service communication, cache storage, when the business requires very low latency (as in games), or when working in very constrained environments (network, disk, memory) such as IoT.
Bonus:
- There is a fork of the Go protobuf project called GoGo Protobuf which promises to reduce the bytes allocated. Let's see some stats using the gogofaster generator:
Marshaling
GogoProtoL1 1244660 944.4 ns/op 627 B/op 9 allocs/op
GogoProtoL2 17931 67348 ns/op 31112 B/op 433 allocs/op
GogoProtoL3 104 11383685 ns/op 4968503 B/op 62899 allocs/op
The GoGo Protobuf implementation is a real improvement over the standard Protocol Buffers one:
- Level 1: byte allocation is 44% lower and the speed 56% better
- Level 2: byte allocation is 27% lower and the speed 47% better
- Level 3: byte allocation is 22% lower and the speed 47% better
- The payload size is exactly the same in all scenarios
- The GoGo Protobuf marshaling stats are the best across all scenarios, beating both FlatBuffers and JSON.
Unmarshalling
GogoProtoL1 1863813 628.5 ns/op 528 B/op 11 allocs/op
GogoProtoL2 34784 34081 ns/op 22206 B/op 465 allocs/op
GogoProtoL3 193 5784670 ns/op 3625775 B/op 64031 allocs/op
- Level 1: byte allocation is 31% lower and the speed 40% better
- Level 2: byte allocation is 29% lower and the speed 66% better
- Level 3: byte allocation is 25% lower and the speed 59% better
- You can also use FlatBuffers with gRPC and leverage HTTP/2 for better performance; check here for more information and here for an example.