Protobuf.js vs JSON.stringify Performance

Published in Aspecto, Apr 8, 2021

TL;DR — encoding and decoding string-intensive data in JavaScript is faster with JSON than it is with protobuf.

When you have structured data in JavaScript that needs to be sent over the network (to another microservice, for example) or saved to a storage system, it first needs to be serialized. Serialization converts the data object held in the JavaScript program's memory into a buffer of bytes, which can later be deserialized back into a JavaScript object.

Two popular serialization methods are JSON and Google Protocol Buffers (protobuf).

JSON

Serializing data to JSON is as easy as:
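For example, with a hypothetical payload (assumed here for illustration, chosen to have the name and age fields discussed below):

```javascript
// Hypothetical payload, assumed for illustration.
const payload = { name: "Tom", age: 30 };

// JSON.stringify returns a string; Buffer.from turns it into UTF-8 bytes.
const jsonString = JSON.stringify(payload);
const jsonBytes = Buffer.from(jsonString, "utf8");

console.log(jsonString);       // {"name":"Tom","age":30}
console.log(jsonBytes.length); // 23
```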

protobuf.js

Google Protocol Buffers is a method of serializing structured data that is based on a schema (written in a .proto file). Example of how to serialize the previous payload to Protobuf with the protobufjs package:

You can see that the generated output is only 7 bytes long, much less than the 23 bytes we got from JSON serialization. Protobuf can serialize data this compactly mainly because it does not need to embed the field names as text in the data, possibly many times (“name” and “age” in the example are replaced by short numeric tags, 2 bytes in total).
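To see where the bytes go, the wire format can be reproduced by hand. The sketch below is not how protobufjs is implemented; it assumes a hypothetical message with a string field number 1 and an integer field number 2, and it only handles short strings and small non-negative integers so that every length and value fits in a single byte.

```javascript
// Each field starts with a one-byte tag: (fieldNumber << 3) | wireType.
// Wire type 2 = length-delimited (strings), wire type 0 = varint (integers).
function encodePersonByHand(name, age) {
  const nameBytes = Buffer.from(name, "utf8");
  if (nameBytes.length > 127 || !Number.isInteger(age) || age < 0 || age > 127) {
    throw new RangeError("this sketch only handles single-byte lengths and values");
  }
  return Buffer.concat([
    Buffer.from([(1 << 3) | 2, nameBytes.length]), // tag for field 1, then string length
    nameBytes,                                     // raw UTF-8 text of the name
    Buffer.from([(2 << 3) | 0, age]),              // tag for field 2, then the value
  ]);
}

// Hypothetical payload values, assumed for illustration.
const bytes = encodePersonByHand("Tom", 30);
console.log(bytes.toString("hex")); // 0a03546f6d101e
console.log(bytes.length);          // 7
```

The field names never appear in the output: each one is reduced to a single tag byte, which is where the size difference from JSON comes from.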

Picking the Right Format

Choosing the serialization format that works best for you involves multiple factors.

JSON is usually easier to debug (the serialized format is human-readable) and easier to work with (no need to define message types, compile them, install additional libraries, etc.).

Protobuf, on the other hand, usually produces smaller payloads and has built-in protocol documentation via the schema.

Another important factor is CPU performance: the time it takes the library to serialize and deserialize a message. In this article, we compare only the performance in JavaScript. You might eventually choose a format that is less performant but delivers value in other ways, or performance might be a big issue for you, in which case keep reading.

Encode Performance

At Aspecto, we wrote an SDK that collects trace events and exports them to an OpenTelemetry collector. The data is formatted as JSON and sent over HTTP. The exporter and collector can also communicate in protobuf using the protobufjs library.

Since the protobuf format is so compact, we might be tempted to think that encoding to protobuf requires less CPU, measured as the number of operations (encode/decode) per second. A quick Google search on the topic strengthens this thesis, and the Performance section in the protobufjs documentation led us to switch our SDK exporter from a JSON payload to a protobuf payload, believing we would get better performance.
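Operations per second can be estimated with a simple timing loop. The helper below is a rough sketch, not the protobufjs benchmark itself; the name opsPerSecond, the sample payload, the 200 ms duration, and the batch size are all assumptions for illustration.

```javascript
const { performance } = require("perf_hooks");

// Calls fn in batches for roughly durationMs and returns calls per second.
// Batching keeps the cost of reading the clock out of the measurement.
function opsPerSecond(fn, durationMs = 200, batchSize = 1000) {
  let ops = 0;
  let elapsed = 0;
  const start = performance.now();
  do {
    for (let i = 0; i < batchSize; i++) fn();
    ops += batchSize;
    elapsed = performance.now() - start;
  } while (elapsed < durationMs);
  return (ops * 1000) / elapsed;
}

// Hypothetical payload, assumed for illustration.
const sample = { name: "Tom", age: 30 };
const jsonEncode = opsPerSecond(() => JSON.stringify(sample));
console.log(`JSON.stringify: ${Math.round(jsonEncode)} ops/sec`);
```

Passing a function that calls a protobufjs encoder instead gives a comparable number for the same payload. Dedicated benchmark tools add warm-up runs and statistical analysis, so figures from a loop like this should be treated as approximate.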

Actual Performance

After changing from JSON serialization to protobuf serialization, we ran our SDK benchmark and, surprisingly, found that performance had decreased. That observation, which we first believed was a mistake, led us to investigate the issue further.

Benchmarking — baseline

First of all, we ran the original benchmark of the protobufjs library to get solid ground to start from. Indeed, we got results similar to those in the library README.

These results show that protobuf.js performance is better than JSON, which contradicted our earlier observation.

Benchmark — telemetry data

We then modified the benchmark to encode our example data, which is OpenTelemetry trace data. We copied the proto files and data into the benchmark and ran it again.

These were the results we expected: for this data, protobuf is actually slower than JSON.

Benchmark — strings

We got two results for two different data schemas. In the first, protobufjs is faster; in the second, JSON is faster. Looking at the schemas, the immediate suspect was the number of strings. Our schemas are composed almost entirely of strings. So we created a third test, populating a simple schema with a very large number of strings.

We ran the benchmark with this payload (10,000 strings of length 10 each).
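A payload of that shape can be generated as follows. This is a sketch, not the original benchmark code: the field name items, the helper randomString, and the lowercase alphabet are assumptions for illustration.

```javascript
// Builds a random string of the given length from lowercase ASCII letters.
function randomString(length) {
  const alphabet = "abcdefghijklmnopqrstuvwxyz";
  let out = "";
  for (let i = 0; i < length; i++) {
    out += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return out;
}

// 10,000 strings of length 10 each. A matching (hypothetical) schema would be:
//   message StringList { repeated string items = 1; }
const payload = {
  items: Array.from({ length: 10000 }, () => randomString(10)),
};

const json = JSON.stringify(payload);
console.log(payload.items.length);    // 10000
console.log(Buffer.byteLength(json)); // 130011
```

Under the assumed schema, the protobuf encoding of the same payload would be 120,000 bytes (a 1-byte tag, a 1-byte length, and 10 bytes of text per string), so when the data is almost all string content the size advantage over JSON is small.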

The results confirmed our suspicion.

When your data is composed of many strings, protobuf performance in JavaScript drops below that of JSON. This may be because JSON.stringify is implemented in C++ inside the V8 engine and is highly optimized compared to the JavaScript implementation of protobufjs.

Decoding

The benchmarks above are for encoding (serializing). The benchmark results for decoding (deserializing) are similar.

Conclusion

If you have the time, our recommendation is to profile your common data, understand the expected performance of each option, and choose the format that works best for your needs.

It is important to be aware that protobuf is not necessarily the fastest option. If your data is mainly strings, then the JSON format might be a good choice.