Protobufs: A Faster Backend
Can we do better than JSON?
Making fetch requests for data is a trivial task. Seasoned developers could set up a backend server with various endpoints practically in their sleep. But what some may not realize is how exactly their data is sent within normal HTTP requests.
JSON is currently the most common method of serializing data, largely because JSON objects are sent as easily readable strings. But you already knew that. So why is this worth mentioning? JSON objects actually carry extra “baggage” in the form of structural information. Imagine a JSON object without brackets, quotation marks, spaces, and so on: all the characters that give it its structure. How much smaller would your object be without these?
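To make that “baggage” concrete, here is a quick sketch using only the Python standard library. The field names and values are invented for illustration; the `struct` layout stands in for what a schema-based binary format can strip away:

```python
import json
import struct

# A hypothetical sensor reading (names are illustrative, not from any real API).
reading = {"id": 42, "temp": 21.5, "ok": True}

# JSON repeats the field names and punctuation in every single message...
as_json = json.dumps(reading, separators=(",", ":")).encode("utf-8")

# ...while a fixed binary layout (int32, float64, bool) carries only the values.
# Both sides would need to agree on this layout in advance, just like a schema.
as_binary = struct.pack("<id?", reading["id"], reading["temp"], reading["ok"])

print(len(as_json), len(as_binary))  # → 31 13
```

Even for this tiny object, the structural characters more than double the payload; the gap grows with nesting and repeated fields.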
Herein lies the value of protocol buffers, or protobufs for short. Protobufs are a method of serializing data into a binary stream. To do this, they rely on schemas. These schemas encode and decode the data, and must be identical between the sender and receiver (client-server) before data can be parsed correctly. To give an analogy, do you remember when you were a kid and you and your friends created a secret language with a lexicon that only your group knew about? The protobuf would be the encoded message, and the shared cipher would be the schema. No? Just me? Well, you know what I mean.
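A schema lives in a `.proto` file. Here is a minimal sketch; the message and field names are made up for this example, not taken from any real service:

```proto
// sensor.proto — an illustrative schema. Both client and server compile
// this same file, which is how they agree on the "cipher".
syntax = "proto3";

message SensorReading {
  int32 id = 1;     // field numbers, not field names, go on the wire
  double temp = 2;
  bool ok = 3;
}
```

Notice that each field gets a number: that number, not the name, identifies the field in the binary stream, which is one reason the encoded message is so much smaller than JSON.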
Protobufs are sent and received up to six times faster than JSON. So I know what you might be thinking: “If protobufs are faster than JSON, why doesn’t everybody use them?” The main reasons are difficult setup and scarcity of use. Normal HTTP requests are universally supported in all modern languages, while protobufs require special frameworks. They also require .proto files that must be exactly identical on both ends. Not to mention, the syntax for writing services, requests, and messages can be tricky.
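That extra machinery is what buys the compact wire format. As a rough sketch of why encoded messages are so small, here is protobuf’s varint encoding implemented by hand in Python (this is a simplification of the full wire format, and the field number and value are arbitrary examples):

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a protobuf-style varint:
    7 value bits per byte, high bit set on every byte but the last."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte
            return bytes(out)

# An int32 field with field number 1 and value 150:
# the tag byte is (field_number << 3) | wire_type, where wire_type 0 = varint.
tag = encode_varint((1 << 3) | 0)
payload = encode_varint(150)
print((tag + payload).hex())  # → 089601
```

Three bytes total for a field that would cost at least nine characters (`"id":150,`) in JSON. Small integers cost one byte, and fields set to their default value can be omitted from the stream entirely.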
So unless you operate at serious scale, shaving fractions of a millisecond off an already near-instantaneous transfer is most likely not worth all the trouble of setting up protobufs. Transitioning a company’s entire backend to utilize protobufs can be a long, arduous process that could waste money and working hours.
With all this hassle, who would bother? The behemoths: at Google, Amazon, Uber, Airbnb, and Dropbox, it is almost a requirement. When you consider the millions of requests and queries that these companies handle on a daily basis, even a minor increase in efficiency would save untold amounts of time and money. Well, it’s probably told somewhere, but that’s not for us to know.
Petra Bierleutgeb, Engineer at lenses.io, recently spoke at the microXchg 2018 conference in Berlin about the benefits of gRPC and protobufs, stating:
“The fundamental principles of gRPC that have been kept in mind when designing it: it’s really open and extensible…it’s agnostic to what underlying technologies you’re using. You could decide not to use protocol buffers. You could use Thrift or JSON. It has a very pluggable architecture, so you can very easily extend it with things like security, health checks, metrics.”
Check out her full talk here.
Each company has a different name for its Remote Procedure Call (RPC) framework, the system in which it sends serialized information. Facebook created Apache Thrift and Uber has TChannel. But in classic naming convention, Google’s RPC framework is the recursively named gRPC. Other companies such as Netflix, Square, and Cisco have already adopted gRPC.
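With gRPC, the service itself is declared in the same `.proto` file as the messages. Here is a sketch of that syntax; the service and message names are invented for illustration:

```proto
// An illustrative gRPC service definition; names are made up for this example.
syntax = "proto3";

service SensorService {
  // A unary RPC: one request in, one response out.
  rpc GetReading (ReadingRequest) returns (ReadingReply);
}

message ReadingRequest {
  int32 id = 1;
}

message ReadingReply {
  double temp = 1;
  bool ok = 2;
}
```

The gRPC toolchain compiles this file into client stubs and server interfaces in your language of choice, which is both the setup cost and the payoff discussed above.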
If I’ve sufficiently piqued your interest, my team and I have actually made a testing tool that you can use to explore gRPC. It comes with a preconfigured server and a .proto file out of the box, and supports all basic connection types. You can find our website here.