JSON vs Proto (gRPC) vs Flatbuffer. Speed Showdown for Mobile App Backends! 🚀

Ilia
15 min read · Dec 2, 2023

Hey colleagues!

In this article, I want to spill the beans on a scenario where we’re all about pinching pennies on everything that crosses our path! I’ll be tackling this case from the backend perspective, where our integrator is none other than our own mobile app.

What’s the usual challenge we’re facing? It’s either cranking out a quick microservice or a combo of services that’ll be catching requests from the app. Most times, our clients are rocking top-notch gear and flagship devices. But what if our case is:

  1. A feeble AWS cluster that needs to squeeze in over 10 logic services plus monitoring.
  2. Our phones are special-purpose Android devices with no more than 4 GB of RAM, often tablets.
  3. We’re frequently shooting snapshots from the app to the backend.
  4. We need to validate a chunk of data before pushing it further down the business flow.

So, here’s the deal: weak backend, feeble devices, 1 MVP, and 3 devs. The mission is to scale as cheaply as possible and avoid blowing extra cash on AWS while we’re in the MVP zone. If the MVP nails it, resources flow; if not, the project might hit pause. Sounds like a challenge? We rolled up our sleeves and started experimenting. Here’s what’s on the market and what we’re doing with services:

  1. REST (JSON)
  2. gRPC (Proto, binary)
  3. “Special guest”

Let’s dive into these two (real quick, as there are tons of articles on this), starting with the differences:

HTTP (Hypertext Transfer Protocol) and gRPC (gRPC Remote Procedure Calls) are two different protocols for client-server communication.

HTTP:

  • Type: Oriented towards transferring textual information.
  • Protocol: Stateless.
  • Data Format: Typically JSON or XML.
  • Transport: Uses the TCP protocol.

gRPC:

  • Type: Oriented towards transferring binary data and structured messages.
  • Protocol: Supports stateful streams and duplex (bidirectional) communication.
  • Data Format: Protocol Buffers (protobuf) — a binary data serialization format.
  • Transport: Uses HTTP/2 as the transport protocol.

In summary, gRPC provides more efficient and compact binary communication compared to the textual nature of HTTP. Additionally, gRPC supports more complex interaction scenarios and data formats. HTTP, on the other hand, remains a simpler and more widely used protocol for data transfer on the internet.

Let’s take a little detour and talk about integration: handing over the specs to both infrastructure and mobile development, or simply, sharing contracts.

In rapid development, this is crucial because you’ve got to think about backward compatibility, and all that jazz. If you’re just changing everything on the fly, not storing info anywhere — it’s a recipe for disaster. You’ll end up chasing bugs related to backward compatibility! Plus, you need to nail down versioning.

In the simple REST world, we use Swagger or OAS3 tools like Apicurio and such. It saves time and makes the process more transparent, but maintaining the spec still takes effort. And guess what, let’s take another look at Protobuf: it already comes with a schema, and the schema is versioned (if we store it in Git), which is a massive plus. Share it with the team, and now everyone has the spec.

Second crucial point: we’re banning breaking changes! Meaning, if a field is no longer needed or its type changes, we deprecate it in the new version, but we don’t delete it!
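For illustration, here’s a minimal sketch of that policy in a proto schema (the field names are made up for the example):

syntax = "proto3";

message Person {
  // The old field stays in place, marked deprecated,
  // so already-shipped clients keep working.
  string old_email = 3 [deprecated = true];

  // New clients read the replacement field under a fresh tag.
  string contact_email = 4;

  // If a field really must go, reserve its tag and name so
  // nobody can ever reuse them with a different meaning.
  reserved 5;
  reserved "legacy_code";
}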

Let’s take a quick break to introduce gRPC, just in case you’re a newbie in this realm. gRPC is like the rockstar of communication protocols — it’s a high-performance, open-source universal RPC (Remote Procedure Call) framework. Think of it as your go-to method for efficient communication between services, especially in a microservices architecture.

Protocol Buffers (protobuf) is a method for serializing structured data developed by Google. Protocol Buffers provide an efficient and fast way to transmit data between systems, along with ease of use. Let’s look at a simple example using the data definition language (proto) and using protobuf in a programming language, for instance, in Python.

syntax = "proto3";

message Person {
  string name = 1;
  int32 id = 2;
  optional string email = 3;
}

Note that proto3 has no `required` label; fields are implicitly optional, and the explicit `optional` keyword (supported since protoc 3.15) gives the field presence tracking.

Here, we define the data structure `Person` with three fields: `name`, `id`, and `email`. Compiling the proto file:

protoc --python_out=. example.proto

This will create the file `example_pb2.py`, which contains the generated code for working with the data defined in `example.proto`. Usage in Python:

import example_pb2

# Create a Person object
person = example_pb2.Person()
person.name = "John"
person.id = 123
person.email = "john@example.com"

# Serialize to binary format
serialized_data = person.SerializeToString()

# Deserialize from binary format
new_person = example_pb2.Person()
new_person.ParseFromString(serialized_data)

Here, we create a `Person` object, set its fields, serialize it to a binary format, and then deserialize it back. Note: `example_pb2` is the generated module created by the protobuf compiler.

Protocol Buffers provide a binary data format that is compact and efficient for transmission. It also supports various programming languages, making it convenient for use in different parts of your technology stack.

Still no idea how it works? Let’s reuse the previous `Person` example. Imagine we have a `Person` object with filled fields:

Person person = {
  name: "John Doe",
  id: 123,
  email: "john@example.com"
};

When this object is serialized into binary format, each field is written as a tag byte (encoding the field number and wire type) followed by its payload. Here, the field numbers are 1, 2, and 3. After serialization, the binary data stream looks like this:

0A 08 4A 6F 68 6E 20 44 6F 65 10 7B 1A 10 6A 6F 68 6E 40 65 78 61 6D 70 6C 65 2E 63 6F 6D

Let’s break it down:

- 0A is the tag byte for field 1 (the `name` field, wire type 2 for length-delimited), followed by the field’s length (08).
- 4A 6F 68 6E 20 44 6F 65 represents the ASCII codes for the string “John Doe.”
- 10 is the tag byte for field 2 (the `id` field, wire type 0), followed by the value `123` (0x7B) in variable-length encoding (Varint).
- 1A is the tag byte for field 3 (the `email` field), followed by the string length 16 (0x10) and the ASCII codes for the string “john@example.com.”

Thus, tags and their order allow parsing the serialized data stream and determining which field contains what information.
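You can double-check this layout yourself; a minimal sketch in Go, assuming you’ve generated a package (here called `examplepb`, a made-up name) from the same `Person` schema with `protoc --go_out`:

package main

import (
    "encoding/hex"
    "fmt"

    "google.golang.org/protobuf/proto"

    examplepb "example.com/demo/examplepb" // hypothetical generated package
)

func main() {
    p := &examplepb.Person{
        Name:  "John Doe",
        Id:    123,
        Email: proto.String("john@example.com"), // pointer because the field is optional
    }

    raw, err := proto.Marshal(p)
    if err != nil {
        panic(err)
    }

    // Expected output, fields in tag order:
    // 0a084a6f686e20446f65107b1a106a6f686e406578616d706c652e636f6d
    fmt.Println(hex.EncodeToString(raw))
}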

Benefits of using a binary format and tags:

  1. Efficiency: Binary format provides a more compact representation of data, reducing the volume of transmitted information over the network.
  2. Speed: Serialization and deserialization operations are faster since binary data can be processed efficiently.

We’ve covered the theory; now let’s dive into coding two small services. We need to create a new API with two main methods:

  1. Save a new doc with params.
  2. Get all docs by limit and offset.

That’s all.

Let’s code!

Let’s begin by creating a basic JSON document. For instance, we have a compact service with just two methods:

  1. save docs and validate department code, delivery company and address.
  2. find all with limit/offset pagination.

We’ll utilize this JSON example across all services and develop proto and flatbuffer specification files for it.

Our typical document:

{
  "docs": {
    "name": "name_for_documents",
    "department": {
      "code": "uuid_code",
      "time": 123123123,
      "employee": {
        "name": "Ivan",
        "surname": "Polich",
        "code": "uuidv4"
      }
    },
    "price": {
      "categoryA": "1.0",
      "categoryB": "2.0",
      "categoryC": "3.0"
    },
    "owner": {
      "uuid": "uuid",
      "secret": "dsfdwr32fd0fdspsod"
    },
    "data": {
      "transaction": {
        "type": "CODE",
        "uuid": "df23erd0sfods0fw",
        "pointCode": "01"
      }
    },
    "delivery": {
      "company": "TTC",
      "address": {
        "code": "01",
        "country": "uk",
        "street": "Main avenue",
        "apartment": "1A"
      }
    },
    "goods": [
      {
        "name": "toaster v12",
        "amount": 15,
        "code": "12312reds12313e1"
      }
    ]
  }
}

But first, the technical requirements:

Language: Golang
HTTP framework: Gin Gonic
gRPC framework: google gRPC
Database: MongoDB
Load tests: for service testing I prefer Yandex Tank. If you’re not familiar with the tool, I recommend reading my article:

JSON

Nothing special here: we’re going to create a small service with Gin Gonic and the standard http lib. Let’s create the service:

const (
    post = "/report"
    get  = "/reports"
    TTL  = 5
)

func main() {
    router := gin.Default()
    p := ginprometheus.NewPrometheus("gin")
    p.Use(router)

    sv := service.NewReportService()
    gw := middle.NewHttpGateway(*sv)

    router.POST(post, gw.Save)
    router.GET(get, gw.Find)

    srv := &http.Server{
        Addr:    "localhost:8080",
        Handler: router,
    }

    // Actually start listening; without this the process would just exit.
    log.Fatal(srv.ListenAndServe())
}
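Once it’s running, any HTTP client can poke it; for example (the payload file name is made up, and I’m assuming limit/offset ride as query parameters):

curl -X POST "http://localhost:8080/report" \
    -H "Content-Type: application/json" \
    -d @document.json

curl "http://localhost:8080/reports?limit=10&offset=0"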

Full code here on GitHub:

Benchmark tests:


// BenchmarkCreateAndMarshal-10    168706    7045 ns/op
func BenchmarkCreateAndMarshal(b *testing.B) {
    for i := 0; i < b.N; i++ {
        doc := createDoc()
        _ = doc.Docs.Name // touch a field so the work isn't optimized away

        bt, err := json.Marshal(doc)
        if err != nil {
            log.Fatal("marshal error")
        }

        parsedDoc := new(m.Document)
        if json.Unmarshal(bt, parsedDoc) != nil {
            log.Fatal("unmarshal error")
        }
        _ = parsedDoc.Docs.Name
    }
}

This code is a benchmark for the `BenchmarkCreateAndMarshal` function, measuring the performance of create and marshal operations.

- `BenchmarkCreateAndMarshal-10`: This is the output line produced by the Go testing tool (e.g., `go test -bench`). The `-10` suffix is the GOMAXPROCS value the benchmark ran with, not a count of parallel goroutines.

- `168706`: This is the number of iterations that were executed during the test.

- `7045 ns/op`: This is the average time taken for one iteration in nanoseconds. Here, `ns/op` stands for nanoseconds per operation.

Thus, the result indicates that the `BenchmarkCreateAndMarshal` function executes at approximately 7045 nanoseconds per operation over 168706 iterations.
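To reproduce the numbers, standard Go tooling is enough (these are stock go test flags):

go test -bench=BenchmarkCreateAndMarshal -benchmem ./...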

Protobuf

No worries if you’re new to gRPC! Taking it step by step is a great approach. I remember being in the same boat: copying and pasting from the documentation is a common practice when diving into new technologies, and a fantastic way to grasp the concepts and understand how things work. Happy coding! 😊🚀

Read more here: https://protobuf.dev/overview/

It’s time to create our proto service with a specification:

syntax = "proto3";

package docs;
option go_package = "proto-docs-service/docs";

service DocumentService {
  rpc GetAllByLimitAndOffset(GetAllByLimitAndOffsetRequest) returns (GetAllByLimitAndOffsetResponse) {}
  rpc Save(SaveRequest) returns (SaveResponse) {}
}

message GetAllByLimitAndOffsetRequest {
  int32 limit = 1;
  int32 offset = 2;
}

message GetAllByLimitAndOffsetResponse {
  repeated Document documents = 1;
}

message SaveRequest {
  Document document = 1;
}

message SaveResponse {
  string message = 1;
}

message Document {
  string name = 1;
  Department department = 2;
  Price price = 3;
  Owner owner = 4;
  Data data = 5;
  Delivery delivery = 6;
  repeated Goods goods = 7;
}

message Department {
  string code = 1;
  int64 time = 2;
  Employee employee = 3;
}

message Employee {
  string name = 1;
  string surname = 2;
  string code = 3;
}

message Price {
  string categoryA = 1;
  string categoryB = 2;
  string categoryC = 3;
}

message Owner {
  string uuid = 1;
  string secret = 2;
}

message Data {
  Transaction transaction = 1;
}

message Transaction {
  string type = 1;
  string uuid = 2;
  string pointCode = 3;
}

message Delivery {
  string company = 1;
  Address address = 2;
}

message Address {
  string code = 1;
  string country = 2;
  string street = 3;
  string apartment = 4;
}

message Goods {
  string name = 1;
  int32 amount = 2;
  string code = 3;
}

Build it:

# if this is your first time installing:
brew install protobuf
go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.28
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@v1.2
export PATH="$PATH:$(go env GOPATH)/bin"

# generator only
cd .. && cd grpc
mkdir "docs"
protoc --go_out=./docs --go_opt=paths=source_relative \
    --go-grpc_out=./docs --go-grpc_opt=paths=source_relative docs.proto


How to test locally? I prefer BloomRPC (unfortunately, it has been deprecated :D); Postman can do the same.
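If you prefer the command line, grpcurl also works; a hypothetical call against our service, assuming it listens on 50051 with server reflection enabled:

grpcurl -plaintext -d '{"limit": 10, "offset": 0}' \
    localhost:50051 docs.DocumentService/GetAllByLimitAndOffset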

Benchmarks:

// BenchmarkCreateAndMarshal-10       651063       1827 ns/op
func BenchmarkCreateAndMarshal(b *testing.B) {
    for i := 0; i < b.N; i++ {
        doc := CreateDoc()
        _ = doc.GetName()

        r, e := proto.Marshal(&doc)
        if e != nil {
            log.Fatal("problem with marshal")
        }

        nd := new(docs.Document)
        if proto.Unmarshal(r, nd) != nil {
            log.Fatal("problem with unmarshal")
        }
        _ = nd.GetName()
    }
}

This is the same `BenchmarkCreateAndMarshal` benchmark for the protobuf version; again, the `-10` suffix is the GOMAXPROCS value, not a goroutine count. The results show that, on average, the benchmark performs these operations in 1827 nanoseconds per iteration over 651063 iterations.

So, the full code is here (GitHub):

FlatBuffers

And now, let’s introduce our guest among the protocols — most of you probably haven’t even heard of it — it’s FlatBuffers.

FlatBuffers steps onto the scene with a different swagger compared to the others. Picture this: no parsing needed on the client side. Why? Because the data is accessed directly, no unpacking required. This makes it super efficient for mobile devices and resource-constrained environments. It’s like handing over a ready-to-eat meal instead of making your client cook it up. The schema is minimalistic, and you get a flat binary right away. Plus, no versioning headache because you can add new fields without breaking anything — a win for the backward compatibility game. Example:

Person person;
person.id = 123;
person.name = "John Doe";
person.age = 30;

Let’s represent the serialized bytes in hexadecimal format for the given Person structure:

// Serialized bytes (hexadecimal representation)
// (assuming little-endian byte order)
19 00 00 00                 // Data size: 25 bytes, including this field
7B 00 00 00                 // ID (123 in little-endian byte order)
09 00 00 00                 // Name string length (including null-terminator)
4A 6F 68 6E 20 44 6F 65 00  // Name ("John Doe" in ASCII plus null-terminator)
1E 00 00 00                 // Age (30 in little-endian byte order)

In this example:

  • The first 4 bytes represent the data size, including this field itself. In this case, the size is 25 bytes (0x19).
  • The next 4 bytes represent the id (123 in little-endian byte order).
  • Following that, 4 bytes represent the length of the name string (9 bytes, counting the null-terminator).
  • The subsequent 9 bytes represent the name string “John Doe,” including the null-terminator.
  • The last 4 bytes represent the age (30, i.e. 0x1E, in little-endian byte order).

Please note that this is just an illustration of the data structure in binary form, and the specific values may vary depending on the platform, byte order, and other factors.

Now, you might wonder, why isn’t everyone riding the FlatBuffers hype train? Well, it’s a bit niche, and the learning curve is steeper than a ski jump. But, if you’ve got the chops, it’s a performance beast. So, give FlatBuffers a nod when you’re in the protocol party mood.

This time around, we can’t just breeze through, because we’ll have to write all the serialization code ourselves. It’s a trade-off: more effort upfront, but the control and potential performance boost might just make it a sweet deal in the long run. After all, sometimes you gotta get your hands deep in the code to make the magic happen, right?
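Here’s roughly what that hand-written serialization looks like; a minimal sketch, assuming flatc has generated a Go package (named sample here, like in the benchmarks below) with a Person table holding id, name, and age:

import flatbuffers "github.com/google/flatbuffers/go"

func BuildPerson() []byte {
    b := flatbuffers.NewBuilder(1024)

    // Strings must be created before the table that references them.
    name := b.CreateString("John Doe")

    sample.PersonStart(b)
    sample.PersonAddId(b, 123)
    sample.PersonAddName(b, name)
    sample.PersonAddAge(b, 30)
    person := sample.PersonEnd(b)

    b.Finish(person)
    return b.FinishedBytes()
}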

GitHub:

// BenchmarkCreateAndMarshalBuilderPool-10      1681384        711.2 ns/op
func BenchmarkCreateAndMarshalBuilderPool(b *testing.B) {
    builderPool := builder.NewBuilderPool(100)

    for i := 0; i < b.N; i++ {
        currentBuilder := builderPool.Get()

        buf := BuildDocs(currentBuilder)
        doc := sample.GetRootAsDocument(buf, 0)
        _ = doc.Name()

        sb := doc.Table().Bytes
        cd := sample.GetRootAsDocument(sb, 0)
        _ = cd.Name()

        builderPool.Put(currentBuilder)
    }
}

Since we’re in the “do-it-yourself optimization” mode, I decided to whip up a small pool of builders that I clear after use. This way, we can recycle them without allocating memory again and again.

It’s a bit like having a toolkit that we tidy up after each use — keeps things tidy and efficient. Why waste resources on creating new builders when we can repurpose the ones we’ve got, right? It’s all about that DIY efficiency.

const builderInitSize = 1024

// Pool - pool with builders.
type Pool struct {
    mu     sync.Mutex
    pool   chan *flatbuffers.Builder
    maxCap int
}

// NewBuilderPool - create new pool with max capacity (maxCap)
func NewBuilderPool(maxCap int) *Pool {
    return &Pool{
        pool:   make(chan *flatbuffers.Builder, maxCap),
        maxCap: maxCap,
    }
}

// Get - return a builder from the pool, or create a new one if the pool is empty
func (p *Pool) Get() *flatbuffers.Builder {
    p.mu.Lock()
    defer p.mu.Unlock()

    select {
    case builder := <-p.pool:
        return builder
    default:
        return flatbuffers.NewBuilder(builderInitSize)
    }
}

// Put - reset the builder and return it to the pool
func (p *Pool) Put(builder *flatbuffers.Builder) {
    p.mu.Lock()
    defer p.mu.Unlock()

    builder.Reset()

    select {
    case p.pool <- builder:
        // returned to the pool
    default:
        // pool is full; drop the builder
    }
}
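As a design note: a buffered channel is already safe for concurrent use, so the mutex above is belt and braces. The standard library’s sync.Pool is a common alternative; a minimal sketch, not the code used in the tests:

var builderPool = sync.Pool{
    New: func() any { return flatbuffers.NewBuilder(builderInitSize) },
}

func getBuilder() *flatbuffers.Builder {
    return builderPool.Get().(*flatbuffers.Builder)
}

func putBuilder(b *flatbuffers.Builder) {
    b.Reset() // clear state before reuse
    builderPool.Put(b)
}

Keep in mind that sync.Pool may drop idle builders on GC, which is usually fine for this kind of reuse.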

Now, let’s dive into the results of our tests, and here’s what we see:

json: 168706 iterations, 7045 ns/op

proto: 651063 iterations, 1827 ns/op

flat: 1681384 iterations, 711.2 ns/op

Well, well, well: looks like Flat is the speed demon here, leaving the others in the dust by nearly a factor of ten over JSON. The numbers don’t lie, and it seems like our DIY optimization is paying off big time!

Now it’s time to put our protocols to the real test — we’ll spin up the services, hook them up with Prometheus metrics, add MongoDB connections, and generally make them full-fledged services. We might skip tests for now, but that’s not the priority.

In the classic setup, as mentioned earlier, we’ll have two methods — save and find by limit and offset. We’ll implement these for all three implementations and stress test the whole shebang using Yandex Tank + Pandora. (I’ll write a separate article about Pandora, how to use it, and how to write custom scenarios for load testing.)

To keep it simple on the graph side, I’m using a Yandex service called Overload, and I’ll leave links to our tests. Let’s get down to business!

Save method, 1000 rps, 60 sec, profile:

rps: { duration: 60s, type: const,  ops: 1000 }

Percentile response times in ms, two runs (first | second):

json:
99% — 1.630 | 1.260
98% — 1.160 | 1.070
95% — 1.000 | 0.920
Links: first test and second.

proto:
99% — 1.800 | 2.040
98% — 1.380 | 1.540
95% — 1.160 | 1.220
Links: first test and second.

flatbuffer:
99% — 3.220 | 3.010
98% — 2.420 | 2.490
95% — 1.850 | 1.840
Links: first test and second.

And now, let’s throw in another method that covers that very case I mentioned at the beginning — we need to quickly extract a field from the request and validate it. If there are any issues, we reject the request; if it’s all good, we proceed.
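On paper this is exactly where FlatBuffers should shine, since we can read a single field without unpacking the whole document; a minimal sketch using the generated accessors (the business rule itself is made up):

func validateDelivery(buf []byte) error {
    doc := sample.GetRootAsDocument(buf, 0)

    // Lazily walk into the nested table; only the bytes we touch get read.
    delivery := doc.Delivery(nil)
    if delivery == nil {
        return errors.New("missing delivery")
    }
    if string(delivery.Company()) != "TTC" { // hypothetical allowed company
        return errors.New("unknown delivery company")
    }
    return nil
}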

Validate method, 1000 rps, 60 sec, same profile:

    rps: { duration: 60s, type: const,  ops: 1000 }

json:
99% — 1.810 | 1.980
98% — 1.230 | 1.290
95% — 0.970 | 1.070
Links: first test and second.

proto:
99% — 1.060 | 1.010
98% — 0.700 | 0.660
95% — 0.550 | 0.530
Links: first test and second.

flatbuffer:
99% — 2.920 | 2.420
98% — 2.170 | 1.850
95% — 1.540 | 1.510
Links: first test and second.

Conclusion:

Let’s break down why FlatBuffer suddenly became the ace here. Back in 2019, we were running experiments, trying to crank up our app several notches — that’s exactly what inspired this article. We switched from JSON to Proto2, then Proto3, but the real performance boost came only thanks to FlatBuffer.

Fast forward more than 4 years, and the devs and the community have juiced up Protobuf quite a bit. Now, our stack is rocking the high-performance Go language, not Kotlin with coroutines and Spring Boot. So, if you ever find yourself in a situation where quick serialization is the name of the game, keep an eye on FlatBuffer.

Serialization:

JSON: In the serialization benchmarks, JSON demonstrates consistent results with an execution time of approximately 7045 nanoseconds per operation.

Protobuf: Protobuf shows high efficiency, surpassing JSON, with an execution time of around 1827 nanoseconds per operation in the same test.

FlatBuffers: FlatBuffers stands out among the others, showcasing significantly lower execution time at around 711.2 nanoseconds per operation in the same benchmark.

These results highlight that FlatBuffers provides a significant performance advantage compared to JSON and Protobuf. Despite FlatBuffers requiring a steeper learning curve and more complex usage, its real-world efficiency emphasizes that investments in performance optimization can pay off in the long run.

Cases:

Looking at the load tests for the save method, plain JSON still performs faster than the rest; the numbers don’t lie, right? Indeed, we built a service that saves everything to the database. But what if we need network hops beyond the DB? gRPC would have more advantages because it operates on the HTTP/2 protocol. Read more about it here:

So, let’s wrap it up:
1) If you need to serialize data quickly, use FlatBuffers.
2) If you have many services and need to push requests between them, use gRPC — nothing seems to beat it in speed.
3) If you just need a JSON translator service from the phone to the database, go for REST + JSON.
4) If you need to conserve memory on the device and can wait a bit for processing on the backend, use FlatBuffers.

Looking at our metrics, we’re talking about 1–3 ms — just think about how fast that is!

I hope this has been useful for you. Thanks 🙏
