Serialization and Deserialization with Ballerina SerDes module

Mohamed Sabthar
Ballerina Swan Lake Tech Blog
Aug 16, 2022

In this cloud era, communication between applications is a critical necessity. Most modern languages provide rich data structures that make effective and efficient use of memory. But these raw in-memory structures cannot be sent over the network as they are, and they are not suitable for long-term storage on disk either. This is where serialization and deserialization come into play.

So what are serialization and deserialization?

Serialization is the process of translating an in-memory data structure into a format that can be effectively stored or transmitted over the wire. Many serialization formats are available; JSON, XML, SOAP, Protocol Buffers, and Apache Avro are some of the commonly used ones. Whatever the format, serialization ultimately produces a sequence of bits. Deserialization is the reverse process: reading that sequence of bits according to the serialization format and reconstructing a semantically identical clone of the original data, with the same data structure.

Serialization and deserialization in Ballerina

Ballerina supports the JSON serialization format through its built-in json type. For example, a tuple of records can be converted to JSON and then converted back to the original type. For more examples, check out Ballerina By Example.
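A minimal sketch of such a round trip is shown below. The Student record, the helper function names, and the sample values are hypothetical and chosen only for illustration; the conversion relies on the lang library functions toJsonString, fromJsonString, and cloneWithType.

```ballerina
import ballerina/io;

// Hypothetical record type used only for illustration.
type Student record {|
    int id;
    string name;
|};

// Serialize a tuple of records to a JSON string.
function toJsonText([Student, Student] students) returns string {
    return students.toJsonString();
}

// Parse the JSON string and convert the result back to the tuple type.
function fromJsonText(string text) returns [Student, Student]|error {
    json parsed = check text.fromJsonString();
    [Student, Student] restored = check parsed.cloneWithType();
    return restored;
}

public function main() returns error? {
    [Student, Student] students = [{id: 1, name: "Liam"}, {id: 2, name: "Emma"}];

    string text = toJsonText(students);
    io:println(text);

    [Student, Student] restored = check fromJsonText(text);
    // Deep equality check between the original and the restored value.
    io:println(restored == students);
}
```

Note that the target type has to be stated when converting back: the JSON text itself carries no information about the Ballerina type it came from.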

The Ballerina SerDes Module

Even though JSON is a widely used serialization format, it is less efficient than binary formats such as Protocol Buffers and Apache Avro. Hence, although JSON is common in web applications, the latter two are widely used for communication between internal services, owing to their more compact encoding and faster processing. The SerDes module gives Ballerina users the ability to work with formats beyond JSON.

The Ballerina distribution provides SerDes as a standard library module that is intended to support serialization formats other than JSON. SerDes currently supports Protocol Buffers (proto3) as its serialization mechanism. The API provided by the SerDes library is very simple and includes just two core methods:

  1. serialize()
  2. deserialize()

Proto3 serialization using SerDes

To perform serialization, we need to follow two steps.

  1. Create a Proto3Schema object by providing the Ballerina type. Internally, this creates a protocol buffer message schema that corresponds to the provided Ballerina type. The mapping between Ballerina types and proto message types is described in the SerDes module documentation.
  2. Call the serialize method on the created Proto3Schema object, passing the data to be serialized as the argument. This returns a byte[] on success or a serdes:Error on failure.
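The two steps above can be sketched as follows. The Student record, the encodeStudent helper, and the sample values are hypothetical; only Proto3Schema, serialize, and serdes:Error come from the module as described in this post.

```ballerina
import ballerina/io;
import ballerina/serdes;

// Hypothetical record type used only for illustration.
type Student record {
    int id;
    string name;
};

// Hypothetical helper that performs both steps for a Student value.
function encodeStudent(Student student) returns byte[]|serdes:Error {
    // Step 1: create the schema object from the Ballerina type.
    serdes:Proto3Schema schema = check new (Student);
    // Step 2: serialize the value; the result is a byte[] or a serdes:Error.
    return schema.serialize(student);
}

public function main() returns error? {
    Student student = {id: 7894, name: "Liam"};
    byte[] encoded = check encodeStudent(student);
    io:println("Encoded size in bytes: ", encoded.length());
}
```

In a real service the schema object would likely be created once and reused, since it depends only on the type and not on the value being serialized.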

Proto3 deserialization using SerDes

Similar to serialization, deserialization also involves two steps.

  1. Create the Proto3Schema object on the deserializing side, using the same Ballerina type that was used for serialization.
  2. Call the deserialize method on the object, passing the encoded byte[] as the argument. This returns a value of the desired Ballerina type on success or a serdes:Error on failure.
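A minimal round-trip sketch of these steps follows. As before, the Student record, the decodeStudent helper, and the sample values are hypothetical, and the sending side is simulated inside main purely so that the example has bytes to decode.

```ballerina
import ballerina/io;
import ballerina/serdes;

// Hypothetical record type; both sides must agree on this type.
type Student record {
    int id;
    string name;
};

// Hypothetical helper that performs both deserialization steps.
function decodeStudent(byte[] encoded) returns Student|serdes:Error {
    // Step 1: create the schema object on the receiving side.
    serdes:Proto3Schema schema = check new (Student);
    // Step 2: deserialize; the target type is inferred from the variable type.
    Student student = check schema.deserialize(encoded);
    return student;
}

public function main() returns error? {
    // Stand-in for the sending side: produce the encoded bytes.
    Student original = {id: 7894, name: "Liam"};
    serdes:Proto3Schema senderSchema = check new (Student);
    byte[] encoded = check senderSchema.serialize(original);

    // Receiving side: rebuild the value from the bytes.
    Student decoded = check decodeStudent(encoded);
    io:println(decoded);
}
```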

Comparing the file sizes generated by JSON and Protocol buffer format

As seen in the image above, the JSON file took 60 bytes while the protocol buffer encoded file took just 28 bytes, which is less than half the size of the JSON file. This simple example shows that protocol buffers can be considerably more compact than JSON. So choose the format wisely, depending on your use case.
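A comparison of this kind can be approximated in code by encoding the same value both ways and printing the byte counts. The sketch below uses a hypothetical Student record and sample value, not the data behind the figures quoted above, so the sizes it prints will differ from 60 and 28 bytes.

```ballerina
import ballerina/io;
import ballerina/serdes;

// Hypothetical record type used only for illustration.
type Student record {
    int id;
    string name;
};

// Size of the value when written as UTF-8 encoded JSON text.
function jsonSize(Student student) returns int {
    return student.toJsonString().toBytes().length();
}

// Size of the value when encoded with the proto3 schema.
function protoSize(Student student) returns int|serdes:Error {
    serdes:Proto3Schema schema = check new (Student);
    byte[] encoded = check schema.serialize(student);
    return encoded.length();
}

public function main() returns error? {
    Student student = {id: 7894, name: "Liam"};
    io:println("JSON bytes:   ", jsonSize(student));
    io:println("Proto3 bytes: ", check protoSize(student));
}
```

The JSON text repeats every field name alongside its value, whereas the binary encoding does not spell them out, which is the main reason the binary output tends to be smaller for record-like data.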

Signing off with the hope that this helped you get a high-level idea of the serialization and deserialization concepts, and of the role that the Ballerina SerDes module plays.

References

  1. https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats
  2. https://github.com/ballerina-platform/ballerina-standard-library/issues/2964
  3. https://ballerina.io/learn/by-example/json-type.html
