How to write your own Borer-based Akka Serializer!
Background
I have been involved in developing Common Software for the Thirty Meter Telescope (TMT) project for three years now. Common Software, or CSW, mainly focuses on developing a framework or a set of libraries that software components of the Telescope will use to discover other components in the cluster, communicate with other components via messages, publish telemetry data, raise alarms to report if something is wrong, etc. We use the Scala language and the Akka framework for this.
Akka, in its 2.6.0 release, recommends using the Jackson serializer for sending Actor messages across remote nodes. But we find it difficult to use in our application. Instead, we use a library called Borer that suits our requirements. To briefly talk about Borer, it is a Scala-based library providing serialization in a binary format (CBOR) as well as a text format (JSON), just like Jackson. However, unlike Jackson, which relies on runtime reflection, Borer resolves the (de)serialization logic statically at compile time. To understand why we do not use Jackson, you can refer to the blog: Understanding Akka serialization with Jackson and Borer.
While using a static serialization library like Borer, we need to write our own Akka serializer that describes how to (de)serialize Actor messages. We believe this could be achieved in multiple ways. We have explored a few of them and, in the process, we have evolved a pattern for abstracting an Akka serializer helper that can be reused throughout the application. This pattern is not specific to Borer and can be generally applied to other custom Akka serializers.
In this blog, I would like to share the experience of evolving this Akka serializer helper. The code samples are in terms of Borer but they should be applicable to any custom Akka serializer.
Borer
Let’s say we have an Actor message Zoo that we want to send across remote nodes. We will extend Zoo from the BorerSerializable marker trait as follows:
final case class Zoo(primaryAttraction: Animal) extends BorerSerializable
Detailed steps for setting up Borer to (de)serialize Zoo can be found here.
In the Borer setup for Zoo, we basically have an Akka serializer, along with the akka.actor.serializers and akka.actor.serialization-bindings configurations that register it.
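A minimal sketch of what such a single-message setup could look like is given below. It assumes Scala 2.13 with Borer 1.x and Akka on the classpath. The shape of Animal, the identifier 19923 and the package name com.example are our own placeholders, not something taken from the original setup. The configuration is expressed through ConfigFactory.parseString so that it stays in Scala; the same keys would normally live in application.conf.

```scala
import akka.serialization.Serializer
import com.typesafe.config.{Config, ConfigFactory}
import io.bullet.borer.{Cbor, Codec}
import io.bullet.borer.derivation.MapBasedCodecs.deriveCodec

// Marker trait referenced from akka.actor.serialization-bindings.
trait BorerSerializable

// Hypothetical definition: the post does not show what Animal looks like.
final case class Animal(name: String)
final case class Zoo(primaryAttraction: Animal) extends BorerSerializable

object ZooCodecs {
  implicit lazy val animalCodec: Codec[Animal] = deriveCodec[Animal]
  implicit lazy val zooCodec: Codec[Zoo]       = deriveCodec[Zoo]
}

class BorerAkkaSerializer extends Serializer {
  import ZooCodecs._

  // Arbitrary placeholder; it must be unique, and Akka reserves 0 to 40 for itself.
  override def identifier: Int = 19923

  // Ask Akka to hand the message class to fromBinary.
  override def includeManifest: Boolean = true

  override def toBinary(o: AnyRef): Array[Byte] = o match {
    case zoo: Zoo => Cbor.encode(zoo).toByteArray
    case other    => throw new IllegalArgumentException(s"Cannot serialize ${other.getClass}")
  }

  override def fromBinary(bytes: Array[Byte], manifest: Option[Class[_]]): AnyRef =
    manifest match {
      case Some(c) if c == classOf[Zoo] => Cbor.decode(bytes).to[Zoo].value
      case other                        => throw new IllegalArgumentException(s"Cannot deserialize $other")
    }
}

object SerializationConfig {
  // Assumes the classes above live in a (hypothetical) package com.example.
  val config: Config = ConfigFactory.parseString("""
    akka.actor {
      serializers {
        borer = "com.example.BorerAkkaSerializer"
      }
      serialization-bindings {
        "com.example.BorerSerializable" = borer
      }
    }
  """)
}
```

Binding the marker trait rather than Zoo itself means any message extending BorerSerializable is routed to this serializer, but the serializer still has to know how to handle each concrete class.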
Note that Zoo is just one Actor message for which we are creating a BorerAkkaSerializer, akka.actor.serializers and akka.actor.serialization-bindings.
This does not scale if your application has a large number n of messages.
Ideally, we should have a single setup for all n messages. Consider that we have Park and Theatre messages along with Zoo. Now, instead of writing a BorerAkkaSerializer for each one of them, we will assemble them in a single serializer class.
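One way to assemble the three messages in a single serializer is sketched below, again assuming Scala 2.13, Borer 1.x and Akka. The fields of Animal, Park and Theatre and the identifier are hypothetical; only the message names come from the text above.

```scala
import akka.serialization.Serializer
import io.bullet.borer.{Cbor, Codec}
import io.bullet.borer.derivation.MapBasedCodecs.deriveCodec

trait BorerSerializable

// Field names below are hypothetical; only Zoo's shape appears in the post.
final case class Animal(name: String)
final case class Zoo(primaryAttraction: Animal) extends BorerSerializable
final case class Park(name: String)             extends BorerSerializable
final case class Theatre(seats: Int)            extends BorerSerializable

object MessageCodecs {
  implicit lazy val animalCodec: Codec[Animal]   = deriveCodec[Animal]
  implicit lazy val zooCodec: Codec[Zoo]         = deriveCodec[Zoo]
  implicit lazy val parkCodec: Codec[Park]       = deriveCodec[Park]
  implicit lazy val theatreCodec: Codec[Theatre] = deriveCodec[Theatre]
}

// A single serializer that knows about every message type.
class BorerAkkaSerializer extends Serializer {
  import MessageCodecs._

  override def identifier: Int          = 19923 // arbitrary placeholder, must be unique
  override def includeManifest: Boolean = true

  override def toBinary(o: AnyRef): Array[Byte] = o match {
    case x: Zoo     => Cbor.encode(x).toByteArray
    case x: Park    => Cbor.encode(x).toByteArray
    case x: Theatre => Cbor.encode(x).toByteArray
    case other      => throw new IllegalArgumentException(s"Cannot serialize ${other.getClass}")
  }

  override def fromBinary(bytes: Array[Byte], manifest: Option[Class[_]]): AnyRef =
    manifest match {
      case Some(c) if c == classOf[Zoo]     => Cbor.decode(bytes).to[Zoo].value
      case Some(c) if c == classOf[Park]    => Cbor.decode(bytes).to[Park].value
      case Some(c) if c == classOf[Theatre] => Cbor.decode(bytes).to[Theatre].value
      case other                            => throw new IllegalArgumentException(s"Cannot deserialize $other")
    }
}
```

Every new message still adds one case to toBinary and one to fromBinary, which is the repetition a shared helper can remove.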
This facilitates having just one BorerAkkaSerializer, akka.actor.serializers and akka.actor.serialization-bindings for all n messages.
CborAkkaSerializer
Once we have a BorerAkkaSerializer in place for all Actor messages, we can simplify it further by moving the generic logic into a reusable CborAkkaSerializer helper. Next, we just need to extend CborAkkaSerializer.
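The sketch below shows one possible shape for such a helper and for the concise serializer built on top of it. It assumes Scala 2.13, Borer 1.x and Akka. The register method, the codecFor helper, the message fields and the identifier are illustrative names and values of our own, and the real helper may differ in its details.

```scala
import akka.serialization.Serializer
import io.bullet.borer.{Cbor, Codec, Decoder, Encoder}
import io.bullet.borer.derivation.MapBasedCodecs.deriveCodec
import scala.reflect.ClassTag

trait BorerSerializable

// Hypothetical message shapes, as before.
final case class Animal(name: String)
final case class Zoo(primaryAttraction: Animal) extends BorerSerializable
final case class Park(name: String)             extends BorerSerializable
final case class Theatre(seats: Int)            extends BorerSerializable

object MessageCodecs {
  implicit lazy val animalCodec: Codec[Animal]   = deriveCodec[Animal]
  implicit lazy val zooCodec: Codec[Zoo]         = deriveCodec[Zoo]
  implicit lazy val parkCodec: Codec[Park]       = deriveCodec[Park]
  implicit lazy val theatreCodec: Codec[Theatre] = deriveCodec[Theatre]
}

// Reusable helper: keeps the Class -> Codec registrations and hides the Cbor calls.
abstract class CborAkkaSerializer[Ser] extends Serializer {
  private var registrations: List[(Class[_], Codec[_])] = Nil

  // Called by subclasses once per message type.
  protected def register[T <: Ser: Encoder: Decoder: ClassTag]: Unit = {
    val clazz = implicitly[ClassTag[T]].runtimeClass
    registrations ::= (clazz -> Codec(implicitly[Encoder[T]], implicitly[Decoder[T]]))
  }

  override def includeManifest: Boolean = true

  override def toBinary(o: AnyRef): Array[Byte] = {
    val encoder = codecFor(o.getClass).encoder.asInstanceOf[Encoder[AnyRef]]
    Cbor.encode(o)(encoder).toByteArray
  }

  override def fromBinary(bytes: Array[Byte], manifest: Option[Class[_]]): AnyRef = {
    val decoder = codecFor(manifest.get).decoder.asInstanceOf[Decoder[AnyRef]]
    Cbor.decode(bytes).to[AnyRef](decoder).value
  }

  // Linear scan: the first registered class that is the same as, or a supertype of, target wins.
  private def codecFor(target: Class[_]): Codec[_] =
    registrations
      .collectFirst { case (clazz, codec) if clazz.isAssignableFrom(target) => codec }
      .getOrElse(throw new IllegalArgumentException(s"No codec registered for $target"))
}

// The application-level serializer shrinks to an identifier and a list of registrations.
class BorerAkkaSerializer extends CborAkkaSerializer[BorerSerializable] {
  import MessageCodecs._

  override def identifier: Int = 19923 // arbitrary placeholder, must be unique

  register[Zoo]
  register[Park]
  register[Theatre]
}
```

The casts to Encoder[AnyRef] and Decoder[AnyRef] are the price of storing codecs for different types in one list; they are safe only as long as each codec is applied to instances of the class it was registered for.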
CborAkkaSerializer allows us to abstract and hide the boilerplate code from the end user.
To summarize this simplification, we can say that if we have
- a BorerSerializable marker trait and
- a CborAkkaSerializer helper class maintaining the Class -> Codec correlations,
then we can have a concise implementation of CborAkkaSerializer, i.e. BorerAkkaSerializer, and configure it with akka.actor.serializers and akka.actor.serialization-bindings.
To know more about the Serializer helper pattern, you can refer to the Borer GitHub issue.
Caveat
Notice that CborAkkaSerializer maintains a Class -> Codec correlation. It will contain entries like:
Zoo -> zooCodec,
Park -> parkCodec,
Theatre -> theatreCodec
Here, each codec defines the encoding and decoding logic for the class type.
While (de)serializing an Actor message, CborAkkaSerializer fetches the appropriate Codec for the Class and uses it to encode or decode the message.
But first, to find out the Class type of an instance, CborAkkaSerializer iterates through the list of Zoo, Park and Theatre and uses the Java method clazz.isAssignableFrom. The isAssignableFrom method checks whether this class is the same as, or a supertype of, the class passed as the parameter. In the worst case, this check runs n times for n messages. This can pose a potential performance issue while sending Actor messages across.
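The cost of this scan can be made visible with a small self-contained sketch. NamedCodec is a stand-in we introduce purely for illustration (it is not a Borer type), and the message fields are hypothetical. The find method returns the matched codec together with the number of isAssignableFrom checks it took to reach it.

```scala
// Stand-in for a Borer codec; only the name matters for this illustration.
final case class NamedCodec(name: String)

// Hypothetical message shapes.
final case class Zoo(name: String)
final case class Park(name: String)
final case class Theatre(name: String)

object LinearLookup {
  val registrations: List[(Class[_], NamedCodec)] = List(
    classOf[Zoo]     -> NamedCodec("zooCodec"),
    classOf[Park]    -> NamedCodec("parkCodec"),
    classOf[Theatre] -> NamedCodec("theatreCodec")
  )

  // Returns the matching codec and how many isAssignableFrom checks were needed.
  def find(target: Class[_]): Option[(NamedCodec, Int)] = {
    val index = registrations.indexWhere { case (clazz, _) => clazz.isAssignableFrom(target) }
    if (index < 0) None else Some((registrations(index)._2, index + 1))
  }
}
```

With this ordering, looking up Zoo takes one check while looking up Theatre takes three, and the number of checks grows linearly with the number of registered messages.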
To understand more about the problem, let’s consider the below scenario:
So far, Zoo, Park and Theatre are simple case classes. But let’s say Zoo was a sealed hierarchy (or Algebraic Data Type) with subtypes such as NorthZoo and SouthZoo.
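Such a hierarchy might look like the sketch below. The fields of Animal and of the subtypes are hypothetical; only the names Zoo, NorthZoo and SouthZoo come from this post. The two values at the end show why an exact class comparison is not enough for a message of a subtype.

```scala
trait BorerSerializable

// Hypothetical definition; the post does not show Animal's fields.
final case class Animal(name: String)

// Zoo as a sealed hierarchy: every message is one of the concrete subtypes.
sealed trait Zoo extends BorerSerializable
final case class NorthZoo(primaryAttraction: Animal) extends Zoo
final case class SouthZoo(primaryAttraction: Animal) extends Zoo

object ZooClasses {
  val message: Zoo = NorthZoo(Animal("polar bear"))

  // The runtime class of the message is the subtype, not Zoo itself ...
  val isExactlyZoo: Boolean = message.getClass == classOf[Zoo]

  // ... but Zoo is a supertype of it, which is what isAssignableFrom detects.
  val zooIsAssignable: Boolean = classOf[Zoo].isAssignableFrom(message.getClass)
}
```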
Then, while sending a NorthZoo Actor message, CborAkkaSerializer will iterate through the Class -> Codec mapping until it finds the class type of NorthZoo (i.e. Zoo) and use the appropriate codec (i.e. zooCodec) for serialization/deserialization.
To fix the problem, we should modify our Class -> Codec mapping such that it has the following entries:
Zoo -> zooCodec,
NorthZoo -> zooCodec,
SouthZoo -> zooCodec,
Park -> parkCodec,
Theatre -> theatreCodec
Fetching the Codec then requires just a single lookup by Class.
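The expanded mapping can be sketched as a plain Map keyed by class. As in the earlier illustration, NamedCodec is a stand-in of our own rather than a Borer type, and the message fields are hypothetical.

```scala
// Stand-in for a Borer codec; only the name matters for this illustration.
final case class NamedCodec(name: String)

// Hypothetical message shapes.
sealed trait Zoo
final case class NorthZoo(name: String) extends Zoo
final case class SouthZoo(name: String) extends Zoo
final case class Park(name: String)
final case class Theatre(name: String)

object DirectLookup {
  private val zooCodec = NamedCodec("zooCodec")

  // Every subtype of Zoo gets its own entry pointing at the shared zooCodec.
  val codecs: Map[Class[_], NamedCodec] = Map(
    classOf[Zoo]      -> zooCodec,
    classOf[NorthZoo] -> zooCodec,
    classOf[SouthZoo] -> zooCodec,
    classOf[Park]     -> NamedCodec("parkCodec"),
    classOf[Theatre]  -> NamedCodec("theatreCodec")
  )

  // A single hash lookup on the runtime class, with no iteration over registrations.
  def find(message: AnyRef): Option[NamedCodec] = codecs.get(message.getClass)
}
```

The trade-off is that the map must list every concrete subtype up front, since a class missing from the map is simply not found.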
But Java reflection by default does not provide a way to capture the subtypes of a base class. We can use a third-party library called reflections that provides a convenient method to capture the subtypes of a class. The usage of the reflections library in CborAkkaSerializer can be found here.
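As a rough sketch of the idea (not the actual CborAkkaSerializer code), the subtypes can be collected with the getSubTypesOf method of the reflections library and then expanded into one map entry per class. The object and method names below are our own, the package prefix is whatever package the messages live in, and Scala 2.13 with the org.reflections library on the classpath is assumed.

```scala
import org.reflections.Reflections
import scala.jdk.CollectionConverters._

object SubtypeRegistrations {
  // Asks the reflections library for all subtypes of baseClass found under packagePrefix.
  def subtypesOf[T](baseClass: Class[T], packagePrefix: String): Set[Class[_]] = {
    val found: Set[Class[_]] = new Reflections(packagePrefix).getSubTypesOf(baseClass).asScala.toSet
    found
  }

  // Maps the base class and each of its subtypes to the same codec.
  def entries[C](baseClass: Class[_], subtypes: Set[Class[_]], codec: C): Map[Class[_], C] =
    (subtypes + baseClass).map(c => (c, codec)).toMap[Class[_], C]
}
```

Calling entries with the result of subtypesOf for Zoo would yield the Zoo, NorthZoo and SouthZoo entries of the mapping above, all pointing at zooCodec. The classpath scan is comparatively slow, so it makes sense to do it once at registration time rather than per message.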
The whole discussion about the evolution of the CborAkkaSerializer helper can also be found in this Borer GitHub issue.
I hope this blog explains the CborAkkaSerializer helper in detail and helps you write a custom Akka serializer using it.