Why not use Java Serialization

Antonio Jiménez
Oct 21, 2021

Introduction

As you surely know, serialization can be used to save Java objects to files or to send objects between services in binary form, e.g. through a shared memory system or a network message stack.

This post was inspired by the moment I ran into the performance cost of default Java serialization. I was implementing the link connection between MAP stacks of a telecom solution with demanding performance requirements, similar to the cross-links (C-links) between STPs in SS7 networks… but that is out of the scope of this post.

Just for curiosity: a C-link is used to send messages between STPs when the messages of one session are sent to a different STP than the one where the session started.

C-Link between STPs in SS7 protocol

In this entry I’ll talk about:

  • The impact on the size of using the default serialization vs a manual one.
  • The impact on the performance of using default serialization vs a manual one.
  • A comparison with ProtoBuf serialization.

If you enjoy the entry, please like and share it! Let’s start!

What? More than 3x smaller than default serialization?

For the sake of simplicity, I’m going to use a simple data structure for this example. The final result of applying these ideas depends on the real project, so the difference can be bigger or smaller.

Model used in the study

Default serialization

Just to be clear, by default serialization I mean letting the JDK serialize everything automatically. This only requires calling the writeObject method of ObjectOutputStream: Java then serializes the object and every object it references. (Do not forget that all of those objects must implement the Serializable interface.)

Here is the way to do the default serialization and the deserialization.

Java simple serialization

This method also writes the class metadata (class names, field names and types) of every element it saves, which produces quite a big footprint, as you will see soon.
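Since the original listing is not reproduced here, the following is a minimal sketch of default serialization and deserialization. The Player class and its fields are hypothetical stand-ins, not the real model used in the study.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class DefaultSerializationSketch {

    // Hypothetical model: the field names are assumptions for illustration only.
    static final class Player implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int number;

        Player(String name, int number) {
            this.name = name;
            this.number = number;
        }
    }

    // Lets the JDK write the whole object graph with a single writeObject call.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the object graph; the caller casts the result to the expected type.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new Player("Ana", 10));
        Player copy = (Player) deserialize(data);
        System.out.println(copy.name + " " + copy.number + " in " + data.length + " bytes");
    }
}
```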

Manual serialization

Manual serialization consists of replacing the single writeObject call with calls to the methods that write primitive data (writeInt, writeLong, writeUTF, and so on). This way we add only the minimal data of each attribute to the ObjectOutputStream.

Serialization:

Manual serialization of class Message
Manual serialization of class Player

Primitive values can be written directly. To write a byte array, we write its size first and then its bytes, so that we know how much to read back later. To write a String we can use writeUTF, which internally writes the length followed by the bytes in a similar way.

Deserialization:

Message class deserialization
Player class deserialization
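The write and read sides described above could look roughly like the sketch below. The Message and Player fields are assumptions (the real model is not shown in the text); what matters is the pattern: primitives written directly, the byte array prefixed with its size, the String written with writeUTF, and every field read back in exactly the same order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

final class ManualSerializationSketch {

    // Hypothetical stand-in for the article's Player class.
    static final class Player {
        final String name;
        final int number;

        Player(String name, int number) {
            this.name = name;
            this.number = number;
        }

        void writeTo(ObjectOutputStream out) throws IOException {
            out.writeUTF(name);   // writeUTF stores the length and the bytes for us
            out.writeInt(number); // primitive written directly
        }

        static Player readFrom(ObjectInputStream in) throws IOException {
            String name = in.readUTF();
            int number = in.readInt();
            return new Player(name, number);
        }
    }

    // Hypothetical stand-in for the article's Message class.
    static final class Message {
        final long id;
        final byte[] payload;
        final Player sender;

        Message(long id, byte[] payload, Player sender) {
            this.id = id;
            this.payload = payload;
            this.sender = sender;
        }

        void writeTo(ObjectOutputStream out) throws IOException {
            out.writeLong(id);
            out.writeInt(payload.length); // size first...
            out.write(payload);           // ...then the bytes
            sender.writeTo(out);          // the child object writes its own fields
        }

        static Message readFrom(ObjectInputStream in) throws IOException {
            long id = in.readLong();
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);        // reads exactly the announced number of bytes
            Player sender = Player.readFrom(in);
            return new Message(id, payload, sender);
        }
    }

    static byte[] serialize(Message message) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            message.writeTo(out);
        }
        return bytes.toByteArray();
    }

    static Message deserialize(byte[] data) throws IOException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return Message.readFrom(in);
        }
    }
}
```

Note that with this approach the classes no longer need to implement Serializable, because writeObject is never called on them.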

Size difference: let’s see the numbers!

Using the methods above, we can obtain the byte array produced by each approach and compare their lengths. Let’s see:

#DW# — [Default-Java] Serialization size 757
#DW# — [Manual] Serialization size 234
#DW# — Default-Java/Manual size relation: 3.24

As you can see, in this case the manual output is more than 3 times smaller than the default one! (Yes, the manual way is more complex, life is hard :P)
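As a rough illustration of how such a comparison can be made, the sketch below serializes one object both ways and prints the two lengths and their ratio. The Player class is hypothetical, so the numbers it prints will not match the ones above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SizeComparisonSketch {

    // Hypothetical model, not the article's real class.
    static final class Player implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int number;

        Player(String name, int number) {
            this.name = name;
            this.number = number;
        }
    }

    // Size of the stream when the JDK writes the object and its class metadata.
    static int defaultSize(Player player) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(player);
        }
        return bytes.size();
    }

    // Size of the stream when only the field values are written.
    static int manualSize(Player player) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeUTF(player.name);
            out.writeInt(player.number);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        Player player = new Player("Ana", 10);
        int def = defaultSize(player);
        int man = manualSize(player);
        System.out.printf("default=%d manual=%d ratio=%.2f%n", def, man, (double) def / man);
    }
}
```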

Tip: Java offers an interface that can help to organize or simplify the custom serialization, Externalizable. Check it out!
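A minimal sketch of what an Externalizable class can look like, again with a hypothetical Player. The class decides what is written in writeExternal and reads it back in the same order in readExternal; a public no-argument constructor is required because the runtime creates the instance before calling readExternal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

final class ExternalizableSketch {

    // Hypothetical model; the fields are assumptions for illustration.
    public static final class Player implements Externalizable {
        private String name;
        private int number;

        // Required by Externalizable: used by the runtime during deserialization.
        public Player() { }

        public Player(String name, int number) {
            this.name = name;
            this.number = number;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name);
            out.writeInt(number);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            name = in.readUTF();
            number = in.readInt();
        }

        String name() { return name; }
        int number() { return number; }
    }

    // writeObject/readObject still work; they delegate to the two methods above.
    static Player roundTrip(Player player) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(player);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Player) in.readObject();
        }
    }
}
```

The stream still carries a class descriptor for each Externalizable class, so the output is usually somewhat larger than the fully manual approach.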

Use of zip compressor

If performance is not important but size still is, we can run the default serialization through a zip compressor instead of serializing manually. See Annex I.

Performance comparison

And what about the performance?

To measure the performance I serialized 100k elements. Because the first iteration takes longer, I ran the measurement in a loop of 10 iterations to get several samples, and I used the Dropwizard library to time each serialization of the list and then obtain the mean.

Performance measurement
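The measurement loop can be sketched as follows. This version swaps the metrics library for plain System.nanoTime so that it stays self-contained; the Serializer interface and the String messages are assumptions made for the example, not the code used to produce the numbers in this post.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

final class TimingSketch {

    // Hypothetical interface so that each strategy can be timed the same way.
    interface Serializer {
        byte[] serialize(String message) throws IOException;
    }

    // Default Java serialization of a single message.
    static byte[] defaultSerialize(String message) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(message);
        }
        return bytes.toByteArray();
    }

    // Serializes the whole list `iterations` times and returns the mean time per pass in ms.
    static double meanMillis(Serializer serializer, List<String> messages, int iterations)
            throws IOException {
        long totalNanos = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            for (String message : messages) {
                serializer.serialize(message);
            }
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / iterations;
    }

    public static void main(String[] args) throws IOException {
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            messages.add("message-" + i);
        }
        double mean = meanMillis(TimingSketch::defaultSerialize, messages, 10);
        System.out.printf("mean time per pass: %.2f ms%n", mean);
    }
}
```

A loop like this includes the slow first pass in the mean; for more rigorous numbers, a benchmarking harness that handles JVM warm-up is a better fit.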

Here you can see the different times:

#DW# — [Default-Java] Time to serialize 100000 messages 1244
#DW# — [Manual] Time to serialize 100000 messages 335
#DW# — [Default-Java] Time to serialize 100000 messages 647
#DW# — [Manual] Time to serialize 100000 messages 223
#DW# — [Default-Java] Time to serialize 100000 messages 658
#DW# — [Manual] Time to serialize 100000 messages 252

As a result, the means are:

#DW# — [Default-Java] Mean time for serialization 707.42
#DW# — [Manual] Mean time for serialization 235.83

As you can see, the performance is about 3 times better if you use manual serialization.

What about ProtoBuf?

As stated on its official page:

“Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data”

To get a complete picture of how it is used, the best option is to go to the official page and read their tutorials.

For the sake of simplicity, I’m going to show here just the results of using ProtoBuf… which amazed me!

After updating the code (including gzip serialization code) and adding a little more information to serialize, here are the results:

#DW# — === Size study ===
#DW# — [Default-Java] Serialization size 814
#DW# — [Manual] Serialization size 272
#DW# — [Zip-Java] Serialization size 581
#DW# — [ProtoBuf] Serialization size 259

ProtoBuf produces even less data than my manual serialization (259 vs 272), which is quite good. Maybe I should review my manual implementation… maybe I could improve it…

Let’s see what happens with the performance…

#DW# — === Performance study ===
#DW# — [Default-Java] Time to serialize 100000 messages 1482
#DW# — [Manual] Time to serialize 100000 messages 324
#DW# — [Zip-Java] Time to serialize 100000 messages 7080
#DW# — [ProtoBuf] Time to serialize 100000 messages 439

Oh, yes!! My manual implementation is the best…

Or not… After several executions, ProtoBuf has the best mean time, so it is declared the performance champion!

#DW# — == Performance mean results ==
#DW# — [Default-Java] Mean time for serialization 891.68
#DW# — [Manual] Mean time for serialization 307.52
#DW# — [Zip-Java] Mean time for serialization 8644.89
#DW# — [ProtoBuf] Mean time for serialization 125.42

Performance comparison

Conclusion

Default serialization is easy to implement. If size and performance are not a problem for you, go ahead and use it, but keep in mind that the classes must stay compatible between versions of the product. Be careful to declare a serialVersionUID in each class; otherwise the serialization runtime computes one from the class structure, and it changes as soon as you modify the class.
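A short sketch of the explicit declaration, using a hypothetical Player class. ObjectStreamClass is only used here to show which value the serialization runtime will actually use for the class.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class VersionedModelSketch {

    // Hypothetical model; only the serialVersionUID declaration matters here.
    static final class Player implements Serializable {
        // Declared explicitly, so the runtime does not derive a value from the class structure.
        private static final long serialVersionUID = 1L;

        String name;
        int number;
    }

    // Returns the serialVersionUID the runtime associates with Player.
    static long effectiveUid() {
        return ObjectStreamClass.lookup(Player.class).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("serialVersionUID = " + effectiveUid());
    }
}
```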

Manual serialization is more complex, and it is easy to introduce errors. With some extra control and checks you may be able to change the saved object and still read data written by an older version.

ProtoBuf is the champion. Configuration is not difficult. With some restrictions you can evolve the serialized objects and still let other versions read the serialized data. Another advantage is that the serialized objects can be read from other languages, which may be why it is so widely used for binary communication between services in the microservices world.

I hope you enjoyed the post!

Annex I: GZip compression

The easiest way to add compression to the serialization process is to use Java’s GZIPOutputStream class.

Here is the code:

Serialization and zip
Deserialization with zip
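Since the original listings are not reproduced here, the following is a minimal sketch of the idea: the ObjectOutputStream is simply wrapped around a GZIPOutputStream, and the reverse is done when reading. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipSerializationSketch {

    // Default serialization, compressed on the fly by GZIPOutputStream.
    static byte[] serializeZipped(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the outer stream also finishes the gzip stream, so the trailer gets written.
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Decompresses and deserializes in one pass.
    static Object deserializeZipped(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new GZIPInputStream(new ByteArrayInputStream(data)))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] zipped = serializeZipped("hello hello hello hello hello");
        System.out.println(deserializeZipped(zipped) + " in " + zipped.length + " bytes");
    }
}
```

For very small objects the gzip header and trailer can outweigh the savings, so the gain depends on how much data is serialized at once.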

And here are the results:

#DW# — [Default-Java] Serialization size 757
#DW# — [Manual] Serialization size 234
#DW# — [Zip-Java] Serialization size 534

As we can see, zipping the default serialization reduces the size, but the reduction is smaller than the one achieved by manual serialization. Here are the ratios between the sizes:

#DW# — Default-Java/Manual size relation: 3.24
#DW# — Default-Java/Zip size relation: 1.42
#DW# — Zip/Manual size relation: 2.28

If we look at the performance, the penalty is quite high:

#DW# — [Default-Java] Mean time for serialization 892.66
#DW# — [Manual] Mean time for serialization 287.43
#DW# — [Zip-Java] Mean time for serialization 7669.71
