Serialization and Deserialization: How does data travel in a computer network?

Hatim Zahid
Feb 18, 2022

Why is JSON chosen? What is TCP? What are a byte string and a byte stream?

Photo by Markus Spiske on Unsplash

Serialization is the process of converting a data object into a byte stream.

You will find this definition in a lot of articles, but most of them explain it with buzzwords, which creates more confusion. In this article I will try to explain the concepts in a question-and-answer format. Stick with me!

What is serialization?

Serialization is the process of converting a data object into a byte stream. It turns an in-memory object from any programming language into a sequence of 1’s and 0’s that can be stored or transmitted. When a language-neutral format such as JSON is used, any computer can read the result, irrespective of the hardware or programming language on the other side.
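To make this concrete, here is a minimal sketch in Python that serializes a small object using the standard `json` module. The `user` object and its fields are made up for illustration.

```python
import json

# A hypothetical in-memory object (the field names are invented for this example).
user = {"name": "Ada", "age": 36, "languages": ["Python", "Java"]}

# Step 1: turn the object into JSON text.
text = json.dumps(user)

# Step 2: encode the text as UTF-8 bytes, ready to be stored or transmitted.
payload = text.encode("utf-8")

print(type(payload).__name__)  # bytes
print(payload[:8])             # b'{"name":'
print(list(payload[:4]))       # [123, 34, 110, 97], the numeric value of each byte
```

The last line shows that the "string" we see is really a sequence of numbers between 0 and 255, one per byte.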

What is a byte stream?

A byte stream is a sequence of data in which the smallest independent unit of information is a byte. RGB image data is one example: each color channel is 8 bits (1 byte) long, so one pixel takes 3 bytes. The data is sent serially, byte by byte.
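The RGB example can be sketched in Python as follows. Here `io.BytesIO` is only a stand-in for a real network connection or file, and the three pixels are arbitrary sample values.

```python
import io

# Three RGB pixels: red, green, blue. Each channel is one byte (0-255).
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]

# Flatten the pixels into one sequence of bytes.
raw = bytes(channel for pixel in pixels for channel in pixel)

# io.BytesIO stands in for a network connection or a file.
stream = io.BytesIO(raw)

received = []
while True:
    chunk = stream.read(1)  # read one byte at a time
    if not chunk:           # empty bytes means the stream has ended
        break
    received.append(chunk[0])

print(received)  # [255, 0, 0, 0, 255, 0, 0, 0, 255]
```

The receiver gets nine separate bytes and has to know the "3 bytes per pixel" rule to rebuild the pixels.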

What is a data object?

A data object is a value created in a programming language to store data. Examples include integers, strings, arrays and dictionaries.

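Why do we need serialization?

To store data or to transmit it. Storage devices and networks only deal in bits and bytes; they know nothing about the objects of a programming language. An object therefore has to be converted to bytes before it can be saved or sent.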

What is a bit stream?

A bit stream is a sequence of data in which the smallest independent unit of information is a bit.
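As a small illustration, the sketch below breaks a byte into its individual bits in Python. The helper name `to_bits` is my own and not part of any library.

```python
def to_bits(data):
    """Return the bits of each byte, most significant bit first."""
    bits = []
    for byte in data:
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits

bits = to_bits(b"A")  # the letter "A" is stored as byte value 65
print(bits)           # [0, 1, 0, 0, 0, 0, 0, 1]
```

One byte becomes eight bits, so a bit stream is the same data viewed at a finer grain.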

What is deserialization?

Deserialization is the reverse process: converting a byte stream back into an object of a specific programming language. The same bytes therefore produce a different kind of object in Python than in Java, for example a dict in Python, and a Map or a class instance in Java, depending on the library. Most major programming languages ship with, or have libraries for, serialization and deserialization.
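Here is a minimal sketch of deserialization in Python, again with the standard `json` module. The `payload` bytes are sample data written by hand to mimic what might arrive from a file or a network connection.

```python
import json

# Bytes as they might arrive from a file or a network connection (sample data).
payload = b'{"name": "Ada", "age": 36, "active": true, "tags": ["a", "b"]}'

# json.loads accepts UTF-8 bytes and returns native Python objects.
obj = json.loads(payload)

print(type(obj).__name__)          # dict
print(obj["active"])               # True
print(type(obj["tags"]).__name__)  # list
```

Notice that the JSON value `true` became Python's `True` and the JSON array became a Python list. Another language would map the same bytes to its own types.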

What’s the connection between JSON and serialization?

JSON is a text format for representing structured data. When JSON is exchanged between systems it is normally encoded in UTF-8, an encoding whose basic unit is 8 bits (1 byte); ASCII characters take one byte each and other characters take two to four. So while we see human-readable strings, behind the scenes those strings travel as bytes.

In other words, JSON is a human-readable format on the front end and UTF-8 encoded bytes on the back end. This readability, together with support in almost every language, is a large part of why data is so often serialized as JSON for transfer over networks.
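The difference between characters and bytes is easy to see in Python. The `record` below is an invented example that mixes ASCII and non-ASCII characters.

```python
import json

# Sample data containing non-ASCII characters.
record = {"city": "Zürich", "symbol": "€"}

# ensure_ascii=False keeps non-ASCII characters as they are
# instead of replacing them with \u escape sequences.
text = json.dumps(record, ensure_ascii=False)
data = text.encode("utf-8")

print(len(text))  # number of characters in the JSON text
print(len(data))  # number of bytes after UTF-8 encoding (larger here)

# How many bytes each character needs in UTF-8.
sizes = {ch: len(ch.encode("utf-8")) for ch in "Zü€"}
print(sizes)      # {'Z': 1, 'ü': 2, '€': 3}
```

The byte count is larger than the character count because "ü" and "€" each need more than one byte.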

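What is a byte string vs a byte stream?

A byte string is a finite sequence of bytes that is held in memory as a whole, such as a bytes value in Python. A byte stream is a sequence of bytes that is read or written piece by piece and may have no known end, such as the data arriving on a network connection. Think of a byte stream as a big circle and a byte string as a small circle inside it: every byte string can be sent as a stream, and any chunk read from a stream is a byte string. This is why the two phrases are often used interchangeably, even though they are not identical.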
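A short Python sketch of the two ideas side by side, with `io.BytesIO` used as a stand-in for a real socket or file:

```python
import io

# A byte string: a complete, finite sequence of bytes held in memory.
byte_string = b"hello world"
print(len(byte_string))  # 11
print(byte_string[0])    # 104, the byte value of "h"

# A byte stream hands the same bytes out piece by piece.
stream = io.BytesIO(byte_string)  # stand-in for a socket or a file
chunks = []
while True:
    chunk = stream.read(4)        # ask for up to 4 bytes at a time
    if not chunk:                 # empty bytes means end of stream
        break
    chunks.append(chunk)

print(chunks)                           # [b'hell', b'o wo', b'rld']
print(b"".join(chunks) == byte_string)  # True
```

With the byte string you can ask for its length or jump to any position. With the stream you only get the next chunk, and each chunk is itself a small byte string.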

Serialization means converting an object into a byte stream. JSON is one common format for doing this, though not the only one, so we say the data is serialized using the JSON format. The serialized data can then be sent over whichever protocol you want, for example TCP or UDP.

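What is the connection between TCP and HTTP?

HTTP is an application layer protocol that uses TCP as its transport layer protocol (HTTP/3 is the exception; it runs over QUIC, which is built on UDP). TCP guarantees reliable, in-order delivery, so files can be reconstructed at the receiver’s end. Before any data is sent, TCP sets up the connection with a three-way handshake (SYN, SYN-ACK, ACK).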

What is an application layer protocol?

This is the layer that uses the transport layer to transfer data. HTTP is an example of an application layer protocol. It is just a set of rules that an application such as a web browser follows to send and receive data. Every application, whether a client or a server, adheres to these rules, which means its code implements them.
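To show what "a set of rules" looks like in practice, here is a rough sketch that formats and reads back a minimal HTTP/1.1 GET request in Python. The helper names `build_get_request` and `parse_request_line` are my own, and the host and path are placeholders; a real client would use a library rather than build requests by hand.

```python
def build_get_request(host, path="/"):
    """Format a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_request_line(raw):
    """Split the first line of a request into method, path and version."""
    first_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, path, version = first_line.split(" ")
    return method, path, version

request = build_get_request("example.com", "/index.html")
print(request)
print(parse_request_line(request))  # ('GET', '/index.html', 'HTTP/1.1')
```

Both sides agree on details such as the order of the request line and the `\r\n` line endings. That agreement is the protocol.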

What is a transport layer protocol?

This is the software protocol that actually moves data between programs running on two machines. Its rules cover how data is sent and received and how errors are handled. TCP, for instance, detects corrupted or lost segments using checksums and acknowledgements and retransmits them; UDP only detects corruption with a checksum and does not retransmit. The end user never has to think about how error handling or handshaking works, because the implementation sits in the transport layer. Examples are TCP and UDP.
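Putting the pieces together, the sketch below sends a JSON-serialized object over a TCP connection on the local machine using Python's `socket` module. The 4-byte length prefix is a framing convention I chose for this example, not something TCP or JSON requires, and the message content is made up.

```python
import json
import socket
import threading

def recv_exact(conn, n):
    """Read exactly n bytes; TCP may deliver them in smaller pieces."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf

def serve_once(server, result):
    """Accept one connection and deserialize one length-prefixed message."""
    conn, _ = server.accept()
    with conn:
        size = int.from_bytes(recv_exact(conn, 4), "big")  # 4-byte length prefix
        result["message"] = json.loads(recv_exact(conn, size))

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # SOCK_STREAM means TCP
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

result = {}
worker = threading.Thread(target=serve_once, args=(server, result))
worker.start()

# Client side: serialize, then send the bytes over TCP.
payload = json.dumps({"greeting": "hello", "count": 3}).encode("utf-8")
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(len(payload).to_bytes(4, "big") + payload)

worker.join(timeout=5)
server.close()
print(result["message"])  # {'greeting': 'hello', 'count': 3}
```

The code never mentions handshakes, acknowledgements or retransmission. The operating system's TCP implementation handles all of that inside `create_connection`, `accept`, `sendall` and `recv`.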

What to do now?

If you learned something new, clap and follow me!
