Nanosecond-scale logging system

80 million log messages per second!!
Did you know that Log4j achieves a throughput of only 1.5–2 million log messages per second?

Dhanya Krishnan
2 min read · Aug 28, 2024

NanoLog is a nanosecond-scale logging system. It slims down log messages at compile time by extracting their static parts, outputs logs in a compacted binary format at runtime, and uses an offline process to re-inflate the compacted logs. Log analytics applications can consume the compacted log directly if needed and see a speedup of about 8x thanks to I/O savings. The whole optimisation revolves around one idea: shifting work out of the runtime hot path and into the pre- and post-execution phases can improve latency on a humongous scale.

So what do they do differently?

Preprocessing:
The static content of every log statement in our code base (the format string, file name, line number and so on) is replaced with a unique id.
So obviously this mapping of id -> static content needs to be stored somewhere => store it in a metadata file, one file per source code file.
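
To make the preprocessing idea concrete, here is a minimal Python sketch. It is not NanoLog's actual preprocessor (NanoLog is a C++ system); the `LOG("...")` call syntax, the `LOG_ID(...)` replacement and the `assign_log_ids` helper are all hypothetical names made up for illustration.

```python
import re

# Hypothetical log call syntax for this sketch: LOG("format string", args...)
LOG_CALL = re.compile(r'LOG\("((?:[^"\\]|\\.)*)"')

def assign_log_ids(source: str, filename: str):
    """Replace each static format string with a numeric id.

    Returns the rewritten source and the id -> static content metadata.
    """
    metadata = {}

    def replace(match):
        log_id = len(metadata)  # ids are handed out in order of appearance
        line = source.count("\n", 0, match.start()) + 1
        metadata[log_id] = {"file": filename, "line": line, "format": match.group(1)}
        return f"LOG_ID({log_id}"

    rewritten = LOG_CALL.sub(replace, source)
    return rewritten, metadata

source = 'LOG("Opened %s in %d us", path, elapsed)\nLOG("Retry %d", attempt)\n'
rewritten, metadata = assign_log_ids(source, "server.cc")
print(rewritten)
print(metadata)
```

The rewritten source carries only small integers, while the format strings live in `metadata`, which would be written out as the metadata file for that source file.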

Runtime:
At runtime, only the ids and the dynamic values of each log message are written to a staging buffer, which is then compacted and pushed to disk. Many threads need to write logs at the same time, so each thread gets its own staging buffer to avoid contention. In addition, they handle cache coherence in a very intuitive way. (Cache coherence, what and why: https://lnkd.in/eqv8Zi_v)

Postprocessing:
Decompress the compacted logs and replace each id with the static content stored in the metadata file, which rebuilds the original human-readable messages.
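
A rough sketch of this re-inflation step, again in Python and under assumptions: the record layout (timestamp, log id, one integer argument), the `METADATA` table and the `decode` / `inflate` helpers are invented for illustration and do not reflect NanoLog's real file format. Note that formatting and chronological merging only happen here, after the application has run.

```python
import heapq
import struct

RECORD = struct.Struct("<QIq")  # assumed layout: timestamp (ns), log id, one integer argument

# Hypothetical metadata, as it might be loaded from the metadata file.
METADATA = {0: "Retry attempt %d", 1: "Request took %d us"}

def decode(buffer: bytes):
    """Yield (timestamp, message) pairs from one thread's compact log."""
    for timestamp, log_id, value in RECORD.iter_unpack(buffer):
        yield timestamp, METADATA[log_id] % value

def inflate(buffers):
    """Merge per-thread logs (each already in time order) into one sorted list of lines."""
    merged = heapq.merge(*(decode(b) for b in buffers))
    return [f"{ts} {message}" for ts, message in merged]

thread_a = RECORD.pack(100, 0, 1) + RECORD.pack(300, 0, 2)
thread_b = RECORD.pack(200, 1, 450)
lines = inflate([thread_a, thread_b])
print("\n".join(lines))
```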

3 main observations I found useful are:

1. Different staging buffers for each thread to avoid lock contention: logging threads never synchronise with each other, hence the low latency. It is similar in spirit to thread-local storage (ThreadLocal) in Java.
2. Defers (puts off) log formatting and chronological sorting! (We care a lot about logs being sorted, but the order only matters when we SEE the logs, not when a log is written! How flawlessly they have leveraged this fact!)
3. Encoding using a nibble for metadata: let's say we have a 4-byte integer variable in code, but its value is 200. That value fits in 1 byte (0-255), so only 1 byte is written, plus a 4-bit nibble recording how many bytes were kept. So we save close to 4x on that value.
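
The nibble trick from point 3 can be sketched like this. It is a simplified illustration that assumes unsigned integers and packs two values per header byte; `min_bytes`, `pack_pair` and `unpack_pair` are hypothetical helpers, not NanoLog's actual encoder.

```python
def min_bytes(value: int) -> int:
    """Smallest number of bytes (at least 1) that holds a non-negative integer."""
    return max(1, (value.bit_length() + 7) // 8)

def pack_pair(a: int, b: int) -> bytes:
    """Encode two unsigned integers: one header byte (two nibbles) plus trimmed payloads."""
    na, nb = min_bytes(a), min_bytes(b)
    header = (na << 4) | nb  # high nibble: size of a, low nibble: size of b
    return bytes([header]) + a.to_bytes(na, "little") + b.to_bytes(nb, "little")

def unpack_pair(data: bytes):
    """Read the two sizes from the header byte, then the two trimmed values."""
    na, nb = data[0] >> 4, data[0] & 0x0F
    a = int.from_bytes(data[1:1 + na], "little")
    b = int.from_bytes(data[1 + na:1 + na + nb], "little")
    return a, b

encoded = pack_pair(200, 70000)
print(len(encoded), unpack_pair(encoded))
```

Stored as two plain 4-byte integers, these values would take 8 bytes. Here 200 needs 1 byte, 70000 needs 3, and the shared header byte brings the total to 5.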

Here is a reference to the NanoLog paper if you are interested: https://lnkd.in/eRyXFjR3

Which logging systems do you use in your projects?
Do they add too much to your latency?

#latency #logging #nanolog

