The Promise and Peril of JSON logging

Oakley Hall
7 min readMay 3, 2024
Photo by Luis Soto on Unsplash

Picture yourself on a P1 incident call with 60 people. You’re triaging a revenue impacting failure of your service. The client is crashing on 1 out of 3 requests, but your system is not alerting. You reach for your server logs to determine the reason for the failures but you find nothing useful. You’re just logging the route and status code which are all 200s. A pit opens up in your stomach as your boss demands an update.

What are logs?

As developers, we’re always logging. Sometimes it’s a console.log(“here”) used for local debugging, sometimes it’s detailed HTTP logs used for ad-hoc queries on production app behavior at scale. In either case, the greater the detail and organization of your logs the better the picture you get of your application state.

There are many ways computers can tell us what’s going on. One is by simple single line statements that we define. See the above console.log(“here”) or maybe something slightly more helpful like log(“GET user/1234”) to indicate that our server has responded to an HTTP call with a GET verb and a defined route.

There are many types of structured log formats that take this approach. For example, in Common Log Format (CLF) that might look something like this:

127.0.0.1 — — [01/Jan/2022:12:34:56 +0000] “GET user/1234 HTTP/1.1” 200 1234

Here we have even more details, like the client IP, date and time, status code and the size of the object returned. This is much more useful than an ad-hoc statement like console.log(“here”)!

Apache log format is even more better, with info like Referer and User-Agent

127.0.0.1 — — [01/Jan/2022:12:34:56 +0000] “GET user/1234 HTTP/1.1” 200 1234 “https://www.example.com" “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36”

These log formats offer some relevant details about an HTTP request. This is enough to get a basic idea of what is happening on our service and what clients are calling it. We can do things like alerting on problematic status code patterns and identifying issues with categories of clients. This format is also very easy to read and search using command line tools. The limited number of characters also has the advantage of consuming less disk space, so it’s cheap and easy to deal with. If our application is small and our only log analysis is going to be done by hand, examining individual log files, this format is probably ok.

Logs are not good for everything

At scale, however, our needs change. First, we’re going to want real time monitoring of our application. Application logging is NOT ideal for this use case. Rather, we want to use very structured events and tools optimized for monitoring time series data. In this way, we can effectively monitor error codes and latency over time. Publishing something like Prometheus metrics is a better, cheaper and faster choice than relying on our logs here.

Logs can be really good for some things

There will be many times when we need to perform ad-hoc queries on our system’s behavior in order to debug some issue. For these kinds of problems, relying on a limited set of predefined data points is not enough. We need to be able to find the problematic transactions in our logs and be able to see the application state at the time of the issue. One method of reading application state is serialization. Serialization is the process of converting an object in memory into a form that can be stored or transmitted, like a string. If we’re going to log application state, we want to serialize some aspect of it so that we can store that in a file and review it later.

Enter JSON

As NodeJS developer, I was introduced to JSON formatted logs early on. In JavaScript everything is an object so if you want to serialize any data structure such that you can log it, the output would almost always be JavaScript Object Notation (JSON)

If we look at our our very simple HTTP log in JSON, it looks like this:

{“timestamp”: “2022–01–01T12:34:56Z”, “ip”: “127.0.0.1”, “method”: “GET”, “url”: “GET user/1234”, “status”: 200, “bytes”: 1234}

A little messy to read, but we can do something called pretty printing to make it more readable.

Ah, that’s better. But wait, how does this help us? It’s the same stuff we had in those lighter, easier to read single line logs, right?

Yes, but JSON offers the magic of nested attributes. So, we can log detailed elements of our application state as deeply as we want. Let’s take an HTTP call for instance. We can add a lot more detail to our log event.

Let’s look at this example of a POST request adding a user to our database, we’re now logging a bunch more information about the request like headers and request and response payload details.

Great, so you have a bigger and more complex log. How does that improve anything? Envision a scenario where were seeing a spike in errors on this HTTP call. Most requests are succeeding but some significant percentage are resulting in 500 errors. With the lighter weight logging we have access to those error codes, and we can track the frequency, but it tells us nothing about what might be going wrong. With this more detailed log format, we can analyze our http calls for common factors.
Looking at our logs we might see that the 500 errors are correlated with malformed emails and it guides us to examine client side email validation and server side validation and error handling. There are an endless number of issues we can diagnose with the ability to capture more details about our transactions. But it’s not all sunshine and rainbows, there are tradeoffs to be considered.

Advantages of JSON log formatting

The first big advantage is flexibility. A JSON format allows us to put arbitrary nested elements into our logs. This allows us to serialize and view data structures within our app and be able to observe the state of an operation at any point. This is incredibly powerful if used correctly.

Another big advantage is that JSON is a structured format for aggregation and parsing. What does that mean? Remember we discussed looking at logs on individual server hosts via the command line? Well JSON doesn’t lend itself to that kind of thing. Instead JSON is well suited to log aggregation with tools like Greylog, Splunk or Elastic/Logsearch/Kibana. Using a log aggregator that you can query across servers and capture behavior across the whole system unleashes the real power of JSON logging.

Another big advantage of JSON logging is interoperability with a variety of systems. JSON is an object notation format that is widely used. Most logging libraries will support JSON and all major languages have built-in or library support for JSON serialization.

Sound pretty good, right? Well let’s look at some caveats.

Perils of JSON logging

A big concern is the size of JSON logs. Verbose logs of any kind can lead to big increases disk space utilization and associated cost increases. Additionally, data structures that are serialized without care can contain massive nested objects and even binary files. If you’re not careful you can fill up your disks and log aggregations service with useless cruft logged inadvertently. This can have performance and availability impacts when logging to disk and serious cost implications when using SAAS log aggregators or cloud storage. Careful consideration also must be given to log retention periods, both on disk and in cloud storage medium.

Security and privacy are another major concern. Serialization of application state or HTTP payloads can easily contain secrets like auth tokens and database keys. Additionally, it’s very easy to carelessly log sensitive customer data like PII (personally identifiable information) or even payment card details. As your application and logging grow more complex, it is absolutely imperative to develop a filtering system to prevent this. Letting secrets and PII get logged to your file system or worse leave your network and make their way into a log aggregator creates a significant vulnerability. This is a problem to solve before you start serializing objects in your application.

There are also performance considerations for our applications. The operation of serializing objects is not free and depending on the language and design there are potentially significant costs. We may see spikes in CPU utilization if we’re dealing with complex nested objects or large amounts of data. Memory consumption could go up due to the need to create temporary buffers for the data before it is written to disk. Performing disk writes or network transfers can introduce latency and slow down our application. We can mitigate some of these performance issues by using efficient libraries and being careful about the size of objects we try to log. Additionally, we can use techniques like batching logging operations or performing the serialization and writes asynchronously to reduce the impact on application performance. Continually monitoring and tuning logging performance is important.

JSON to the rescue

Now envision a world where you’re on that P1 call and you reach for your server logs. They are nicely aggregated and parsed and you start to query them based in client ids for devices reporting issues. You immediately see there’s an attribute missing from the response payload to impacted clients and are able to trace the issue to a recent release. You roll it back and raise a ticket to implement API contract testing! Crisis averted and everyone can go back to bed.

JSON logging is a powerful tool that can help us understand our application and fix problems. Like any powerful tool, however, it can be used unwisely at great cost. It gives us flexibility in capturing detailed information about our application and makes it easy for us to use tools to analyze this data. But we need to think carefully about the potential pitfalls, like log volume, security risks, and impact on application performance. By managing log size carefully, putting good filtering systems in place, and making our serialization and write operations as efficient as possible, we can really take advantage of JSON logging while keeping the drama to a minimum.

--

--