Logging — it’s hard to get it right, part #1

Oleksandr Kucherenko
5 min read · Jun 22, 2018

Many developers use Java logging; it’s a reliable component, a “must have” in every solution. It helps in tracking how a solution works, and it helps to answer the main question: does everything work according to design or not? But how many of you have found that after 2–3 months of development the logs become less and less readable? The “noise” inside the logs hides important messages and reduces your ability to maintain the code or develop new features.

I have tried to create a small set of rules that will help organize logging better. It is structured as the top five questions that each app developer should ask at the beginning of a project. For now I keep it abstract, leaving some space for a future implementation in Java & Kotlin (in fact, a lib that satisfies all the requirements described in this article has been developed and successfully used in a project for the last two years, where it helped catch a lot of issues at a very early phase). So let’s start:

(1) All log points should be categorized. Create a list of categories that describes the logic behind them well.

  • How to choose what will become a category?

A Category is a logical grouping of events. Which logical abstractions do we have in a project? Classes? Packages? Maybe something else? The secret is finding the correct answer — it is the “data flow”. Usually, each business process in a project has its own “data flow”, or even several of them. In almost all cases the logic of a “data flow” cuts across several abstraction layers, which is what makes grouping logs so hard. But when you want to track something, you usually want to track a specific “data flow”.

  • What to do if one line of logs belongs to several categories?

The answer is quite simple: the log message should be marked so that it is included in multiple “categories”.
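
One simple way to do this is to tag each event with a set of category flags. The sketch below is a minimal illustration, assuming a hypothetical `Category` enum and `LogEvent` shape — these names are invented for the example, not part of any real lib:

```kotlin
// Hypothetical categories, one per "data flow" in the system.
enum class Category { CHECKOUT, SYNC, AUTH }

// A log event tagged with one or more categories.
data class LogEvent(val categories: Set<Category>, val message: String)

// Keep only the events that belong to the "data flow" we want to track.
fun filterByCategory(events: List<LogEvent>, wanted: Category): List<LogEvent> =
    events.filter { wanted in it.categories }
```

An event such as “re-auth during checkout” would carry both `CHECKOUT` and `AUTH`, so it survives filtering by either category.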

  • What to do with the package/class name? Traditionally it is the filter for Logger calls.

The “data flow” approach changes traditional log filtering. Better to say: by filtering on packages and classes we now control the level of verbosity, not what gets logged.

Good example: in most cases we can skip logging “factory” calls. They do not tell us anything new; they only add noise to our logs.
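
To illustrate, per-class verbosity control could be sketched like this (the class names and level constants are invented for the example; a real lib would plug this into its log dispatch):

```kotlin
// Illustrative level constants (lower = more verbose).
const val DEBUG = 1
const val INFO = 2

data class Entry(val className: String, val level: Int, val message: String)

// Per-class minimum levels control verbosity; unlisted classes stay fully verbose.
fun isVisible(entry: Entry, minLevelFor: Map<String, Int>): Boolean =
    entry.level >= (minLevelFor[entry.className] ?: DEBUG)
```

Raising a hypothetical `WidgetFactory` to INFO silences its DEBUG noise without dropping the “data flow” itself from the logs.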

(2) Different logger levels should be used to increase or decrease the number of log messages.

  • How to determine what to log on each level?

Traditionally, Java logging introduces several levels of logging: FINEST, FINER, FINE, CONFIG, INFO, WARNING and SEVERE.

SLF4J, for example, reduces the number of levels to five: TRACE, DEBUG, INFO, WARN and ERROR.

Android’s Log class works with six levels: VERBOSE, DEBUG, INFO, WARN, ERROR and ASSERT.

I usually assign several responsibilities to each level:
VERBOSE/TRACE — IN/OUT data; allocation/destruction of objects; cache hits; EventBus events; touches;
DEBUG — chain of calls; START/END of a method call; performance; lifecycle;
INFO — change of business state, configuration, amount of data;
ASSERT/WARNING — wrong state, validation failure, captured exceptions, timeouts;
ERROR — wrong parameters, unhandled exceptions;
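
These responsibilities can be encoded as a simple mapping from event kind to level. A sketch with invented `Kind` names, one per responsibility listed above — a convention, not a prescription:

```kotlin
enum class Level { VERBOSE, DEBUG, INFO, WARNING, ERROR }

// A few illustrative event kinds, one per responsibility group.
enum class Kind { CACHE_HIT, METHOD_START, STATE_CHANGE, TIMEOUT, UNHANDLED_EXCEPTION }

fun levelFor(kind: Kind): Level = when (kind) {
    Kind.CACHE_HIT -> Level.VERBOSE            // IN/OUT data, cache hits, touches
    Kind.METHOD_START -> Level.DEBUG           // chain of calls, lifecycle
    Kind.STATE_CHANGE -> Level.INFO            // change of business state
    Kind.TIMEOUT -> Level.WARNING              // wrong state, timeouts
    Kind.UNHANDLED_EXCEPTION -> Level.ERROR    // unhandled exceptions
}
```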

  • Context. What is the context of each log message?

Most projects have an abstraction like User. This is a good candidate for inclusion in the context of each log event. Depending on the tasks you are solving, the amount of data included in the context may differ.

  • Can a log message be self-describing, or should it depend on earlier log messages?

Logs are for humans. So yes, try to keep each log event self-describing and easy to read. But try to avoid duplicating data: the context, for example, can be logged at the beginning of the “data flow”, only once per session.
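
A minimal sketch of “context once per session”, assuming a hypothetical `SessionLog` class that is not part of any real lib:

```kotlin
// Hypothetical session logger: dumps the context once at the start of a
// "data flow", then emits self-describing messages without repeating it.
class SessionLog(context: Map<String, String>) {
    val lines = mutableListOf<String>()

    init {
        // The context is written exactly once, as the first line of the session.
        lines += "CONTEXT " + context.entries.joinToString(" ") { "${it.key}=${it.value}" }
    }

    fun log(message: String) { lines += message }
}
```

Every later message can then stay short and self-describing, because a reader filtering this session already saw the context line.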

(3) Logs should provide all the information required to repeat the same actions in the testing environment.

  • Recover the chain of calls;
  • Dump initial context of the “data flow”;
  • Recover the multi-thread processing picture;
  • Recover the list of classes/packages involved in the “data flow”; a small example of a configuration matrix can simplify understanding.

A small dummy sample of such a matrix (the class names here are invented) could look like this:

              com.app.checkout.Cart   com.app.net.HttpClient   com.app.auth.Session
CHECKOUT              D                        T                        -
SYNC                  -                        T                        -
AUTH                  -                        D                        I

This is a small dummy sample, nothing even close to a real one. In a real system the number of packages and classes can easily jump over 1000 columns. Each cell defines the recommended level of logs: T, D, I, W or E.

(4) Don’t mix the logic of other modules with logging. Very often Statistics and Tracking are mixed with logging in a bad way.

  • Statistics — used for collecting metrics that allow analyzing performance, health and business effectiveness. Logging can be used to dump this data, but log events are not the storage for that information.
  • Tracking — used for tracking user actions and system responses. It should be used for defining “business funnels” and tracking the runtime state of the system. Logging can be used to dump this data too.
  • The primary role of logging is to monitor the execution of the system and detect its error states, helping to debug the states of the logic.
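
The separation above can be sketched as three narrow interfaces, so a call site never abuses the logger as a metrics or analytics store. All names here are invented for the example:

```kotlin
// Each concern sits behind its own interface with a single responsibility.
interface Stats { fun increment(metric: String) }
interface Tracker { fun track(action: String) }
interface Log { fun info(message: String) }

// One business event can still fan out to all three sinks,
// but none of them leaks into another's job.
class CheckoutFlow(
    private val stats: Stats,
    private val tracker: Tracker,
    private val log: Log
) {
    fun complete() {
        stats.increment("checkout.completed")   // metric, not a log line
        tracker.track("checkout_completed")     // funnel event, not a log line
        log.info("Checkout completed")          // human-readable log
    }
}
```

The design choice: swapping the statistics backend or the tracking SDK later touches one interface implementation, never the logging code.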

(5) Reading logs is the most difficult task.

  • Usually only the dump of one “data flow” is needed.
  • Usually a system may be processing thousands of users at the same time, so filtering by context is the most effective way of limiting the scope of analysis.

Global logging configurations often play a bad role. Just imagine you have 100 users passing through the “data flow” defined for logging at the same time — each user request will generate thousands of log events. The more users you have, the more logs will be generated. In most cases we need to log only a specific user (or users). The most advanced logging systems automatically adjust the level of verbosity based on specific triggers, such as a raised exception, a timeout, or a reached “honeypot”.
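
A sketch of such trigger-driven verbosity: a trigger (exception, timeout, “honeypot” hit) raises the level for the affected user only, instead of flipping a global switch. The class and the level numbers are illustrative, not a real API:

```kotlin
// Lower number = more verbose. 2 stands for INFO, 0 for VERBOSE here.
class AdaptiveVerbosity(private val defaultLevel: Int = 2) {
    private val perUser = mutableMapOf<String, Int>()

    // Called when an exception, timeout or honeypot fires for this user.
    fun onTrigger(userId: String) { perUser[userId] = 0 }

    // Everyone else keeps the cheap default level.
    fun levelFor(userId: String): Int = perUser[userId] ?: defaultLevel
}
```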

  • Multi-threaded processing is hard to visualize in a traditional “console” log display. It requires timelines/“swim lanes” for each thread.

So, what do we have in total:

  1. Define the “data flows” in your system and name them correctly.
  2. Identify the packages and classes that participate in each “data flow”. Draw a diagram — it will help to see the picture clearly. Create a matrix — it will show the overlap of the categories.
  3. Put log messages into classes with the specified category flags.
  4. Define the context of log events.
  5. Define the conditions for raising the verbosity of the logs.
  6. While a “data flow” is being passed, logs should run at full verbosity; at the end point the system has to decide which “verbosity filter” to apply and how much to store in the general system logs. This can be treated as logging into temporary storage (memory or disk) and then post-processing with a filter applied.
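
Step 6 can be sketched as a buffer that accepts everything and filters on flush. The class name and the “keep WARNING and above on success” threshold are illustrative assumptions:

```kotlin
// Illustrative level constant; in this sketch 3 stands for WARNING.
const val WARNING = 3

class BufferedFlowLog {
    private val buffer = mutableListOf<Pair<Int, String>>() // (level, message)

    // During the "data flow" everything is accepted at full verbosity.
    fun log(level: Int, message: String) { buffer += level to message }

    // At the end point: on failure keep everything for debugging,
    // on success keep only WARNING and above in the general logs.
    fun flush(flowFailed: Boolean): List<String> =
        buffer.filter { (level, _) -> flowFailed || level >= WARNING }
              .map { (_, message) -> message }
}
```

The buffer can live in memory or on disk; only the flush decision touches the general system logs.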

The next post will be a real-life example of this logging approach… stay tuned.

If I missed something important, feel free to point it out. This is an open discussion of the best approaches for real-life commercial projects.


Full-stack + Android Pro. My dev path: 2x Klarna | Desifer | 2x Spotify | EasyPark | Telia | Veroveli | TrueCaller | ArtfulBits