Not Functional Logging context (p. I)

Vladislav Gutov
5 min readFeb 12, 2020

--

Recently my team started adopting cats-effect into production code. There is a great ecosystem of libraries for almost anything. Logging is no exception — there is an awesome library log4cats that wraps log requests to the IO context. But it lacks one important feature, which we required — an ability to freely pass and modify a logging context (MDC) through the business logic.

In this article, I want to describe the approaches we considered to implement that functionality. If you are the ‘show-me-the-code’ person, you can go straight to the second part, where I’ll show how to build the final solution with the following API:

As you can see, in this example, the logging context is being populated at the lower stages and gets cleared from values specified there, i.e. stage2 has in its context the id value populated from ‘upstream’, and stage3 also has the id value, but doesn’t contain the value from stage2.

Techniques of context passing

For solving this problem, I come up or either found the following three techniques:

  1. Implicit variables
  2. Logging algebra on top of IO (an IOLog monad)
  3. Scala’s DynamicVariable magic (the equivalent of ThreadLocal in Java)
Single responsibility principle. Photo by https://unsplash.com/photos/OopPIi_A428

While discussing the approaches let’s concentrate on two important principles for the SOLID acronym — Single responsibility and Open/Closed principles. We want to design the solution in a way that doesn’t break existing functionality and doesn’t introduce additional concepts into existing logic (besides context manipulations).

Implicit variables

This is the simplest, naïve, technique, that relies on Scala’s implicit variables mechanism.

The idea is to define an ultimate context holder class, say,LogContext, and update all your methods signatures to accept an additional argument — LogContext.

But this technique has three major flaws.

  1. You must provide the context to the Logger manually since it doesn’t know anything about your implicits.
  2. Imho, you have less fluent API.
  3. But what’s more important, that suddenly your interfaces and business logic get introduced to the new concept — logging context.

Having done that you sing an evil pact to support this for the rest of your life. Imagine the whole codebase becomes interleaved with this concept, each interface and implementation ‘knows’ about logging. You won’t be able to refactor it without spending hours if not days, also you may have issues with multiple implicits. And… your code is not scalable either: say, you need to add a new concept — metrics context.

To summarize, implicit variables — is a simple technique for passing context, but it isn’t safe, since you may forget to pass updated value, and it infects your nice and clean API.

Logging algebra

Since you are already in a functional space, you could go deeper into FP jungles and encode the logging using available there approaches.

In short, the idea is to use IOLog[A] = WriterT[IO, Vector[LogEntry], A] instead of plain IO[A]. Instead of calling logger directly, you would ‘write’ log entries to the WriterT context, and at some point of your application you would use an interpreter for the logging language (WriterT[IO, Vector[LogEntry], A]) to interpret it to the IO[A], which would call the logger methods.

There is an article that explains this approach in the details.

The good part is that your logging becomes testable. You have access to the LogEntries being created during execution since they are buffered in the WriterT context.

Although, I see four major problems with this approach.

  1. Now you have to use the IOLog where you used the IO, so all your for-comprehensions operate on another monad. This means, that you cannot use the plain IO methods, you have to lift them to the new context. Let’s look at the example:

A function that is not in the IOLog context has to be lift'ed. Imaging adding support for such logging in a big project. And IMHO, those lift’s reduce the readability of your code.

2. You have to manually construct the LogEntries, together with the required MDC, and it doesn’t satisfy the requirement to populate the context between calls.

3. You give up an ability to log the time of an event (unless your events contain the time). This happens because you are not logging the event directly, but rather put it into WriterT buffer and log everything in a batch.

4. The buffer may grow huge and consume a lot of memory, so you have to be cautious and pick logging points wisely.

Dynamic Variable

Scala’s DynamicVariable is a wrapper of Java’s ThreadLocal. ThreadLocal allows you to create special variables, modification of which can only be seen by the thread which makes those modifications. This mechanism can be used to bypass some interface restrictions and pass additional arguments to methods, or put values which should not/cannot be seen directly:

  1. In general, this is a very bad idea to use ThreadLocals for such cases and it’s a good indicator of a ‘smell code’. DynamicVariable/ThreadLocal is a dangerous mechanism since the compiler cannot restrict usage or ensure that invariants hold.
  2. DynamicVariable should be wrapped into additional logic to populate the context between threads whenever required. Typically, this means the usage of Aspects.
Context population on IO.shift

In case you don’t know what does context shift do in terms of IO, it yields the thread for other tasks and asks for other threads to continue interpreting the original IO. I recommend you to watch this talk by Fabio Labella.

But in spite of its danger, DynamicVariables and ThreadLocals find their niche at such tasks as transactions, logging, metrics collection, in fact, MDC is itself a ThreadLocal storage.

In the second part, I’ll show how to implement the LoggingContext based on the third approach since it doesn’t pollute interfaces and for-comprehensions, and all business logic can be completely unaware of logging context (except the cases when they set one).

--

--