Recently my team started adopting
cats-effect into production code. There is a great ecosystem of libraries for almost anything. Logging is no exception — there is an awesome library
log4cats that wraps log requests to the IO context. But it lacks one important feature, which we required — an ability to freely pass and modify a logging context (MDC) through the business logic.
In this article, I want to describe the approaches we considered to implement that functionality. If you are the ‘show-me-the-code’ person, you can go straight to the second part, where I’ll show how to build the final solution with the following API:
As you can see, in this example, the logging context is being populated at the lower stages and gets cleared from values specified there, i.e.
stage2 has in its context the
id value populated from ‘upstream’, and
stage3 also has the
id value, but doesn’t contain the
Techniques of context passing
For solving this problem, I come up or either found the following three techniques:
- Implicit variables
- Logging algebra on top of
DynamicVariablemagic (the equivalent of
While discussing the approaches let’s concentrate on two important principles for the SOLID acronym — Single responsibility and Open/Closed principles. We want to design the solution in a way that doesn’t break existing functionality and doesn’t introduce additional concepts into existing logic (besides context manipulations).
This is the simplest, naïve, technique, that relies on Scala’s implicit variables mechanism.
The idea is to define an ultimate context holder class, say,
LogContext, and update all your methods signatures to accept an additional argument —
But this technique has three major flaws.
- You must provide the context to the
Loggermanually since it doesn’t know anything about your implicits.
- Imho, you have less fluent API.
- But what’s more important, that suddenly your interfaces and business logic get introduced to the new concept — logging context.
Having done that you sing an evil pact to support this for the rest of your life. Imagine the whole codebase becomes interleaved with this concept, each interface and implementation ‘knows’ about logging. You won’t be able to refactor it without spending hours if not days, also you may have issues with multiple implicits. And… your code is not scalable either: say, you need to add a new concept — metrics context.
To summarize, implicit variables — is a simple technique for passing context, but it isn’t safe, since you may forget to pass updated value, and it infects your nice and clean API.
Since you are already in a functional space, you could go deeper into FP jungles and encode the logging using available there approaches.
In short, the idea is to use
IOLog[A] = WriterT[IO, Vector[LogEntry], A] instead of plain
IO[A]. Instead of calling logger directly, you would ‘write’ log entries to the
WriterT context, and at some point of your application you would use an interpreter for the logging language (
WriterT[IO, Vector[LogEntry], A]) to interpret it to the
IO[A], which would call the logger methods.
There is an article that explains this approach in the details.
The good part is that your logging becomes testable. You have access to the
LogEntries being created during execution since they are buffered in the
Although, I see four major problems with this approach.
- Now you have to use the
IOLogwhere you used the
IO, so all your for-comprehensions operate on another monad. This means, that you cannot use the plain
IOmethods, you have to
liftthem to the new context. Let’s look at the example:
A function that is not in the
IOLog context has to be
lift'ed. Imaging adding support for such logging in a big project. And IMHO, those
lift’s reduce the readability of your code.
2. You have to manually construct the
LogEntries, together with the required MDC, and it doesn’t satisfy the requirement to populate the context between calls.
3. You give up an ability to log the time of an event (unless your events contain the time). This happens because you are not logging the event directly, but rather put it into
WriterT buffer and log everything in a batch.
4. The buffer may grow huge and consume a lot of memory, so you have to be cautious and pick logging points wisely.
DynamicVariable is a wrapper of Java’s
ThreadLocal. ThreadLocal allows you to create special variables, modification of which can only be seen by the thread which makes those modifications. This mechanism can be used to bypass some interface restrictions and pass additional arguments to methods, or put values which should not/cannot be seen directly:
- In general, this is a very bad idea to use ThreadLocals for such cases and it’s a good indicator of a ‘smell code’. DynamicVariable/ThreadLocal is a dangerous mechanism since the compiler cannot restrict usage or ensure that invariants hold.
- DynamicVariable should be wrapped into additional logic to populate the context between threads whenever required. Typically, this means the usage of Aspects.
In case you don’t know what does
context shift do in terms of IO, it yields the thread for other tasks and asks for other threads to continue interpreting the original IO. I recommend you to watch this talk by Fabio Labella.
But in spite of its danger, DynamicVariables and ThreadLocals find their niche at such tasks as transactions, logging, metrics collection, in fact, MDC is itself a ThreadLocal storage.
In the second part, I’ll show how to implement the LoggingContext based on the third approach since it doesn’t pollute interfaces and for-comprehensions, and all business logic can be completely unaware of logging context (except the cases when they set one).