Who reads logs?
Logs are a ubiquitous aspect of software. Over time they have evolved from the main interface of programs into a sadly neglected side effect of applications. By understanding who reads them and what those readers expect of logs, we can adopt a better approach, one that not only increases the control we have over our software products but, when applied correctly, also saves time, money and frustration and reduces risk.
This article addresses software teams, especially those concerned with the development part of the software lifecycle: developers, architects, team leaders and product managers.
I myself am a hands-on architect, a consultant and an agile evangelist.
In 2005 I faced a job-interview question: what is the common basis that all software infrastructure should provide and that most software takes for granted? In the discussion we counted a few things, like configuration, interfaces and, obviously, logs. Rightly so. The log is so fundamentally basic that tutorials for most languages start with printing “hello world” using print, echo, system.out, console, and similar stdout facilitators.
Hello, developer, have you ever wondered whom you were greeting?
Logs are so fundamentally basic that we have come to expect them as a side effect of every piece of code we run. What concerns me is that too often logs are expected to be irrelevant background noise that nobody should ever have to read, and are therefore treated as a kind of feature that should be suppressed by default or turned off altogether.
Maybe they are irrelevant for most users. Maybe we have come to a world where delivered software is expected to work and perform indefinitely and without maintenance, keeping its users completely oblivious to everything under the hood, and that’s a good thing. However, the few who still have to read logs, albeit a fraction of a percent of the users, deserve serious consideration, especially because they are the ones the wellbeing of the product depends on.
Stakeholders of Logs
While it’s impossible to make observations about the output of all running code, whatever it is, I will still make a lean, generic attempt here and list the stakeholders of logs.
I confess to a bias towards servers, so be warned. I’m also biased towards developers, since it is mostly their engagement and attitude I wish to nourish. But not only theirs: Product Managers, Program Managers, QA and Success Managers all have a part in how logs look, what role they play, and how much to invest in them.
Logs start with the developer. Developers read logs when they develop. They do get to read logs after they deliver — but this scenario is unfortunately less common than it should be. It’s like that code-quality paradox:
Well-made code, by definition, never gets appreciated. Software is best appreciated when nobody has to look at its code. If somebody has to look inside, something has already gone wrong: it’s either not working as expected or not doing enough.
It’s the same with logs, but worse. If somebody is reading your logs — it’s safe to assume they are already in distress. The least you could do is make reading pleasant for them.
The big problem is when developers do not comprehend that logs are part of their product. Many developers believe that logs are some side-effect, some dev-tool or some regulatory requirement. But logs have end users that deserve consideration.
Even if your logs do not address your end-users — they ARE used by somebody, even if it’s currently just you. Give the reader a good break!
Whenever Ops are involved, mostly when servers are at stake, Ops read logs to make sure the service they just deployed for you started OK. Sometimes they also make sure it shut down gracefully. And they read logs when some KPI falls out of bounds, the troubleshooting know-how does not help, and they have to make do on their own.
Sometimes these Ops are not deploying for you — they deploy your distribution to their on-premise environment — and there your situation is far worse. Any hiccup in your logs results in far greater mountains of frustration, without you there to tell them — hey, it’s just a warning…
Support & Customer Success
Support Engineers, Account Managers or, by their modern name, Customer Success, read logs too. Whether it’s to search for an error a customer complains about in your production environment, or to help with a problem in an on-premise distribution, they dig their noses in. You’ll be amazed what stories they make of the logs they read, and how far from the truth these stories can get.
At the end of the day they are thinking minds with a job: to make sense of all that and build a narrative that will help them find a solution to a user’s problem. But you don’t have time to explain every log line to them and when it occurs, and they don’t ask. You can’t afford to be unclear with these guys. They tell their stories to users.
On some occasions, users read logs. The obvious example is CLI tools. It’s true that these days most tools are operated through a GUI, but even with a GUI, who among us has never been curious enough to click the “more info” button that some installers provide?
Well, it does not end there. Let’s just acknowledge that a service is a CLI tool that binds IO: web ports, TCP connections, message-queue connections, etc. Even if successful processing is taken for granted to the point that it needs no mention in the log (and all messages are processed successfully), the startup and shutdown of a service should still have the same status as the output of a CLI tool.
Big Data Analysts
These guys look for patterns and cling to whatever they find to construct their narrative and back it with data. Often they find themselves with petabytes of log data, and a requirement regarding some business value to pull out of it.
They might just as well miss the “dfdfdfdffff” you spat to the screen, or find it and keep to themselves what they think of it. But they are also thinking minds with a job to build a narrative and interpretations of their own, and with salaries that compete with developers’. With these guys, ambiguity costs more than unclarity…
Testers & QA
QA comes to determine if the software-under-test qualifies or not. No middle.
They enter the circle much sooner than Ops and Support, and the agile evangelist in me tends to consider them developers, but they still deserve a section of their own. I count them here, after Support and Data Analysts, because their use case is a mash of both: they browse through the outputs of the software, which often include logs, and interpret them to build a narrative they can use to back their claims that the software works as expected (…or unfortunately not).
In the better scenarios QA are indeed a part of the development team and know personally every log line — but this is sadly yet to become the norm.
In the even better scenarios, QA have specific, concrete log requirements: entries the software is expected to emit in such and such cases. Sadly, as a whole, we’re not there yet as an industry. PMs, Ops and QA do not know to ask for log entries, and developers often neglect to provide them. Thus, alas, in the majority of cases logs are left unmanaged and unchecked, hanging on good will and personal initiative.
As a developer you want them to be able to find problems for you before they get too far and cause damage. As QA you want to be able to trust these logs. As a PM you want to ensure they exist.
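To make the idea of a concrete log requirement tangible, here is a minimal sketch in Python of a test that fails unless a specific log entry is emitted. The logger name "orders", the function process_order and the message wording are all hypothetical, invented for illustration; only unittest's assertLogs is a real standard-library facility.

```python
import logging
import unittest

logger = logging.getLogger("orders")  # hypothetical logger name


def process_order(order_id, amount):
    """Hypothetical unit under test: logs the outcome of handling one order."""
    if amount <= 0:
        logger.error("order rejected: id=%s reason=invalid_amount", order_id)
        return False
    logger.info("order accepted: id=%s amount=%s", order_id, amount)
    return True


class OrderLogRequirements(unittest.TestCase):
    def test_rejection_is_logged_as_error(self):
        # assertLogs fails the test if nothing at ERROR or above is logged
        # on the "orders" logger inside the with-block.
        with self.assertLogs("orders", level="ERROR") as captured:
            result = process_order("A-17", 0)
        self.assertFalse(result)
        # The requirement itself: exactly this entry, in assertLogs' default
        # "LEVEL:logger:message" rendering.
        self.assertEqual(
            captured.output,
            ["ERROR:orders:order rejected: id=A-17 reason=invalid_amount"],
        )
```

Saved in a test module, this can be run with python -m unittest. Pinning the exact wording is a design choice: it makes the log line part of the contract, at the price of a test update whenever the message changes.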
Machines read logs too. This is a general term for batch jobs, ETLs, data reducers and even AI. Many of them are directed and/or programmed by analysts, and sometimes by developers themselves.
In the better cases there are bots that are put there by Ops and do some live monitoring. In the best scenario, the title DevOps is understood as it should be: Ops coded by developers (unlike Ops who develop automations, which is better than manual rollouts but still not what DevOps was really meant to mean), and all those BI and monitoring pipelines are handled close to home, by bots put in place by the same team that writes the logs.
But although this scenario gets more and more common — it’s sadly still very far from the norm. In glorious cases — logs of CI-envs and QA are scrutinized by automation bots provided by QA specialists.
The takeaway for this stakeholder, if you wonder, is that making log entries recognizable and traversable for machines pays off a great deal.
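One common way to make entries traversable for machines is to emit one JSON object per line. The following is a minimal sketch using Python's standard logging module; the field names, the "checkout" logger and the convention of passing structured data under a "fields" key are assumptions made for this example, not an established standard.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        entry = {
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"fields": {...}}.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def build_logger(name, stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the example output in one place
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


# Usage: write one entry to an in-memory stream, then read it back as data.
buffer = io.StringIO()
log = build_logger("checkout", buffer)
log.info("purchase_completed", extra={"fields": {"order_id": "A-17", "total": 42.5}})
parsed = json.loads(buffer.getvalue())
```

A bot, or an analyst's script, can now select entries by parsed["event"] or parsed["order_id"] instead of guessing at free text with regular expressions.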
Now we have to face the elephant in the room (at least this one, anyway):
Hackers read logs. White hackers, black hackers, purple hackers — all of them.
They deserve a section of their own because they are not really interested in whether your software works as expected, but in what valuable information they can pick up. Sometimes the information itself is the goal of the hack. At other times it’s operational secrets required to support further hacking, and in other cases they read logs to detect applicable exploits.
This is to say that logs are often (sadly) a channel through which sensitive data leaks out. Sensitive data ranges from operational secrets such as API keys and DB passwords, through private user data (regulated or not), to indications of exploitable vulnerabilities.
However, hackers are not true stakeholders of logs. The stakeholders are the users, the developers (and QA), Ops, and the business owners. As such, their requirement is to minimize the risk that logs imply, if not eliminate it entirely.
Having examined the stakeholders of logs and understood their positions, we can boil them down to the following use cases:
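As a minimal sketch of reducing that risk, a logging filter can mask secret-looking values before an entry is written. The pattern below, the "db" logger and the sample message are hypothetical; a real project would maintain its own list of sensitive keys, and masking is a safety net rather than a substitute for not logging secrets in the first place.

```python
import io
import logging
import re

# Hypothetical pattern: key=value pairs whose key looks sensitive.
SECRET_PATTERN = re.compile(r"(?i)\b(password|api_key|token)=(\S+)")


class RedactSecrets(logging.Filter):
    """Mask secret-looking key=value pairs before a record is emitted."""

    def filter(self, record):
        rendered = record.getMessage()  # message with its arguments applied
        record.msg = SECRET_PATTERN.sub(r"\1=***", rendered)
        record.args = None  # the message is already fully rendered
        return True  # keep the record; only its text was changed


# Usage: attach the filter to a handler so every entry passes through it.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactSecrets())

logger = logging.getLogger("db")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.info("connecting with user=%s password=%s", "app", "hunter2")
output = stream.getvalue()
```

Here output holds the entry with the password replaced by ***, while the user name stays readable for whoever is troubleshooting.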
- validate correct execution
This happens in QA (no matter who does it, developers or QA engineers), in the version-evaluation phase (which is also a kind of QA), or even during development.
- monitor a known KPI
The KPI could be technical, like requests per second or errors per hour.
Or it could be a business goal, like purchases completed or registrations.
This happens during the service time of the version, and is increasingly becoming a standard requirement of the product.
- KPI research
Whether applied by man or machine, this happens as log data accumulates over time in volumes that can generate insights, often leveraging AI models. The goal here is either to discover valuable KPIs that can later be monitored, or, for a stated business goal, to find existing evidence in the logs that can be reduced to a KPI that can be monitored.
This is the least interesting case here because it happens after the fact, but we should still keep it in mind for the future.
- troubleshoot a root cause of a problem
This happens both during the service time of the version and at version-validation time. The interest here is usually concrete and specific, even if the problem is observed in recurring cases.
- safe, secure and with minimal cost
All that comes with the cross-cutting concern of doing it safely, without introducing vulnerabilities or betraying private data. It’s a cross-cutting concern, just like the one that requires all this to be done at minimal cost.
Professionally, all these are formal requirements your product could adopt and aspire to meet, just like any other functional requirement. I will go into more detail on how in other posts.
The logs your software emits are an elementary part of your product.
Be they the single interface your program uses, as in a CLI tool, or a secondary “background noise” that is dumped and forgotten, these logs are accessible to a range of stakeholders and represent your attitude and professionalism. The effectiveness of these logs has significant implications for the wellbeing of the entire team and the success of your endeavor.
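To illustrate the "monitor a known KPI" use case from the list above, here is a minimal sketch in Python that reduces log lines to errors per hour. It assumes, for the sake of the example, that entries are JSON objects with an ISO 8601 "time" field and a "level" field; the sample lines are made up.

```python
import json
from collections import Counter
from datetime import datetime


def errors_per_hour(lines):
    """Count ERROR entries per hour from JSON log lines (one object per line)."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not structured entries
        if not isinstance(entry, dict) or entry.get("level") != "ERROR":
            continue
        # Truncate the timestamp to the hour it falls in.
        hour = datetime.fromisoformat(entry["time"]).strftime("%Y-%m-%d %H:00")
        counts[hour] += 1
    return counts


# Made-up sample: two errors in the same hour, one info entry, one junk line.
sample = [
    '{"time": "2024-05-01T10:05:00", "level": "ERROR", "event": "payment_failed"}',
    '{"time": "2024-05-01T10:40:00", "level": "ERROR", "event": "payment_failed"}',
    '{"time": "2024-05-01T11:02:00", "level": "INFO", "event": "purchase_completed"}',
    "dfdfdfdffff",
]
kpi = errors_per_hour(sample)
```

With this sample, kpi maps "2024-05-01 10:00" to 2. The junk line is silently skipped, which is exactly why unstructured output is invisible to this kind of monitoring.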
As a team, you have to own it and groom it continuously like any other part of your product. Even when it’s not your main interface — it is a part of, well …your face.
This is the first of a series about application-logs and logging culture.
Next in the series: Misconceptions re. Application Logs