Logging at correct level
Logging seems so trivial but is generally done in a very haphazard manner. Logs can act like a novel which explain what is happening at a very detailed level, if we change the log level, they can act as a small story in the children’s book. It can also act like a new bulletin if we change the log level further. Depending on our needs we should be able to change the level and get a snap shot of what is happening in the system and for this few things are necessary for the logs:
1. Logged at correct level
2. What is happening
3. Why it is happening
Logging at correct level
DEBUG
, INFO
, ERROR
, SEVERE
, etc. I have seen people don't think much about the level and follow a simple rule, if they want to log something they put in in INFO
, if it's because of an exception then it get's logged as ERROR
. This approach is flawed. It doesn't give you a perspective at different level about what is happening in the system. Log levels doesn't have to be the one designed by the platform. You can have your own levels and almost all platforms provide a way to configure the levels.
SEVERE
is a major breakdown in the system. At this stage the system or a sub-system cannot function properly. This is in irrecoverable error. SEVERE
logs are your incidents. You have to fix it right now. PagerDuty alerts should be triggered and people should woken up at 3 am in the morning.
ERROR
is a recoverable problem. A connection to database could not be made - no problem, let's try it again. It's obvious from the last statement that ERROR
have a potential to become SEVERE
, not always, but some of them they may. You should address them on priority. Don't wake up people 3 am, but take care of them as soon as possible. The question then rises, should you log exceptions as ERROR
? Well, it depends. A MalformedURLException
caused by a system property could be an ERROR
but the same exception on the data supplied by the user will not be an ERROR
.
ERROR
is a system level exception. It's not a request level exception. Some example of ERROR
- System property is missing or incorrect and fallback is not present
- Precondition (assertion) for a method are not met, not there is a possible bug in the code
- Services that are unavailable but retry can be made (these may soon become
SEVERE
if retries fail)
WARNING
has a fallback. It's a FYI message. You can choose not to act on it, but make a note of it. They look something like this:
- Could not find the path for backup. Using the default directory of the system
- Failed to parse the property. Checking in database.
- Could not export content of user. Skipping. (this of course depends on if you can safely skip)
INFO
messages are your stories. They tell you exactly what the system is doing. They tell you what path the system took to achieve certain result. There are the milestones in the system. Don't confuse them with tracing the method calls. A story becomes boring if you say:
The fox took three steps ahead, took a left turn moved few more steps, then turned right walked few more steps, then dipped it’s face in water.
It’s interesting when you just say the fox drank some water when you just want to know what’s happening. That’s a milestone.
DEBUG
and TRACE
should be used to log method parameters, trace exact method calls. These are useful when you are not sure why the system is behaving in a certain way. And that's why of course the name: DEBUG
. INFO
is giving you a snapshot of the system. You don't really care how it reached there because you know how it reached there. Use DEBUG
and TRACE
when you are lost.
TRACE
or FINE
I'll say is more finer level of detail, you can add method calls being made, parameters being passed. Reserve DEBUG
only for more information on INFO
message - perhaps what input is passed or what is the current value any property.
Levels are in a way conventions followed. They may be different across organizations, but having them consistent across organization is important otherwise your logs will be a mess.