Tech Blog: Create Meaningful Logging

Andrian Gandawida
5 min readAug 28, 2023

--

Bad things always happen in production. To fix them, we need monitoring service to help us understand what really happened. Logging is one of the most powerful tools to monitor your application, but be sure that your log is meaningful, otherwise it will just be an unusable text that slowly filling up your disk.

Meaningful Log will tell you Story

Your log need to be able to tell you what really happen in your application. Especially if it is an error. These are several questions your log need to answer:

json logging example

What is this about?

This is very important that your log is logging an actual message. Its already a common practice to always return an error message if your application/function/sdk failed to do something. Error message must be clear and describe what went wrong.

How bad is it?

If you are logging an error, your log must be able to tell you their severity level. Is it just info, warning, or error. This is used as a base to determine SLA or fixing time. I usually also include error type data such as:

  • Bad Request, error caused by client request. No need to worry.
  • Internal Server Error, error caused by our server. You need to find what causing this error since your client journey might be broken.
  • Fatal Error, error that make your server down. You need to fix this ASAP.

Where it happened?

If you are logging an error, you also need to know where that error is happened. What environment your server is running. If you run a cluster, you can also log your cluster name and IP. You also need to log entry-point of your log, if it happen in HTTP server apps, you can log your API Endpoint. This is important to understand whether your error is happened because of your apps logic or your machine.

To isolate the error even more, you can also log function trace (list of function name that this process is going through). And even its code line.

Can we reproduce?

To fix an issue and find the root cause of a bug, we also need to be able to reproduce the error. If your application is making a request to external service and causing error, you need to log the error message, as well as your request parameter to that external service. By knowing the request, we can then reproduce the same request in our local environment to debug further.

If you do multiple logging in one incoming request, you can also log your request ID. This is just an identifier to indicate that your log is coming from the same request. You can make random request ID for each request, or retrieve them from your client.

An Example Implementation

To better understanding about logging, I will try to explain it with my actual implementation using backend service written in golang.

My example is http server using golang native http, custom middlewares, and several layers of application logic. I will do logging whenever error happened.

logic flow: logging when error happen
  • For every incoming http request, context object is passed along the process. We also use common pattern of golang to always pass context in parameter and return reponse & error object.
func (u *Usecase) SomeFunction(ctx context.Context, request int64) (response, error)
  • Log middleware is where we actually execute log command. Request ID middleware is where we generate random request ID or retrieve them from incoming request. Http middleware is where we write http response & status. Layers of handler, usecase, and repository is where the logic of application reside
  • Under log middleware or initial middleware, we can store incoming to context, to later be calculated like this (1)
// saving current time to context
ctx = contextg.WithIncomingTime(ctx, time.Now())
  • Under Request ID middleware, we can store request_id inside context like this (2):
// get request_id from http header
requestID := r.Header.Get("X-Request-ID")
if requestID == "" {
requestID = uuid.NewString()
}
// store request_id to context
ctx = contextg.WithRequestID(ctx, requestID)
  • Under our logic functions, we can store function traces inside context like this (3)(4)(5):
// sending function name to context
ctx := contextg.WithTracer(ctx, contextg.FromFunction(u.SomeFunction))
  • If error happened, we can store many error details to error object that we returned to parent function like this (6):
if err != nil {
// using custom error object to hold more properties
return response, errorsg.WithOptions(err,
errorsg.WithType(ErrorTypeInternalServerError),
errorsg.WithRequest(request))
}
  • Under log middleware, we have access to http request object, context, and error object. So we get all data we need, construct, and sending them to our log service (7):
stdMetadata := make(map[string]interface{})

// basic
stdMetadata["env"] = s.envService.Get()
stdMetadata["process-time"] = s.getProcessingTime(ctx)

// http related logging
stdMetadata["method"] = r.Method
stdMetadata["endpoint"] = fmt.Sprintf("%s://%s%s %s\" ", s.getScheme(r), r.Host, r.RequestURI, r.Proto)
stdMetadata["status-code"] = wrw.Status()

// get metadata from ctx
ctxMetadata := contextg.GetMetadata(ctx)
for key, value := range ctxMetadata {
stdMetadata[key] = value
}

// get metadata from error
if err != nil {
errMetadata := errorsg.GetMetadata(err)
for key, value := range errMetadata {
stdMetadata[key] = value
}
}

msg := fmt.Sprintf("HTTP Request | %v %v | code %v | in %v ms", stdMetadata["method"], stdMetadata["endpoint"], stdMetadata["status-code"], stdMetadata["process-time"])
if err != nil {
s.logService.LogErrorWithMetadata(stdMetadata, msg)
}
  • Thus, we can have a lot of useful info in our log. Example:
{
"level":"error",
"environment":"production",
"hostgroup":"my-service-http",
"IP":"10.33.44.55",
"endpoint":"/v1/my-program/my-endpoint",
"method":"GET",
"status_code":500,
"processing_time":1234,
"request_id":"6847cb17-5b15-4536-a839-98e20d83f539",
"function_traces":"handler.MyEndpoint|usecase.UpdateMyEndpoint|Repository.GetExtraData",
"err.message":"sql insert error around program_id 12",
"err.type":"internal_server_error",
"err.request":{
"id":12,
"status":3
}
}

Conclusion

  • Create meaningful log by making sure that your log is capable to answer these questions: whats is this about? how bad is it? where it happen? and can we reproduce? by answering those question, your log can provide information and can be converted into your action plan.

Side Notes

  • No need to log everything. You can use different strategy to only log warning & error severity. Many log wrapper already include this log level. Too many log will filling up your disk very quickly, especially in high RPS. It also make your server doing extra if you do logrotate. Sending too many log in your log service discovery often means more expensive bills.
  • Be careful with sensitive data. When log a message, make sure that you’re not logging any sensitive data like user’s password or access token.
  • During maintaining your apps, sometimes you will end up with error that not catched by your logger. Your existing log cannot tell you what really happened. If so, sometimes we put temporary log just to catch the issue or confirming our hypothesis. But we cannot make this log run forever because it will flood our log service. In this case, we might need some dynamic log enabler, a config or feature flag that enabling us to choose what log severity to send, logging only in specific node, or in specific endpoint.

--

--