Building a Logging System in a Microservice Architecture with the ELK Stack and Serilog on .NET Core [Part 1]
Have you ever spent 6 hours debugging, only to spend 6 minutes coding the fix?
Or, one beautiful day, you go to the office and open your laptop, and the first thing that catches your eye is a dozen complaint emails from customers about a feature that broke last night. You test that function again, and it works well! So what exactly was going on with the system last night while you were sleeping?
These are just two of the many situations you might have to deal with when your system doesn’t implement logging.
What makes logging an essential part of every application?
Because without it, we lose control of our application. We would know that something is wrong or broken but be unable to figure out exactly what (see picture 2), or at least not without spending an awful lot of time searching for the problem (see picture 1). That wasted time could always be spent in better, more productive, and more strategic ways.
Logging in Microservice Architecture (MSA)
In recent years, the rise of MSA has made logging more important than ever. We can’t deny that MSA offers many benefits, such as independent deployment, easy scaling up and down, the ability to adopt different technology stacks, and more! But using MSA is not easy at all (I don’t want to say that it’s quite difficult…), not only in how services communicate but also in how you manage them. And things get even more complicated when one or more services fail. We can find out which service has failed by applying a Health Check (you can find my article on implementing a Health Check system in MSA here: https://link.medium.com/2p6jP6vaO3), but why did it fail? Under what circumstances did it fail?
I bet you cannot find the answer if you don’t have good, meaningful logging.
What do you need to build a meaningful logging system in MSA?
Note: A good/meaningful logging system is a system that everyone can use and understand. Don’t think that only developers need logging.
Below are items that have helped me when dealing with logging in MSA.
1. Use a unique ID to correlate requests
In MSA, services interact with each other through HTTP endpoints. End users only know about the API contract (request/response) and don’t know exactly how the services work.
For example, your application has a “buy item” function that involves 3 services, as below:
“Buy service” will call “Inventory service” and “Shipping service”. Once the request chain is complete, “Buy service” can respond to the end-user who initiated the request. Let’s say you already have a logging system that captures error logs for each service. If you find an error in “Inventory service”, it would be better if you knew exactly whether the error was caused by “Buy service” or “Shipping service”. If the error message is informative enough, that may be all you need. If it isn’t, the reliable way to reproduce the error is to know all the requests and services that were involved. A Correlation ID is a unique ID that is generated for the initial request and passed along on every downstream call. Once you implement it, you only need to look for that ID in the logging system, and you will get all the logs from the services that took part in the main request.
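To make the idea concrete, here is a minimal sketch in Python. It is not the setup from my project: the three services are plain functions instead of HTTP endpoints, the header name `X-Correlation-ID` is an assumed convention, and `LOG_STORE` is a hypothetical stand-in for the logging system.

```python
import uuid

CORRELATION_HEADER = "X-Correlation-ID"  # assumed header name, not from the article

LOG_STORE = []  # hypothetical stand-in for the central logging system


def log(service, correlation_id, message):
    """Record one log entry tagged with the correlation ID."""
    LOG_STORE.append(
        {"service": service, "correlation_id": correlation_id, "message": message}
    )


def get_or_create_correlation_id(headers):
    """Reuse the caller's ID if present; otherwise start a new request chain."""
    return headers.get(CORRELATION_HEADER) or str(uuid.uuid4())


def inventory_service(headers):
    cid = get_or_create_correlation_id(headers)
    log("inventory", cid, "reserving item")


def shipping_service(headers):
    cid = get_or_create_correlation_id(headers)
    log("shipping", cid, "scheduling delivery")


def buy_service(headers):
    cid = get_or_create_correlation_id(headers)
    log("buy", cid, "buy item requested")
    # Forward the same ID on every downstream call.
    downstream_headers = {CORRELATION_HEADER: cid}
    inventory_service(downstream_headers)
    shipping_service(downstream_headers)
    return cid


def logs_for(correlation_id):
    """Return every entry that belongs to one request chain."""
    return [e for e in LOG_STORE if e["correlation_id"] == correlation_id]
```

Calling `cid = buy_service({})` and then `logs_for(cid)` returns the entries from all three services for that one request, which is the lookup described above.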
2. Centralize Logging data in one place
Bear in mind that you can deploy your services on different servers, but don’t do the same with your logs.
Your application usually gains features as time goes by, and along with them many new services are created (my project started with 12 services, and now we have 20). These services could be hosted on different servers. Imagine what happens if you store logs on different servers: you have to access each individual server to read its logs and then try to correlate problems. By centralizing logging data in one place, you instead have everything you need in one dashboard. It saves you a lot of time.
3. Define the format for logging
Applying MSA allows you to use a different technology stack for each service. For example, you can use .NET Core for Buy service, Java for Shipping service and Python for Inventory service. However, this also affects the log format of each service. It gets even more complicated because some logs need more fields than others.
Based on my experience, I’d like to suggest JSON as a standard format for logging data. JSON allows you to have multiple levels for your data so that, when necessary, you can get more semantic info in a single log event.
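As a rough illustration of what “multiple levels” means, the sketch below writes one log event as a single JSON line. The field names (`timestamp`, `level`, `service`, `message`, `context`) are only an example layout I made up for this post, not a standard.

```python
import json
from datetime import datetime, timezone


def to_json_log(level, message, service, **context):
    """Serialise one log event as a single JSON line.

    The field names are illustrative, not a standard.
    """
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "service": service,
        "message": message,
        "context": context,  # nested, optional details for this event
    }
    # default=str keeps the call from failing on values json cannot encode.
    return json.dumps(event, default=str)


line = to_json_log(
    "ERROR",
    "Item out of stock",
    service="inventory",
    order={"id": 1234, "items": [{"sku": "A-1", "qty": 2}]},
)
```

Every service, whatever its language, can produce the same top-level fields, while the nested `context` object carries the extra fields that only some logs need.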
4. Log useful/meaningful data
For me, when I look at a log, I want to know everything! What? When? Where? Even who? Don’t think that I need to know exactly which person caused the problem in order to blame them :) Contacting the right person simply helps you resolve issues quicker. You could log all the data that you get. However, let me give you some specific fields that I’ve used in my project. This might help you figure out what you really need to log.
- When? Time (with full date format). It doesn’t have to be UTC, but the timezone has to be the same for everyone who needs to look at the logs.
- What? Error stack traces. All exception objects should be passed to the logging system.
- Where? Besides the service name (since we are using MSA), we also need the function name and the class or file name where the error occurred. Don’t guess anything; it might waste your time.
- Who? The IP address of the client and the user name, if any. Make sure you don’t use this information to blame your teammates :)
Bear in mind that a logging system is not only for developers. It’s also used by others (system admins, testers…), so you should consider logging data that everyone can use and understand.
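The fields above could be captured roughly as in the following sketch, which uses Python’s standard `logging` module. The class name `FourWsFormatter`, the helper `build_logger`, and the `client_ip` and `user` attributes are names I made up for illustration. A real service would take the client IP and user name from the incoming request.

```python
import json
import logging
from datetime import datetime, timezone


class FourWsFormatter(logging.Formatter):
    """Render a LogRecord as JSON with when/what/where/who fields."""

    def __init__(self, service_name):
        super().__init__()
        self.service_name = service_name

    def format(self, record):
        event = {
            # When: one timezone for everyone (UTC is used here as an example).
            "when": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            # What: the message plus the full stack trace, if there is one.
            "what": {
                "level": record.levelname,
                "message": record.getMessage(),
                "stack": self.formatException(record.exc_info) if record.exc_info else None,
            },
            # Where: service, file, function and line of the logging call.
            "where": {
                "service": self.service_name,
                "file": record.filename,
                "function": record.funcName,
                "line": record.lineno,
            },
            # Who: hypothetical attributes passed through `extra=`.
            "who": {
                "client_ip": getattr(record, "client_ip", None),
                "user": getattr(record, "user", None),
            },
        }
        return json.dumps(event)


def build_logger(service_name, stream=None):
    """Create a logger that writes JSON events to `stream` (stderr by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(FourWsFormatter(service_name))
    logger = logging.getLogger(service_name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

Inside an `except` block, a call such as `logger.exception("reserve failed", extra={"client_ip": "203.0.113.7", "user": "alice"})` then produces one event that answers all four questions.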
5. Be careful when storing personally identifiable information (PII) of your end-users
Sometimes you log requests from end-users that contain PII. Be careful: storing it might violate GDPR.
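One common precaution is to mask sensitive values before a request is written to the log. The sketch below is only an illustration: the field names in `SENSITIVE_KEYS` are hypothetical, the e-mail pattern is deliberately simple, and masking like this does not by itself make a system GDPR compliant.

```python
import re

# Hypothetical list of sensitive field names; adjust it to your own API contract.
SENSITIVE_KEYS = {"password", "email", "phone", "credit_card"}

# Deliberately simple pattern for e-mail addresses inside free text.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(value):
    """Return a copy of a request payload with sensitive values masked."""
    if isinstance(value, dict):
        return {
            key: "***" if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        # Mask e-mail addresses embedded in free text.
        return EMAIL_PATTERN.sub("***", value)
    return value
```

The function returns a new structure and leaves the original request untouched, so the service can keep working with the real values while only the masked copy goes to the log.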
Logging approaches in MSA
There are two approaches to logging in MSA: each service implements the logging mechanism by itself, or all services use one central logging service. Both have good and not-so-good points, and I’m using both in my project.
1. Implement logging in each service
With this approach, we can easily define the logging strategy/library for each service. For example, for a service written in .NET we can use log4net, while for a service written in Java we can use Log4j…
The problem with this approach is that it requires each service to implement its own logging methods. Not only is this redundant, but it also adds complexity and increases the difficulty of changing logging behaviour across multiple services.
2. Implement central Logging service
If you don’t want to implement logging in each service separately, you can consider implementing a central logging service. This service will help you with processing, formatting and storing log data.
This approach might help reduce the complexity of your application. However, you might lose log data if that service is down.
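To show where the data loss comes from, and one possible way to soften it, here is a small sketch of a client that buffers events while the central service is unreachable. This is my own assumption for illustration, not what my project does: `CentralLogClient` is a hypothetical name, and `send` stands for whatever call delivers one event to the logging service and raises `ConnectionError` when that fails.

```python
from collections import deque


class CentralLogClient:
    """Sends events to a central logging service, buffering while it is down."""

    def __init__(self, send, max_buffered=1000):
        self._send = send  # callable that delivers one event; raises on failure
        # Bounded buffer: when it is full, the oldest events are dropped.
        self._buffer = deque(maxlen=max_buffered)

    def log(self, event):
        """Queue the event, then try to deliver everything that is pending."""
        self._buffer.append(event)
        self.flush()

    def flush(self):
        """Deliver buffered events in order; stop at the first failure."""
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except ConnectionError:
                return False  # service is down; keep events for the next attempt
            self._buffer.popleft()
        return True

    @property
    def pending(self):
        """Number of events still waiting to be delivered."""
        return len(self._buffer)
```

The buffer is bounded on purpose, so a long outage still loses the oldest events. It only narrows the gap; it does not remove the dependency on the central service.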
In this part, we quickly walked through the importance of logging, especially in MSA, and what it takes to build a meaningful logging system. In the next part, part 2, I will show you step by step how to set up a logging system in MSA using .NET Core, Serilog and the ELK stack.