What is the significance of Centralised Logging for your Organisation?

Purpose of logging

Logging each transaction of your application gives you more visibility and helps you understand your application on a larger canvas.

It helps you understand and explain how your application interacts with internal and external systems, modules and services.

It makes a developer’s life easier when debugging unexpected software crises, such as a crashed third-party service, an overloaded or slow database server, or any other unknown activity.

Traditional approach of collecting transactional activity logs

Writing activity logs to a database table! Ahh, that’s not the right approach anymore. It becomes expensive in terms of both resource cost and user experience, because the same database resources that serve the application are also used to generate complex reports.

Working with traditional code

As a software developer, there will be times when you have to work on old, outdated traditional code (without being given any choice) and get your hands dirty in order to keep your system running.

Stakeholders, product owners or developers might offer plenty of excuses not to refactor the code. Let’s keep that aside!

But why are we talking about old, outdated traditional code?

It is very complex code that has been left untouched for decades, and nobody wants to take ownership of it. That makes it very difficult for any developer to understand it and to introduce new changes into it. Adding sensible logging to the existing code is a good practice for resolving the mystery of traditional code. Once you understand it from the logs, you can make a decision, such as refactoring the code or redesigning the functional workflow.

Now, let’s talk about Clean Code with documentation.

It definitely gives any developer a boost, accelerating development for the given requirements in minimal time. And if proper logging is in place, things become lightning fast for the developer, because it is very simple to understand and visualise the functional workflow without looking into the code.

What’s up with Cloud Computing Gold Rush?

Over the past decade, we have witnessed tech companies moving from managing their own server machines to cloud services.

Now the situation is that everyone demands that everything a software application requires be available on the cloud as a service. There are plenty of examples, such as the AWS artificial intelligence services.

Isn’t it awesome to do high level computing just through the API call?

Yes. It is.

Apart from that, most tech companies have also started adopting a service-oriented architecture, where the display layer is completely separated from the business logic. The only way to communicate with the database is through data-oriented or business-oriented RESTful services, which take care of business logic and data cleaning.

Although building micro-services and decoupling everything helps you minimise the complexity of your application, on the other side I feel we are also increasing the number of layers and failure points. By monitoring them via logs, we need to make sure that those layers communicate with each other efficiently and effectively.

Wait! Serverless computing!!! Huh? I did not see that coming!!!

This is another cloud computing domain that has come into the limelight in the last couple of years. The cloud provider allows you to write a micro service and run it without having an actual running server instance on the cloud (the cloud provider takes care of all that under the hood).

That’s amazing, right? There are no worries about scaling up and down or maintenance work, but the developer needs to make sure that the micro service’s intended behaviour is executed efficiently and without any problems.

And logging will give you that confidence. How? We will see that soon.

Why is logging becoming so important?

The answer is fairly simple: to debug a production issue quickly and act on it with a lightning-fast response before it causes any trouble for end users.

It is a competitive market. If you release bugs to the production server and they become noticeable to your end users, it becomes difficult to regain the trust of those unhappy end users and retain them, as competitors will start building a strategy around those bugs or issues and try to capture those unhappy end users.

Logging, monitoring and alerting significantly help your organisation survive in a competitive market and allow you to lead it as well.

Before doing any production release (deploying new changes on the running server), we do a hell of a lot of testing to make sure that

  • There is no security breach
  • The functional flow of the new changes is working as expected
  • There are no repercussions from the newly introduced changes, etc.

But these are preventive measures, which need to be taken by the developer and QA team by writing more elegant test cases to cover all possible scenarios.

Yeah, you read that right: possible is the word. It is never going to happen that a developer covers all the failure cases.

There are plenty of chances that we miss a number of edge cases while envisioning the requirements, e.g.

  • Lack of knowledge of the business case or of a new area of experimentation
  • Lack of knowledge of a new technology stack or new service
  • No proper understanding of the existing architecture

How is logging going to help us?

It is pretty straightforward: adding the necessary log lines to your application and maintaining the state of the transaction in the logs will help you understand the journey of an event or user request.

Let’s take a simple example, i.e. a Resume Rating Application

This application will help you rate resumes by sorting them into categories. The application will use an AWS Lambda micro service to compute the score of the uploaded resume and build a database using the attributes retrieved from the resume.

Here is a brief description of the application workflow.

  1. Using the website interface, we allow the user to upload a resume.
  2. The application server stores it in an AWS S3 bucket
  3. The AWS S3 bucket triggers an AWS Lambda function (micro service), which does the necessary computation by applying the business rules and logic
  4. The outcome of the AWS Lambda function is stored in the database

In the given workflow, logging plays the following important roles

  1. Logging user activities and interactions on the website interface will help you understand how users interact with your website
  2. Application server logs will help you understand application performance
  3. Logging the stages of the micro service logic will help you understand which business rules or logic contribute most effectively to calculating the score of the uploaded resume.
  4. Logging database operations will help you monitor the performance of the database by easily identifying the bottlenecks behind database slowness

Note: I’m not elaborating on the above logging use cases in more detail, but they should give you a sense of the usefulness of logging.
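Still, a small sketch of the third use case may help. The code below shows roughly how a Lambda handler triggered by an S3 upload could log each stage of its work as one JSON line. Everything named here is an assumption made for illustration: the `resume_rating` logger, the `log_stage` and `score_resume` helpers, the keyword-based scoring rule, and the idea of using the S3 object key as the transaction identifier. Only the `Records[0].s3.bucket.name` and `Records[0].s3.object.key` fields come from the S3 event notification format.

```python
import json
import logging
import time

logger = logging.getLogger("resume_rating")  # hypothetical logger name
logger.setLevel(logging.INFO)


def log_stage(transaction_id, stage, **fields):
    """Emit one JSON log line describing a stage of the transaction."""
    record = {"transaction_id": transaction_id, "stage": stage, **fields}
    logger.info(json.dumps(record, sort_keys=True))
    return record


def score_resume(text):
    """Placeholder business rule: one point per keyword found in the text."""
    keywords = ("python", "aws", "elasticsearch")
    lowered = text.lower()
    return sum(1 for word in keywords if word in lowered)


def handler(event, context):
    """Entry point invoked by the S3 upload notification."""
    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    key = s3_info["object"]["key"]
    transaction_id = key  # assumption: the object key identifies the upload

    log_stage(transaction_id, "received", bucket=bucket)

    started = time.perf_counter()
    # A real function would download the object from S3 and parse it;
    # the resume text is hard-coded here to keep the sketch self-contained.
    score = score_resume("Python developer with AWS experience")
    elapsed_ms = round((time.perf_counter() - started) * 1000, 3)

    log_stage(transaction_id, "scored", score=score, elapsed_ms=elapsed_ms)
    return {"transaction_id": transaction_id, "score": score}
```

Because every line carries the same `transaction_id`, the stages of one upload can later be filtered and lined up next to the logs from the other components.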

Now imagine being able to see the journey of a single resume upload transaction by building the required visualisation on top of the available logs.

It will give you clarity on how that single journey impacts all the components of the architecture.
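To make the idea of a "journey" concrete, here is a minimal sketch that groups log events from different components by a shared transaction identifier and orders them by time. The field names (`transaction_id`, `timestamp`, `component`, `stage`) and the `build_journeys` helper are assumptions for illustration; a real setup would let Elasticsearch and Kibana do this grouping.

```python
from collections import defaultdict


def build_journeys(events):
    """Group log events by transaction and order each group by timestamp.

    Timestamps are assumed to be ISO 8601 strings in the same format,
    so sorting them as plain strings gives chronological order.
    """
    journeys = defaultdict(list)
    for event in sorted(events, key=lambda e: e["timestamp"]):
        step = (event["component"], event["stage"])
        journeys[event["transaction_id"]].append(step)
    return dict(journeys)


# Hypothetical log events collected from three components
events = [
    {"transaction_id": "tx-1", "timestamp": "2018-09-01T10:00:02Z",
     "component": "lambda", "stage": "scored"},
    {"transaction_id": "tx-1", "timestamp": "2018-09-01T10:00:00Z",
     "component": "web", "stage": "upload_received"},
    {"transaction_id": "tx-2", "timestamp": "2018-09-01T10:00:05Z",
     "component": "web", "stage": "upload_received"},
    {"transaction_id": "tx-1", "timestamp": "2018-09-01T10:00:01Z",
     "component": "app_server", "stage": "stored_in_s3"},
]

journeys = build_journeys(events)
```

Here `journeys["tx-1"]` lists the web, application server and Lambda steps in the order they happened, which is the raw material for a per-transaction visualisation.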

You can easily find out the performance of the application and database servers during peak time and make the necessary decisions accordingly.

There are more benefits to logging when you build multiple visualisations and look at the logging data from various angles.

Structure or Schema of a log line

The most important part of logging is having a well-formatted and well-structured log line that is easily understandable by the human brain.

If you have ever seen Apache access or error logs, those log lines are pretty straightforward and self-explanatory, with proper documentation.

Because of that, it becomes very easy to transform that data into any other data format and build visualisations out of it.

It is very important to define guidelines for logging. Each developer must follow those guidelines before adding a single log line to the application code.

Without that discipline, logging will be useless and you won’t be able to make any sense of the logging data.
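One way to enforce such guidelines is to put the schema into a shared formatter, so that every log line comes out with the same fields. The sketch below uses Python's standard `logging` module. The chosen schema (`timestamp`, `level`, `logger`, `message`, `transaction_id`, `component`) and the `JsonFormatter` class are assumptions for illustration, not a standard.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render every log record as one JSON object with a fixed set of keys."""

    # Fields each call site is expected to pass via `extra=`;
    # a missing field is written as null so the schema stays stable.
    REQUIRED = ("transaction_id", "component")

    def format(self, record):
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": created.isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in self.REQUIRED:
            payload[field] = getattr(record, field, None)
        return json.dumps(payload, sort_keys=True)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("resume_rating.web")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("resume uploaded",
            extra={"transaction_id": "tx-1", "component": "web"})
```

Writing one JSON object per line keeps the log readable for humans and trivial to parse for tools further down the pipeline.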

“If you’re good at the debugger it means you spent a lot of time debugging. I don’t want you to be good at the debugger.”
“Clean code is not written by following a set of rules. You don’t become a software craftsman by learning a list of heuristics. Professionalism and craftsmanship come from values that drive disciplines.” 
Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship

ELK Stack (Elasticsearch, Logstash and Kibana)

Logstash allows you to pull in server logs or any real-time logs (via a UDP/TCP connection) using the various available input plugins, and to send all the output logs to Elasticsearch (via output plugins).
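As a rough sketch of the TCP route, an application could send each event as a single line of JSON to a Logstash `tcp` input. This assumes the Logstash pipeline is configured with a `tcp` input using the `json_lines` codec; the host, the port 5000 and the helper names `encode_event` and `send_event` are hypothetical.

```python
import json
import socket


def encode_event(event):
    """Serialise an event as one newline-terminated JSON line (UTF-8)."""
    return (json.dumps(event, sort_keys=True) + "\n").encode("utf-8")


def send_event(event, host="localhost", port=5000):
    """Open a TCP connection, send one event, and close the connection.

    Host and port are placeholders; they must match the Logstash input.
    """
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(encode_event(event))
```

Opening a connection per event keeps the sketch short. A real shipper would reuse the connection and buffer events so that a Logstash outage does not block the application.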

Kibana is a powerful tool for transforming your log data into sensible business metrics or statistical visualisations by running queries against Elasticsearch.
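Behind a chart like "errors per component in the last hour" sits an Elasticsearch search request. The sketch below builds such a request body as a plain Python dictionary. The field names `level`, `component` and `@timestamp` are assumptions about how the logs were indexed (with `level` and `component` mapped as keyword fields), and `errors_per_component_query` is a hypothetical helper.

```python
import json


def errors_per_component_query(minutes=60):
    """Build a search body that counts ERROR log lines per component."""
    return {
        "size": 0,  # only the aggregation is needed, not the matching documents
        "query": {
            "bool": {
                "filter": [
                    {"term": {"level": "ERROR"}},
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                ]
            }
        },
        "aggs": {
            "by_component": {"terms": {"field": "component", "size": 10}}
        },
    }


body = errors_per_component_query(minutes=15)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent to the `_search` endpoint of the index that holds the logs, which is roughly what a Kibana visualisation does for you.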

Do you want to get hands-on with the ELK stack?

Yes. Go ahead and start experimenting with Earthquake data with the Elastic Stack.

Here are some great examples of ELK stack usage

  1. Operational Data Analytics with Elasticsearch, Elastic Stack (ELK Stack)
  2. Centralized Logging with Elasticsearch at Collector Bank
  3. And lot more coming… Elasticsearch SQL

Why am I a big fan of Elasticsearch?

Well, because of X-Pack. With the latest release (Elastic Stack 6.4.0), they are offering the following set of features under a single roof.

  1. Security — Protect your Elasticsearch data in a robust and granular way.
  2. Alerting — Get notifications about changes in your data.
  3. Monitoring — Maintain a pulse on your Elastic Stack to keep it firing on all cylinders.
  4. Reporting — Create and share reports of your Kibana charts and dashboards.
  5. Graph — Explore meaningful relationships in your data.
  6. Machine Learning — Automate anomaly detection on your Elasticsearch data.
  7. Elasticsearch SQL — Query Elasticsearch using the SQL syntax.

This set of features makes Elasticsearch my first choice and personal favourite for monitoring all types of transaction logs, anomalies in business metrics or trends, and different types of application performance under a single umbrella.

I don’t have to rely on several different services or tools to check my transaction logs and application performance metrics, then map log events to the incident time and guess at the root cause, and after that still work out which users were affected by the issue. Connecting all those dots by hand and documenting the whole progress report on the issue for stakeholders is quite difficult.

But as you have seen in the examples shared above, you can easily correlate visualisations and time series data with each other. It also allows the user to check the underlying logs for a given time frame to confirm the existence of the issue and to identify the affected end users.

Using the machine learning feature, you can train models on historical data in order to identify anomalies in the past month or week. Those trained models then allow you to identify anomalies in real-time data and predict future trends as well. Using the identified historical anomalies and trend patterns, you can also set up alerts to make sure that your application is healthy, and in case of failure you can start working directly on the problem area instead of debugging.

Apart from monitoring and alerting, you can also use machine learning to derive more meaningful data from the logs to support business decisions.
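To give a feel for what "learn from history, then flag what deviates" means, here is a deliberately simple sketch based on a z-score. This is not how the Elastic machine learning feature works internally; it is a stand-in technique chosen only because it fits in a few lines. The function name `find_anomalies`, the threshold of 3 and the sample numbers are all assumptions.

```python
from statistics import mean, stdev


def find_anomalies(history, recent, threshold=3.0):
    """Return the recent values that lie far outside the historical range.

    A value counts as anomalous when it is more than `threshold`
    sample standard deviations away from the historical mean.
    """
    baseline = mean(history)
    spread = stdev(history)  # needs at least two historical values
    if spread == 0:
        # Perfectly flat history: anything different is unusual
        return [value for value in recent if value != baseline]
    return [value for value in recent
            if abs(value - baseline) / spread > threshold]


# Hypothetical response times in milliseconds
history = [10, 12, 11, 9, 10, 11, 12, 10]
recent = [11, 40, 10]
anomalies = find_anomalies(history, recent)
```

With these numbers only the 40 ms reading is flagged, which is the kind of event you would then attach an alert to.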

Turning the database inside out

This is a little bit off topic, but I still want to show you how you can build a schema-less, object- and association-oriented database out of logs or events. For more details, you can click on the heading for the actual blog. Here is the most important line from that blog.

In a relational database, the table is the core abstraction and the log is an implementation detail. In an event-centric world with the database turned inside out, the core abstraction is not the table; it is the log.
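A minimal sketch of that idea, using the resume example: the "table" of current resume states is never written directly, it is derived by replaying an append-only log of events. The event types (`uploaded`, `scored`, `deleted`) and the helper names `apply_event` and `materialise` are assumptions made for illustration.

```python
def apply_event(state, event):
    """Apply one event to the current state and return the state."""
    kind = event["type"]
    resume_id = event["resume_id"]
    if kind == "uploaded":
        state[resume_id] = {"status": "uploaded", "score": None}
    elif kind == "scored":
        current = state.get(resume_id, {})
        state[resume_id] = {**current, "status": "scored",
                            "score": event["score"]}
    elif kind == "deleted":
        state.pop(resume_id, None)
    return state


def materialise(log):
    """Rebuild the current view by replaying the whole log in order."""
    state = {}
    for event in log:
        apply_event(state, event)
    return state


# The log is the source of truth; the view below is derived from it
log = [
    {"type": "uploaded", "resume_id": "r1"},
    {"type": "uploaded", "resume_id": "r2"},
    {"type": "scored", "resume_id": "r1", "score": 7},
    {"type": "deleted", "resume_id": "r2"},
]
view = materialise(log)
```

Replaying only a prefix of the log gives the state at an earlier point in time, which a plain table cannot offer without extra bookkeeping.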

Summary

Logging is nothing but the blood of your software application or system. It needs to keep flowing with the right amount of attributes and in a standard format, and in return it will give you the health report card of your software application or system.

From the stakeholders’ point of view,

Logging is another way of understanding your customer and your application.

From the developer’s point of view,

Logging is an art which allows developers to express their thinking about the part of the application [on which they are working].

P.S. The motivation for this blog is that I was looking for a single place to monitor all types of metrics and correlate them with each other.