Elastic Search — One stop shop for Analytics

Manan Mehta
3 min read · Feb 10, 2020


If you can’t measure it, you can’t improve it. — Peter Drucker

Sooner or later we realize that we need to measure our product: to improve it, to understand the usage, to find anomalies, or just to brag about our work to the boss.

Our team was in its infancy, and we soon realized that we needed a common analytics platform to make business decisions.

  • We wanted to add new servers at the sites where usage was highest.
  • Identify top users and equip them with the best hardware.
  • Track execution times and failure rates of various components.
  • Move the most active content to faster hardware.

All of the above scenarios needed data to drive the decisions.

You must have heard about tools such as Elastic Search and Splunk. This post gives you an overview of how to tie together data from various kinds of inputs and form a data lake for multi-purpose tracking across your organization. We used Elastic Search for our use case; I believe similar functionality exists in Splunk as well.

We were trying to track information from database tables, text log files of application servers, data logged from our ASP.NET C# components, and execution results from various cmd, PowerShell, and Python scripts. Below is an overview of each kind of input.

Database Tables: Syncing database content to Elastic Search in near real time. This lets us build dashboards on database contents with minimal effort. We used the blog as a base, with a few modifications, to make this work against our MS SQL database.
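One common way to wire this up is a Logstash pipeline with the JDBC input plugin polling the table on a schedule and pushing changed rows into an index. The sketch below uses placeholder connection details, table, and index names; treat it as a starting point, not our exact pipeline:

```
input {
  jdbc {
    # Placeholder MS SQL connection details
    jdbc_driver_library => "/opt/drivers/mssql-jdbc.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://db-host:1433;databaseName=AppDb"
    jdbc_user => "logstash_reader"
    jdbc_password => "********"
    schedule => "*/5 * * * *"    # poll every 5 minutes
    statement => "SELECT id, site, status, modified_at FROM Jobs WHERE modified_at > :sql_last_value"
    use_column_value => true
    tracking_column => "modified_at"
    tracking_column_type => "timestamp"
  }
}
output {
  elasticsearch {
    hosts => ["http://elastic-host:9200"]
    index => "jobs"
    document_id => "%{id}"   # re-index the same row on update instead of duplicating it
  }
}
```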

Content from Server Log Files: We needed to extract information about who was using our application servers and for what. I have already written a detailed blog about this. It explains how to extract specific information out of log files using Grok filters.
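To give a flavour of what a Grok filter does, the hypothetical pattern below pulls a timestamp, log level, user, and action out of a line such as 2020-02-10 09:30:00 INFO user=jdoe action=checkout. The log format is illustrative, not our actual one:

```
filter {
  grok {
    # Each named capture becomes a searchable field on the event
    match => {
      "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} user=%{USERNAME:user} action=%{WORD:action}"
    }
  }
}
```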

Application Logs: We needed to trace information from logs generated by our own software components written in ASP.NET C#. We could have used the previous approach (Grok filters over the log files) for this, as these logs are written to files as well. However, the Serilog Elastic Search sink provides great flexibility to insert logs with custom fields into Elastic Search with minimal effort.
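A minimal sketch of the sink, assuming the Serilog.Sinks.Elasticsearch package and placeholder endpoint and index names:

```csharp
using System;
using Serilog;
using Serilog.Sinks.Elasticsearch;

class Program
{
    static void Main()
    {
        // Placeholder endpoint and index format; every event goes straight
        // into Elastic Search as a structured document.
        Log.Logger = new LoggerConfiguration()
            .Enrich.WithProperty("Component", "JobScheduler")   // custom field on every event
            .WriteTo.Elasticsearch(new ElasticsearchSinkOptions(new Uri("http://elastic-host:9200"))
            {
                IndexFormat = "app-logs-{0:yyyy.MM}",
                AutoRegisterTemplate = true
            })
            .CreateLogger();

        // Structured properties (JobName, DurationMs) are indexed as their own fields,
        // so they can be filtered and aggregated in Kibana without any Grok parsing.
        Log.Information("Job {JobName} finished in {DurationMs} ms", "NightlySync", 4200);

        Log.CloseAndFlush();
    }
}
```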

Ad Hoc: As we started seeing the benefits of analytics, it opened the floodgates of use cases. Developers started coming in with new use cases every day. Many of these lived in scattered scripts that we deploy to our own as well as end users' environments. We did not want to burden them with the Elastic Search web API just to insert content.

To achieve this, we developed a Web API based application (Analytics Service) that abstracts the Elastic Search interaction and inserts JSON content into Elastic Search in the form we want. We created a metadata section for the mandatory fields. The value of the category field acts as a namespace: optional fields are created in Elastic Search with the category value as a prefix.
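The reshaping step itself is simple. The helper and field names below are hypothetical, but the behaviour matches the prefixing just described:

```csharp
using System.Collections.Generic;

// Hypothetical reshaping step inside the Analytics Service:
// mandatory metadata fields are kept as-is, and every optional field is
// renamed with the category value as a prefix before indexing.
public static class AnalyticsDocumentBuilder
{
    public static Dictionary<string, object> Build(
        Dictionary<string, object> metadata,        // e.g. category, user, timestamp
        Dictionary<string, object> optionalFields)  // free-form payload from the caller
    {
        var category = (string)metadata["category"];

        var doc = new Dictionary<string, object>(metadata);
        foreach (var kv in optionalFields)
        {
            // e.g. "site" -> "vehicle_registration_site"
            doc[$"{category}_{kv.Key}"] = kv.Value;
        }
        return doc;   // handed to the Elastic Search client for indexing
    }
}
```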

Below is a sample input
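(The metadata keys and optional fields shown here are made up for the example; only the category behaviour matches the service.)

```json
{
  "metadata": {
    "category": "vehicle_registration",
    "user": "jdoe",
    "timestamp": "2020-02-10T09:30:00Z"
  },
  "fields": {
    "site": "site-01",
    "status": "success",
    "duration_ms": 420
  }
}
```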

Below is the content as it appears in Kibana (the visualization tool for Elastic Search). Notice the field names prefixed with “vehicle_registration”, which was supplied as the category value. This allows us to log data from various categories in a single Elastic Search index.

We are sure that, as we move forward, the input types feeding Elastic Search will keep growing. Logging from a message queue is already on the horizon.

Once you start tracking the data, you ask yourself, “Why did I not start tracking earlier?”
