Let’s admit it: we are all used to performing some kind of patterned, repetitive manual task in our day-to-day work. So why not take a step back and reflect on how to do it in a fun way?
Our journey started with our platform moving to the cloud. Transitioning from an on-premises setup to the cloud brings some challenges for the Operations Team, including 24x7 infrastructure uptime, proactive monitoring, and effective reporting.
As a small, growing team, we had the big responsibility of keeping up with and adapting to cutting-edge cloud technologies, and of becoming self-reliant in handling the cloud infrastructure.
Initially, we equipped ourselves with licensed software, New Relic, for monitoring the infrastructure. But let’s agree, this does not come cheap, and we had to replace it. We then introduced a popular open-source monitoring stack: Prometheus + Grafana. After getting IP&S approval and validating these tools against our QMS process, we stepped up our monitoring game. We took the available Prometheus exporters and altered them according to our needs, which gave the team confidence that we were ready to monitor. …
For any Operations Team, Key Performance Indicators (KPIs) play an essential role in keeping the applications running and stakeholders happy.
There were a lot of metrics and reports flowing in through periodic emails and in-house dashboards. But what was really lacking was ‘intelligence’. We needed to enable the users viewing these reports to predict, analyze trends and draw conclusions. We came across a rich open-source tool from Apache: Superset. Although it is still in its incubating phase, it offers a solution to what we were looking for. For folks looking for a dockerized version, it is available too.
We then started evaluating how well Superset could fit into our day-to-day operational activities. …