Artjom Nemiro
3 min read · Aug 15, 2018
Visual monitoring on a TV screen

I decided to share my experience of how valuable visual monitoring can be, and I hope you will be able to grab some ideas from it.

The source
Not long ago, I started working as a back-end developer for a fashion company based in Sweden. They have complex infrastructure and systems, so you can spend days and nights trying to understand how it all works, even for a small part of the whole picture. I didn’t have that time, so I decided to research the API methods, logs, and various monitoring data. In between, I still had to do my regular duties. It was crazy: I was not sure I understood the system I was now responsible for. So I decided I needed a map, a tool where the whole infrastructure flow would be visible and where I could see if something went wrong. I remembered a conference talk where Netflix showed how they do microservices and how they monitor that universe. I thought maybe I could apply similar ideas to monolithic systems, so I started prototyping, testing, and adapting them to the company’s infrastructure.

Proof of concept
Humans are lazy, and this is where visualization wins: it’s always easier to see everything at once than to look for information in different places. Sometimes it even gives you an instant direction for escalating an issue.

Here are some examples where that can help:

* Data updates. We have an e-commerce shop, and all content is managed in a CMS. Suppose someone makes a mistake in a product update. Eventually that data is pushed to the services and the shop. As customers start working with the updated data, we can observe the chain reaction and trace the issue to its source: we see which node the errors come from and the notices placed on it, confirm that in the trace logs, and deliver a fix.

* Code updates. If an update ships a bug, or affects and breaks other systems, we can sometimes see that faster than the alert triggers can send us a message. Visual nodes turn red, warnings flash, an alarm sounds, traffic drops or even stops, and we rush to resolve it.

* Infrastructure maintenance. With the visualization, we can easily observe how traffic is balanced between instances and see when it is balanced poorly. Slow database connections and cache problems can also be covered by the visualization.
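
The "nodes turn red" behaviour in the examples above can be sketched as a small function that maps a node's error ratio to a display class. This is only an illustration: the function name `healthClass` and the 1% and 5% thresholds are assumptions made for this sketch, not values taken from the actual setup.

```go
package main

import "fmt"

// healthClass maps a node's error ratio to a display class.
// The name and both thresholds are hypothetical, chosen for illustration.
func healthClass(requests, errors float64) string {
	if requests <= 0 {
		// No traffic at all: there is no ratio to judge, so stay neutral.
		return "normal"
	}
	ratio := errors / requests
	switch {
	case ratio >= 0.05: // assumed: 5% errors or more is an incident
		return "danger"
	case ratio >= 0.01: // assumed: 1% errors or more deserves attention
		return "warning"
	default:
		return "normal"
	}
}

func main() {
	fmt.Println(healthClass(1000, 2))  // ratio 0.002
	fmt.Println(healthClass(1000, 20)) // ratio 0.02
	fmt.Println(healthClass(1000, 80)) // ratio 0.08
}
```

In practice the thresholds would likely differ per system, since a noisy legacy service and a checkout service do not tolerate the same error rate.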

The monitoring tool
I’ll explain my pet project (https://github.com/LinMAD/BitAccretion), which runs at my current company and serves us well.

Front-end
Thanks to Netflix for sharing the source code of Vizceral (https://github.com/Netflix/vizceral): we can now draw this kind of graph easily without reinventing the wheel. It comes with documentation, and you can write your own aggregator to feed it with your data.
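
To give an idea of what such an aggregator has to produce, here is a minimal sketch that builds a small traffic graph and encodes it as JSON. The field names (`renderer`, `nodes`, `connections`, `metrics`, `notices`, `class`) follow the Vizceral data format as described in its documentation, so check them against the current docs before relying on them. The node names `shop` and `cms`, the graph name, and all numbers are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metrics holds request rates per health state for one connection.
type Metrics struct {
	Normal  float64 `json:"normal"`
	Warning float64 `json:"warning,omitempty"`
	Danger  float64 `json:"danger,omitempty"`
}

// Notice is a message attached to a connection.
type Notice struct {
	Title    string `json:"title"`
	Severity int    `json:"severity"` // assumed: 0 info, 1 warning, 2 error
}

// Connection describes traffic flowing from one node to another.
type Connection struct {
	Source  string   `json:"source"`
	Target  string   `json:"target"`
	Metrics Metrics  `json:"metrics"`
	Notices []Notice `json:"notices,omitempty"`
}

// Node is either the whole graph or one box inside it.
type Node struct {
	Renderer    string       `json:"renderer,omitempty"`
	Name        string       `json:"name"`
	Class       string       `json:"class,omitempty"`
	Nodes       []Node       `json:"nodes,omitempty"`
	Connections []Connection `json:"connections,omitempty"`
}

// buildGraph returns a hypothetical three-node graph with made-up numbers.
func buildGraph() Node {
	return Node{
		Renderer: "region",
		Name:     "shop-infrastructure",
		Nodes: []Node{
			{Renderer: "focusedChild", Name: "INTERNET", Class: "normal"},
			{Renderer: "focusedChild", Name: "shop", Class: "warning"},
			{Renderer: "focusedChild", Name: "cms", Class: "normal"},
		},
		Connections: []Connection{
			{
				Source:  "INTERNET",
				Target:  "shop",
				Metrics: Metrics{Normal: 950, Warning: 30, Danger: 20},
				Notices: []Notice{{Title: "Error rate above usual", Severity: 1}},
			},
			{Source: "shop", Target: "cms", Metrics: Metrics{Normal: 120}},
		},
	}
}

// encode renders the graph as indented JSON for the front-end to load.
func encode(g Node) (string, error) {
	b, err := json.MarshalIndent(g, "", "  ")
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encode(buildGraph())
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```

A real aggregator would rebuild this structure on a timer from live metrics and serve it to the page that hosts the Vizceral component.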

Back-end
The prototype is written in Go and uses plugins, so I can implement additional functionality or change the source of monitoring data.
Currently, we are using New Relic to collect metrics and notices, so that is the source I retrieve the needed data from. Previously I used Nginx logs from different servers, collected in Logentries, to aggregate requests, error rates, and error types, but not all of the systems were there, so I switched to New Relic.
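
The earlier log-based approach can be illustrated with a short sketch that counts requests and errors from access log lines. It assumes Nginx's default "combined" log format and reads from an in-memory sample instead of Logentries; the names `aggregate`, `Stats`, and `sampleLog` are hypothetical and are not taken from BitAccretion.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// statusRe captures the HTTP status code from a "combined" format line:
// addr - user [time] "request" status bytes "referer" "agent"
var statusRe = regexp.MustCompile(`^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) `)

// Stats is the per-node summary a visualization could consume.
type Stats struct {
	Requests     int
	ClientErrors int // 4xx responses
	ServerErrors int // 5xx responses
}

// sampleLog is made-up data: three valid lines and one unparsable line.
const sampleLog = `
10.0.0.1 - - [15/Aug/2018:10:00:00 +0000] "GET /products HTTP/1.1" 200 612 "-" "curl/7.58"
10.0.0.2 - - [15/Aug/2018:10:00:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/7.58"
10.0.0.3 - - [15/Aug/2018:10:00:02 +0000] "POST /cart HTTP/1.1" 500 0 "-" "curl/7.58"
this line is not an access log entry
`

// aggregate counts requests and error responses, skipping unparsable lines.
func aggregate(log string) Stats {
	var s Stats
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		m := statusRe.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // empty or malformed line
		}
		// The pattern guarantees three digits, so Atoi cannot fail here.
		code, _ := strconv.Atoi(m[1])
		s.Requests++
		switch {
		case code >= 500:
			s.ServerErrors++
		case code >= 400:
			s.ClientErrors++
		}
	}
	return s
}

func main() {
	fmt.Printf("%+v\n", aggregate(sampleLog))
}
```

A sketch like this only sees systems whose logs are actually shipped to the log store, which is the same gap that motivated the move to New Relic.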

Conclusion
Visual monitoring like the one Netflix showed can be cool, but only as an additional tool. It’s nice to have once your systems are covered with metrics collection and it’s time to do something more. It can also help new people on your team quickly understand how the parts of the infrastructure interact with each other.