Zen in the art of IoT
Or how to stay calm through the storm that turns out to be just a drizzle.
I’m currently expanding our monitoring and notification systems. For a long time now you’ve been able to set DevicePilot up to automatically monitor your devices’ data streams and trigger a notification (an email, a Slack message, a Zendesk ticket and others) for abnormal situations. Say your device has stopped sending data back to the cloud, or its internal temperature has gone above a certain threshold. Those are things you might want to be made aware of.
Or maybe not. This simple approach works when you have hundreds of devices. Say you now have tens of thousands of devices deployed all over the world for each member of your operations team. How do you make sure you don’t lose your Zen while keeping up your service delivery standards? Let’s enumerate a few down-to-earth guiding principles.
1) First-line support is to be done by machines. If a problem can be solved automatically by simply notifying another computer and we don’t support that integration yet, we will build it for you.
2) Never notify a human if nothing can be done to remediate the issue, because alarm fatigue will slow your reaction to actual problems. DevicePilot silences those 2am notifications for sites that are not open at night. Did you know you can now define business hours?
3) Only notify people who can actually do something to solve the problem. Filters have always been a strong concept in our application. You can use them to narrow down the scope of devices and alert only the person in charge of that smaller set.
4) Don’t notify until your service level has fallen below the agreed acceptable level. If a site has 4 devices and one is broken, does that really warrant a first-class call-out? Well, it depends: if the other 3 devices are currently in use, maybe yes.
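To make principles 2 to 4 a little more concrete, here is a minimal Python sketch of the decision they describe. It is not DevicePilot's API: the `Site` class, its fields and the function names are all hypothetical, and the "agreed service level" is simplified to a minimum number of working devices per site.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional, Sequence


@dataclass
class Site:
    # Every field here is a hypothetical name chosen for illustration.
    name: str
    opens: time        # start of business hours
    closes: time       # end of business hours
    owner: str         # the person in charge of this set of devices
    min_working: int   # assumed service level: devices that must be working


def is_open(site: Site, now: datetime) -> bool:
    """Principle 2: outside business hours nobody can act, so stay silent."""
    return site.opens <= now.time() < site.closes


def service_breached(site: Site, device_ok: Sequence[bool]) -> bool:
    """Principle 4: judge the site as a whole, not each device on its own."""
    return sum(device_ok) < site.min_working


def who_to_notify(site: Site, device_ok: Sequence[bool],
                  now: datetime) -> Optional[str]:
    """Principle 3: return the site's owner, or None to stay quiet."""
    if is_open(site, now) and service_breached(site, device_ok):
        return site.owner
    return None


site = Site("Depot A", time(8), time(18), "alice@example.com", min_working=3)
noon = datetime(2024, 1, 15, 12, 0)
two_am = datetime(2024, 1, 15, 2, 0)

# One device of four is broken, but the assumed level is still met.
print(who_to_notify(site, [True, True, True, False], noon))     # None
# Two devices are broken during opening hours: alert the owner.
print(who_to_notify(site, [True, True, False, False], noon))    # alice@example.com
# Same fault at 2am, when the site is closed: nobody is woken up.
print(who_to_notify(site, [True, True, False, False], two_am))  # None
```

A real rule would probably also look at whether the remaining devices are in use, as in the example above, rather than at a fixed count.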
Riding on that last point, the emerging trend is that no connected device is an island. It is often part of a bigger deployment of similar, colocated devices. As such, monitoring the data stream of individual devices in isolation breaks down at scale. What the world needs is for DevicePilot to extend its advanced cohort analysis tool with an automated wing. We will run cohort analyses for you at an appropriate cadence and send out a notification when your agreed service level is in danger of being breached. Not when a single petty device has decided to stop working in the middle of the night in the desolate countryside. Stay tuned.