How To — Healthchecks.io & PagerDuty (and more…) Alerts for Shipchain Validators

Shipmate.FR.nl
Shipchain (un) Official Community
5 min read · Oct 11, 2020

In our previous article, we explained how a Dead Man’s Snitch is the best way to be alerted when things are not going as planned. However, some of the features used in that tutorial were unfortunately not free.

Well, good news! With healthchecks.io, you can have the same functionalities (and more!)… for free!

Note: I will not repeat here any of the conceptual explanations shared earlier. Please read the article above for more details.

Let’s get started…

STEP 1: Sign up to Healthchecks.io

  • No rocket science here: follow the link above and sign up to healthchecks.io

STEP 2: Create a New Snitch and pick up its URL

  • When you first log in, you will notice that your first Snitch (here called a “Check”) is already created, ready to be customized to your exact needs.
  • Name: especially useful if you have more than one Check.
  • Ping URL: you will need to copy this URL into your cronjob (see step 3).
  • Integrations: please refer to step 4 for more details.
  • Last Ping: will display the last time your Snitch checked in (confirming everything was ok… so far!).
  • Period: on top of being free, Healthchecks.io is also more flexible than DMS, since you can set both Period and Grace Time with a 1-minute granularity (in other words, you can set up a Check that checks in every minute and raises an alert as soon as one check-in is missed). My own settings: pinging every minute but raising an alert only after 10 minutes, which is enough to avoid being woken up in the middle of the night by a simple software reboot.
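
To make the Period and Grace Time arithmetic concrete, here is a rough sketch. It assumes that Healthchecks.io marks a Check as down once Period plus Grace Time have elapsed since the last ping, which is how I read its documentation; alert_delay is a hypothetical helper name, not part of any tool.

```shell
# Seconds between the last successful ping and the moment the Check goes down.
# $1 = Period in seconds, $2 = Grace Time in seconds.
# Assumption: down time = last ping + Period + Grace Time.
alert_delay() {
    echo $(( $1 + $2 ))
}

# Period of 1 minute, Grace Time of 10 minutes
alert_delay 60 600   # prints 660
```

With the settings above, the alert would therefore come roughly 10 minutes after the first missed check-in.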

STEP 3: Set up the Cronjob

  • The point of this step is to set up an automatic task that runs every minute on your server and sends a sign of life to your Healthchecks.io Check. As long as it does, everything is fine. When it stops, the alert mechanism described above is activated.
  • You are free to decide which condition the cronjob must meet to send this sign of life. In this particular example, we check whether the Shipchain node is currently caught up. If it isn’t, then it is not validating blocks either, so you are at risk of being slashed and you would like to be warned. Of course, if the command returns nothing (e.g. the node service is down) or if the cronjob doesn’t even run (e.g. the server itself is down), then you would also like to be alerted.
sudo apt-get install jq
crontab -e
  • Do note that the first row of the crontab needs to look like the one below (you can add as many paths as you like, but make sure they are separated by ‘:’)
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/snap/bin
  • Please note that the content of this row will differ for every user. Fortunately, you can check which path is needed for your own installation:
type hydra
  • This command should return something like “hydra is hashed (/usr/local/bin/hydra)”. In that case, you need to have at least PATH=/usr/local/bin in the first row of the crontab.
  • Then the next row of the crontab should look like this (replace the URL with the Ping URL you copied in step 2):
*/1 * * * * hydra -o json client status | jq .is_caught_up | grep true && curl https://hc-ping.com/81dd............................
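
If you prefer to keep the crontab row short, the same logic can be moved into a small script. The sketch below is only one possible way to do it: the function names and the placeholder PING_URL are assumptions made for illustration, and the curl options (quiet mode, a 10-second timeout, a few retries) are optional hardening rather than anything the one-liner above requires.

```shell
#!/bin/sh
# Sketch only: PING_URL is a placeholder, replace it with your own Ping URL.
PING_URL="${PING_URL:-https://hc-ping.com/your-uuid-here}"

# Succeeds only if hydra is found on PATH and reports the node as caught up.
node_is_caught_up() {
    command -v hydra >/dev/null 2>&1 || return 1
    hydra -o json client status | jq -e '.is_caught_up == true' >/dev/null 2>&1
}

# Sends the sign of life quietly, with a timeout and a few retries.
send_ping() {
    curl -fsS -m 10 --retry 3 -o /dev/null "$PING_URL"
}

# Runs the check ($1) and calls the ping ($2) only when the check succeeds.
ping_if_healthy() {
    "$1" && "$2"
}
```

To use it, add a final line reading ping_if_healthy node_is_caught_up send_ping, save the file under a path of your choice, make it executable, and point the crontab row at that file instead of the full pipeline. Because the check and the ping are separate functions, you can swap in another health condition without touching the rest.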

STEP 4: Relay Healthchecks.io alerts into PagerDuty (and more)

  • You should fall in love with healthchecks.io right after you click on the “Integrations” banner and scan through the list of available channels.
  • Yes! You can set up alerts to all these channels (still for free, although some alerts have a maximum number of notifications per month).
  • Personally, I receive alerts by SMS (useful if you are in an area without Internet), Discord (to share alerts with other people as well), Email, PagerDuty, WhatsApp and Telegram… so the only times I could get more simultaneous notifications are on my birthday or if I went on TV!
  • Last but not least, healthchecks.io provides a small tutorial with pictures for each of these channels (including one for PagerDuty). So there is nothing else to cover here that wasn’t either covered in our previous articles or included in their mini tutorials. A great product, really!

STEP 5: Wake up in the middle of the night! (again)

  • Thanks to these alerts, you will know your server is not behaving as planned, giving you the opportunity to log in and check.

This article marks the end of a first series meant to ensure that any (every?) Shipchain validator can be alerted in time to take corrective action before being slashed. The network’s security is at stake, so this is a big deal.

Should you have any questions, observations or suggestions for other security measures, feel free to reach out to us on one of the Telegram channels below.

Thanks for reading!

shipmateFRnl
