[Image: Chronograf’s UI]

Monitoring with InfluxData’s TICK stack

Travis Jeffery
The Hoard

--

TL;DR: Look into TICK if you feel the alternatives are too expensive, lack features, or have bad clients; if you’d like control over managing your data; or if you’re simply interested in a promising tool for time-series data.

If you want a complete setup of TICK, follow this guide I’ve written.

Why choose TICK for your monitoring:

  • Cheap and open-source — Your only cost is the hardware you run it on (see InfluxDB’s hardware sizing guidelines) and the time you put into setting it up.
  • Works with StatsD clients — If you’re using StatsD, you just need to point your StatsD client at Telegraf instead of your current host.
  • Soup-to-nuts monitoring — You can collect, store, visualize, and alert on your data with a stack designed to work together. You don’t have to struggle with integrating services built by separate teams.
  • Time-series data, not just metrics — TICK is made for time-series data, which includes but isn’t limited to metrics. I’ve worked with many types of time-series data over the past few years in the Big Data space (monitoring, analytics, financial data, sensors, anomaly detection, and other use cases) and felt the tools were lacking, so hopefully TICK proves to be a useful tool in the toolbox.
[Diagram: TICK stack and service connections]
  • Telegraf integrates with and pulls stats from all kinds of services: StatsD, Redis, Elasticsearch, PostgreSQL, and more. Telegraf makes light work of pulling metrics from all your infrastructure into one place (see the telegraf.conf sketch after this list).
  • Chronograf’s design — It’s clean, looks nice, and is easy to use.
  • Chronograf’s SQL-like query language — No clicking through a bunch of long menus to set up your queries and visualizations (see the sample query after this list).
  • Kapacitor’s TICKscript — You can write your alerts (or any type of event) in declarative code. Pretty powerful stuff. Kapacitor supports sending events to services like Slack, PagerDuty, webhooks, and more.
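
To give a sense of how little configuration Telegraf needs, here’s a minimal telegraf.conf sketch, assuming Redis and InfluxDB run locally on their default ports and metrics land in a database named telegraf:

[[inputs.statsd]]
  # Listen for StatsD metrics on the default StatsD port.
  service_address = ":8125"

[[inputs.redis]]
  # Pull stats from a local Redis instance.
  servers = ["tcp://localhost:6379"]

[[outputs.influxdb]]
  # Write everything to the local InfluxDB.
  urls = ["http://localhost:8086"]
  database = "telegraf"

With that in place, anything your StatsD clients send to the Telegraf host on port 8125 ends up in InfluxDB alongside the Redis stats.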

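For a taste of that query language, here’s a sketch of the kind of InfluxQL you’d write in Chronograf, assuming a measurement named events_worker_processed with a value field and a host tag (the same names the Kapacitor example below uses):

SELECT mean("value") FROM "events_worker_processed"
WHERE time > now() - 1h
GROUP BY time(5m), "host"

That gives you the mean value per host in five-minute windows over the last hour, ready to drop into a graph.
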
Here’s an example: deadman(100.0, 10s) means that whatever is posting this metric is assumed dead when Kapacitor sees fewer than 100 data points in the past 10s, and if it’s dead, we trigger an incident in PagerDuty and message Slack.

// Alert when fewer than 100 points arrive in any 10s window,
// and send the alert to PagerDuty and Slack.
stream
    |from()
        .measurement('events_worker_processed')
    |deadman(100.0, 10s)
        .id('Events worker/{{ index .Tags "host" }}')
        .message('{{ .ID }} is {{ .Level }} value: {{ index .Fields "value" }}')
        .pagerduty()
        .slack()
[Image: Alert in Slack from Kapacitor]
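
To run a TICKscript like the one above, save it to a file and register it as a task with the kapacitor CLI. A sketch, assuming the script is saved as events_worker_deadman.tick and your metrics land in Telegraf’s default telegraf database and autogen retention policy:

kapacitor define events_worker_deadman \
    -type stream \
    -tick events_worker_deadman.tick \
    -dbrp telegraf.autogen
kapacitor enable events_worker_deadman

The PagerDuty and Slack handlers also need their keys and webhook URLs set in kapacitor.conf before the alerts actually go anywhere.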

Thanks for reading; I hope this was helpful. If you have any questions, feel free to hit me up on Twitter, and keep up with what I’ve learned building my personal finance tool, Stash.

--

Travis Jeffery

Working on Kafka/Confluent. Made software at Basecamp and Segment. Writing open-source software at https://github.com/travisjeffery.