Live anomaly detection with 1-click machine learning inside the TICK stack
Loud ML is the first machine learning API natively integrated in the TICK stack. Other database integrations are becoming available every month, but in this tutorial, we’ll show how to start using 1-click machine learning right inside Chronograf, the time series monitoring platform from InfluxData. The video accompanying this tutorial uses live data.
In this tutorial, we’ll use data center monitoring for our sample data. Any other live data that follows a regular pattern will work, for use cases such as telephony fraud detection or spotting spikes in energy usage. The video includes the basics, while the text below contains more detail for reference.
- This tutorial assumes that you already have your own live data and that you’re familiar with the basics of Chronograf.
- To run your own 1-click ML tasks, you will need the TICK stack add-on, which is open source (link: https://hub.docker.com/r/loudml/chronograf/). The add-on supports all standard InfluxDB features, so there’s no learning curve.
1-click machine learning
The video below provides a quick overview of 1-click ML in the TICK stack:
1-click machine learning — detailed steps
To get started with 1-click ML, use the Data Explorer to select your metrics data. You will see the same top-down data selection here as in the original TICK stack, with a few extras.
First, select a database; we’ll choose metrics. Then, select a measurement. We’ll use the load average measurement for the CPU load. If you want to, you can also filter by the individual data centers and hosts listed here, but this isn’t mandatory. We’ll filter by data center ‘ams-1’ and by host ‘h1’ by simply clicking on the tags available:
With our data selected, we’re ready to choose a field and an aggregation function from the choice displayed on the drop-down list of options.
Choose the time interval that best suits your data, and the graph below will update automatically. Try different intervals to discover which data visualization best suits your needs.
Choose how to fill any gaps, that is, time segments that contain no data. The choices include:
- Previous: Use the value from the previous time segment
- Number: Use your own specified number (e.g., 0)
- Null: No filling
- Linear: Linear interpolation of neighboring, non-missing values
We’ll choose ‘Previous’ to fill gaps in the data with the values from the previous time segment.
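To make the fill options concrete, here is a minimal Python sketch of what each choice does to a series with gaps. The function name `fill_gaps` and the use of `None` for a missing segment are assumptions made for illustration; this is not the code that Chronograf or InfluxDB runs.

```python
def fill_gaps(values, method="previous", number=0):
    """Return a copy of `values` with None entries filled according to `method`.

    Illustrative only: `None` stands for a time segment with no data.
    """
    filled = list(values)
    if method == "null":
        # No filling: gaps are left as they are.
        return filled
    if method == "number":
        # Replace every gap with the chosen constant.
        return [number if v is None else v for v in filled]
    if method == "previous":
        # Carry the last seen value forward into each gap.
        last = None
        for i, v in enumerate(filled):
            if v is None:
                filled[i] = last
            else:
                last = v
        return filled
    if method == "linear":
        # Interpolate between neighboring non-missing values;
        # leading and trailing gaps have only one neighbor and stay empty.
        known = [i for i, v in enumerate(filled) if v is not None]
        for left, right in zip(known, known[1:]):
            step = (filled[right] - filled[left]) / (right - left)
            for i in range(left + 1, right):
                filled[i] = filled[left] + step * (i - left)
        return filled
    raise ValueError("unknown fill method: " + method)


series = [1.0, None, None, 4.0, None]
print(fill_gaps(series, "previous"))
print(fill_gaps(series, "linear"))
```

With ‘Previous’, the two gaps after 1.0 are filled with 1.0 and the final gap with 4.0, whereas ‘Linear’ fills the first two gaps with 2.0 and 3.0 and leaves the final gap empty because it has no right-hand neighbor.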
Set the time range of data you wish to train on. We’ll select six hours. In general, the more data you train on, the more accurate your deep learning model will become, at the cost of a longer training time.
Hovering on the 1-Click ML button will display a summary of the selections, including the data source, the aggregation function, the time interval, any filters on selected tags, and the fill value. It’s important to note that clicking on the button will start a new machine learning task.
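The selections in that summary correspond closely to the clauses of an InfluxQL query. As a rough sketch, the Python helper below assembles such a query string from the same pieces. The helper name `build_query`, the measurement name `loadavg`, the field name `load1`, the tag keys `dc` and `host`, the `autogen` retention policy, and the one-minute interval are all assumptions for illustration; substitute the names from your own database.

```python
def build_query(database, measurement, field, aggregation, interval,
                tags, fill, lookback, retention_policy="autogen"):
    """Assemble an InfluxQL query string from Data Explorer-style selections."""
    # Time range to query, then one equality filter per selected tag.
    conditions = ["time > now() - {}".format(lookback)]
    for key, value in sorted(tags.items()):
        conditions.append("\"{}\" = '{}'".format(key, value))
    where_clause = " AND ".join(conditions)
    return (
        'SELECT {agg}("{field}") AS "{agg}_{field}" '
        'FROM "{db}"."{rp}"."{measurement}" '
        "WHERE {where} "
        "GROUP BY time({interval}) fill({fill})"
    ).format(agg=aggregation, field=field, db=database, rp=retention_policy,
             measurement=measurement, where=where_clause,
             interval=interval, fill=fill)


query = build_query(
    database="metrics",
    measurement="loadavg",                  # hypothetical measurement name
    field="load1",                          # hypothetical field name
    aggregation="mean",
    interval="1m",                          # hypothetical time interval
    tags={"dc": "ams-1", "host": "h1"},     # hypothetical tag keys
    fill="previous",
    lookback="6h",
)
print(query)
```

Each argument lines up with one item in the hover summary: the data source, the aggregation function, the time interval, the tag filters, and the fill value.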
From the Loud ML logo on the toolbar, view the list of machine learning tasks. You will see that a task is running. This works like autopilot AI for cars, where the AI learns the shape of other cars, pedestrians and other important objects. The Loud ML machine learning job is learning the shape of the data set we just selected. This learned shape will become the expected normal shape of any further data in the same time series.
Click on the job to see the full list of available options.
- The General tab contains information and overall options for your data source.
- The Parameters tab includes the seasonality tool, which allows you to fit time series data with weekday and/or time-of-day repeating patterns, plus the bucket interval and more.
- The Features tab shows some of the settings chosen earlier, which can be adjusted here, as well as some additional features. One of these is the Anomaly drop-down, where you can set the application to raise anomalies only if the selected metric becomes too low, only if it becomes too high, or in either case. But what is too high, and what is too low? Machine learning learns these values so you don’t need to set them.
- The Prediction tab provides controls for the prediction interval and offset times.
- The Anomaly tab contains the all-important annotation option which, when activated, will highlight anomalies in your live data as it arrives. Also in this tab, the threshold values can be overridden with values from 0 to 100.
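The effect of the Anomaly drop-down in the Features tab can be pictured with a small Python sketch. The function name `is_anomalous`, the option names `low`, `high` and `low_high`, and the floor and ceiling numbers are assumptions for illustration only; in the application, the normal range is learned by the model rather than typed in.

```python
def is_anomalous(value, floor, ceiling, direction="low_high"):
    """Decide whether `value` counts as an anomaly for the chosen direction.

    `floor` and `ceiling` stand in for the learned normal range.
    """
    if direction not in ("low", "high", "low_high"):
        raise ValueError("unknown direction: " + direction)
    too_low = value < floor and direction in ("low", "low_high")
    too_high = value > ceiling and direction in ("high", "low_high")
    return too_low or too_high


# Made-up normal range of 0.5 to 2.0 for a load average.
print(is_anomalous(0.1, 0.5, 2.0, direction="low"))       # below the floor
print(is_anomalous(0.1, 0.5, 2.0, direction="high"))      # low values ignored
print(is_anomalous(3.0, 0.5, 2.0, direction="low_high"))  # above the ceiling
```

Restricting the direction is useful when only one side matters, for example when a low CPU load is harmless but a high one is not.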
Click on the Loud ML logo on the toolbar at any time to see the progress of training. We’ll fast forward a few minutes to see the results. Now is a good time to grab a drink if your training data is extensive.
The screen will update to confirm when training is completed. It will also show a loss value. Aim for a small loss value: a smaller value indicates more accurate training. A higher number might require additional training.
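For intuition about why a smaller loss is better, the sketch below computes mean squared error, one common loss for models that predict numeric values. That choice is an assumption for illustration; the source does not say which loss function Loud ML reports.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    if not actual:
        raise ValueError("series must not be empty")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


actual = [1.0, 2.0, 3.0]
close_fit = [1.5, 2.0, 2.5]   # predictions near the actual values
poor_fit = [3.0, 0.0, 5.0]    # predictions far from the actual values

print(mean_squared_error(actual, close_fit))
print(mean_squared_error(actual, poor_fit))
```

The closer the predictions track the actual values, the smaller the result, which is why a small loss suggests the model has learned the shape of the data well.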
Like a fresh university graduate, our 1-click machine learning is ready to get to work. Let’s see the power of machine learning automation in action.
Viewing the new data visualization
When you hover over the job, a shortcut appears to the right of the status. This shortcut adds a data visualization to a new dashboard. Click on the shortcut, then from the dashboard, click on the visualization link to display the graph.
The graph shows actual data in light green, plus a normal range indicator in dark green. The graph provides a range of tools for gaining better insight into your data. For example, you can:
- Select any part of the graph to zoom in for a closer look. Your existing data is augmented with a floor value and a ceiling value, also visible in the legend which appears when you hover your mouse over your data.
- Change your time selection to see live auto-refreshes.
- Change the auto-refresh interval.
Remember, this is live data, so the graph will show whether the data is producing the expected range in dark green.
Since we chose to annotate anomalies earlier in our preferences, any anomalies in the data will appear on the graph with two vertical lines indicating the start and the end of the anomaly. In our graph below, the data is too low and the light green curve is way outside the normal range.
Hover the mouse over the anomaly to display a tooltip. The tooltip tells us when and why the anomaly occurred. In this example, the live data is too low. We can verify this because the tooltip also displays values in a legend which show that the data point isn’t within the expected range.
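The start and end markers can be thought of as the boundaries of a run of consecutive out-of-range points. The Python sketch below groups such points into intervals with a reason attached, similar in spirit to the tooltip. The function name `anomaly_intervals`, the fixed lower and upper bounds, and the sample points are all made up for illustration; this is not Loud ML's detection algorithm, which uses a learned range.

```python
def anomaly_intervals(points, lower, upper):
    """Group consecutive out-of-range points into (start, end, reason) tuples.

    `points` is a list of (timestamp, value) pairs in time order.
    """
    intervals = []
    current = None  # [start, end, reason] of the interval being built
    for timestamp, value in points:
        if value < lower:
            reason = "too low"
        elif value > upper:
            reason = "too high"
        else:
            reason = None
        if current is not None and current[2] == reason:
            current[1] = timestamp  # same anomaly continues: extend its end
            continue
        if current is not None:
            intervals.append(tuple(current))  # anomaly ended or changed type
            current = None
        if reason is not None:
            current = [timestamp, timestamp, reason]  # a new anomaly starts
    if current is not None:
        intervals.append(tuple(current))
    return intervals


# Timestamps are seconds from an arbitrary start; the range 0.5 to 2.0 is made up.
points = [(0, 1.0), (10, 0.2), (20, 0.1), (30, 1.1), (40, 2.5)]
print(anomaly_intervals(points, lower=0.5, upper=2.0))
```

Here the points at 10 and 20 seconds form one "too low" interval, and the point at 40 seconds forms a separate "too high" interval.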
The graph will continue to update with live data. If live anomaly detection is no longer needed, it’s easy to stop. Click on the Loud ML logo on the toolbar, then click on the Stop button. Now, any time you’re in the Chronograf Data Explorer window, you can use machine learning in just one click. Simply click on the 1-click ML button.
Loud ML also provides long-term predictive maintenance, so you’re always a step ahead. We’ll cover that in the next tutorial.
If you’d like to try live monitoring with Loud ML for yourself, go to http://loudml.io to download the free Community edition and watch your data in real time.