InfluxDB To Grafana : Visualizing Time Series Data in Real Time

Introduction to InfluxDB:

InfluxDB is an open source distributed time series database. According to Wikipedia, it is written in Go and optimized for fast, high-availability storage and retrieval of time series data in fields such as operations monitoring, application metrics, Internet of Things sensor data, and real-time analytics.

Time Series (TS) data:

TS data is data that you ask questions about over time, over short periods as well as long ones. Classic examples of time series data include:

  1. Stock trades and quotes over time in financial markets,
  2. Metrics data, for example data coming from servers, or application performance data,
  3. User analytics data,
  4. Sensor data, for example measurements coming from physical sensors.

Why is a time series database needed at all?

Some developers may argue that instead of using a time series database, we could just use a MySQL database with a time column and an index on it. Here are the reasons to use a time series database such as InfluxDB instead of a MySQL database:

1. Scale: Let’s understand this with a DevOps example. Say we have 2000 servers, VMs, containers or sensor units, each reporting 200 measurements, and we sample them once every 10 seconds. That gives 400,000 distinct series and 8,640 samples per series per day, for a total of 3,456,000,000 (approx. 3.5 billion) data points per day. Keeping all of that in a single MySQL table quickly becomes impractical.

2. Sharding Data: Time series data usually arrives at scale, which means it has to be stored in a distributed system. Even if you use a database like Cassandra, you still have to figure out how to shard your data, and you end up writing application-level code to do that.

3. Data Retention: It’s very common in time series to keep high-precision data for a short period of time and lower-precision data for months or years. With most other databases, you have to write application-level code to do that.

4. Rollups and Aggregation: If you have high-precision data every 10 seconds, you will want to roll it up into lower-precision views, say 10-minute or hourly, that you can keep for a long time. Again, you usually have to write your own code to do that.
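
The arithmetic behind the Scale item can be checked with a few lines of JavaScript. This is only a back-of-the-envelope sketch; the function names are made up for illustration.

```javascript
// Back-of-the-envelope sizing for the DevOps example above.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

// Number of distinct series: one per (host, measurement) pair.
function seriesCount(hosts, measurementsPerHost) {
  return hosts * measurementsPerHost;
}

// Total data points written per day at a fixed sampling interval.
function pointsPerDay(hosts, measurementsPerHost, sampleIntervalSeconds) {
  const samplesPerSeries = SECONDS_PER_DAY / sampleIntervalSeconds;
  return seriesCount(hosts, measurementsPerHost) * samplesPerSeries;
}

console.log(seriesCount(2000, 200));      // 400000
console.log(pointsPerDay(2000, 200, 10)); // 3456000000
```

So 2000 units with 200 measurements each, sampled every 10 seconds, produce 400,000 series and roughly 3.5 billion points per day.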

InfluxDB has a number of features that take care of all the concerns mentioned above for you automatically.
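
To make the retention and rollup items concrete, here is a minimal sketch of the kind of application-level code you would otherwise have to write yourself. It is plain JavaScript and does not use InfluxDB; the function names and the point shape ({ time, value }) are assumptions chosen for illustration.

```javascript
// Roll up raw samples into fixed-size time buckets by averaging.
// points: [{ time: <milliseconds since epoch>, value: <number> }]
function rollup(points, bucketMs) {
  const buckets = new Map();
  for (const { time, value } of points) {
    // Start of the bucket this sample falls into.
    const start = Math.floor(time / bucketMs) * bucketMs;
    const bucket = buckets.get(start) || { sum: 0, count: 0 };
    bucket.sum += value;
    bucket.count += 1;
    buckets.set(start, bucket);
  }
  return [...buckets.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([time, { sum, count }]) => ({ time, mean: sum / count }));
}

// Keep only the points that are not older than the retention window.
function applyRetention(points, nowMs, maxAgeMs) {
  return points.filter((p) => nowMs - p.time <= maxAgeMs);
}

// Three raw samples rolled up into 10-minute buckets.
const TEN_MINUTES = 10 * 60 * 1000;
const raw = [
  { time: 0, value: 2 },
  { time: 10000, value: 4 },
  { time: 600000, value: 10 },
];
console.log(rollup(raw, TEN_MINUTES));
```

Code like this also has to be scheduled, monitored and kept in sync with the stored data, which is the work a time series database is meant to take over.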

Features of InfluxDB:

1. SQL-style query language: This makes it easier to use and to work with.

2. Retention Policies: InfluxDB uses retention policies to handle your data retention periods automatically. You can have one area of the database where data is kept for 7 days, another where it is kept for, say, 6 months, and another where it is kept for 2 years.

3. Continuous Queries: This feature lets you run any query as a continuous query, which essentially means telling InfluxDB to run the query in the background and compute the results automatically. Rollups and aggregation can be done this way.

4. HTTP API with 2 endpoints: InfluxDB provides two simple endpoints where you specify the database name and the retention policy you are writing to or reading from.

Write endpoint syntax:

HTTP POST /write?db=mydb&rp=foo

Read endpoint syntax:

HTTP GET /query?db=mydb&rp=foo&q=querystring
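
As a rough illustration of the two endpoints, the sketch below builds the request URLs and one line of InfluxDB line protocol (the text body sent to /write). It assumes an InfluxDB 1.x server at http://localhost:8086; the helper names are hypothetical, and nothing is sent over the network.

```javascript
// Assumed local InfluxDB 1.x address; adjust for your setup.
const INFLUX_BASE = 'http://localhost:8086';

// URL for the write endpoint (used with HTTP POST).
function writeUrl(db, rp) {
  const url = new URL('/write', INFLUX_BASE);
  url.searchParams.set('db', db);
  url.searchParams.set('rp', rp);
  return url.toString();
}

// URL for the read endpoint (used with HTTP GET); the query is URL-encoded.
function queryUrl(db, rp, q) {
  const url = new URL('/query', INFLUX_BASE);
  url.searchParams.set('db', db);
  url.searchParams.set('rp', rp);
  url.searchParams.set('q', q);
  return url.toString();
}

// One line of line protocol: measurement,tag=value field=value
// Simplified: numeric fields only, no escaping of spaces or commas.
function toLine(measurement, tags, fields) {
  const join = (obj) =>
    Object.entries(obj).map(([k, v]) => `${k}=${v}`).join(',');
  const tagPart = Object.keys(tags).length ? ',' + join(tags) : '';
  return `${measurement}${tagPart} ${join(fields)}`;
}

console.log(writeUrl('mydb', 'foo')); // http://localhost:8086/write?db=mydb&rp=foo
console.log(toLine('cpu', { host: 'server01' }, { value: 0.64 })); // cpu,host=server01 value=0.64
```

The body of a write request is one or more such lines separated by newlines; when no timestamp is given, the server assigns the current time.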

Introduction to Grafana:

Grafana is a data visualization tool that provides ways to create, explore and share data in easy-to-understand graphical representations. It is mainly used to visualize time series data. It supports Graphite, ElasticSearch, Prometheus, InfluxDB, OpenTSDB and KairosDB.

Let us now build a demo in which a live stream of data arrives in a torrent client, is inserted into InfluxDB at regular intervals, and is then analyzed in real time and visualized in a Grafana dashboard.

Steps to Visualize Time Series Data in Grafana from InfluxDB:

1. Download a sample torrent file. I used the Ubuntu torrent file from the Ubuntu downloads page.

2. Run Deluge (download from http://deluge-torrent.org), a torrent client. We need to make Deluge accessible via its API, so open Preferences.

Then select Plugins from the left sidebar and enable WebUI.

Then select WebUI from the left sidebar and check the option to enable the web interface on port 8112.

This makes Deluge available in the browser at http://localhost:8112. Then add the Ubuntu torrent downloaded in step 1 to Deluge.

3. Once the torrent file is added, Deluge starts downloading the file.

4. We need to throttle the download in order to observe the data in Grafana later, so set the download speed to a low value (e.g. 5 KB/s).

5. Open another terminal window and start InfluxDB. The command to run InfluxDB:

influxd -config /usr/local/etc/influxdb.conf

Next, we create a database in InfluxDB named ‘deluge’. Here is the statement to run in the influx command-line shell:

create database deluge

Another way to run commands in InfluxDB is to call its HTTP API using curl:

curl -i -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE deluge"
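
The retention policies and continuous queries described earlier are created with InfluxQL statements sent to the same /query endpoint. The sketch below only builds such statements as strings, using InfluxDB 1.x syntax. The policy names, the continuous query name, the field "downloaded" and the measurement names are assumptions for illustration, not something the deluge2influx app sets up.

```javascript
// Build an InfluxQL statement that creates a retention policy.
// duration uses InfluxQL duration literals such as '7d' or '26w'.
function createRetentionPolicy(db, name, duration, isDefault = false) {
  const statement =
    `CREATE RETENTION POLICY "${name}" ON "${db}" ` +
    `DURATION ${duration} REPLICATION 1`;
  return isDefault ? `${statement} DEFAULT` : statement;
}

// Build an InfluxQL statement that creates a continuous query which
// averages one field into a lower-precision measurement.
function createRollupQuery(db, cqName, field, source, target, interval) {
  return (
    `CREATE CONTINUOUS QUERY "${cqName}" ON "${db}" BEGIN ` +
    `SELECT mean("${field}") AS "${field}" INTO ${target} FROM ${source} ` +
    `GROUP BY time(${interval}) END`
  );
}

// Hypothetical names: keep raw data for 7 days, keep 10-minute means longer.
console.log(createRetentionPolicy('deluge', 'one_week', '7d', true));
console.log(createRetentionPolicy('deluge', 'six_months', '26w'));
console.log(
  createRollupQuery(
    'deluge',
    'cq_10m',
    'downloaded',
    '"new_torrents"',
    '"deluge"."six_months"."torrents_10m"',
    '10m'
  )
);
```

Each printed statement could be passed as the q parameter in the same way as CREATE DATABASE above.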

6. Next, clone this repository: https://github.com/mvantassel/deluge2influx

A small change is needed in the file deluge2influx.js. The function writeToInflux should read as follows:

function writeToInflux(seriesName, values, tags, callback) {
    console.log("seriesName = ", seriesName);
    console.log("values = ", values);
    console.log("tags = ", tags);
    return influxClient.writePoints([{
        measurement: seriesName,
        fields: values,
        tags: tags
    }], callback)
}
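
To show the shape of the object that writeToInflux hands to writePoints, here is a small sketch that maps one torrent status object to a point. The status keys used here (name, state, total_done, download_payload_rate) and the helper name torrentToPoint are assumptions for illustration; check the repository for the fields the app actually writes.

```javascript
// Map one torrent status object to the point shape used above:
// { measurement, fields, tags }.
// The torrent keys below are assumed, not taken from the repository.
function torrentToPoint(seriesName, torrent) {
  return {
    measurement: seriesName,
    // Tags are indexed strings, suited to filtering and grouping.
    tags: {
      name: String(torrent.name),
      state: String(torrent.state),
    },
    // Fields hold the measured values; fall back to 0 for missing numbers.
    fields: {
      downloaded: Number(torrent.total_done) || 0,
      download_rate: Number(torrent.download_payload_rate) || 0,
    },
  };
}

const point = torrentToPoint('new_torrents', {
  name: 'ubuntu.iso',
  state: 'Downloading',
  total_done: 1048576,
  download_payload_rate: 5120,
});
console.log(point);
```

Keeping names and states as tags and the byte counters as fields means the counters can be aggregated later, while the tags can be used to pick out a single torrent.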

The database name, along with the username and password used to log in to InfluxDB, has to be specified in the same file, ‘deluge2influx.js’:

const influxClient = new Influx.InfluxDB({
    host: 'localhost',
    port: '8086',
    protocol: 'http',
    username: 'root',
    password: 'root',
    database: 'deluge'
})

Also, we need to specify the settings used to connect to the Deluge client in the following code in the same file:

const delugeConfig = {
    host: process.env.DELUGE_HOST || 'localhost',
    protocol: process.env.DELUGE_PROTOCOL || 'http',
    port: process.env.DELUGE_PORT || 8112,
    password: process.env.DELUGE_PASSWORD || 'deluge'
};
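
Environment variables are always strings, so a port read from DELUGE_PORT is a string while the fallback 8112 is a number. The sketch below is one possible way to normalise this; loadDelugeConfig and delugeUrl are hypothetical helpers, not part of the repository, and the /json path is assumed to be the Deluge WebUI JSON-RPC endpoint.

```javascript
// Read Deluge connection settings from an environment-like object,
// using the same defaults as the config above.
function loadDelugeConfig(env) {
  const port = Number.parseInt(env.DELUGE_PORT, 10);
  return {
    host: env.DELUGE_HOST || 'localhost',
    protocol: env.DELUGE_PROTOCOL || 'http',
    // Fall back to 8112 when the variable is missing or not a number.
    port: Number.isInteger(port) ? port : 8112,
    password: env.DELUGE_PASSWORD || 'deluge',
  };
}

// Assumed WebUI JSON-RPC endpoint for the configured host.
function delugeUrl(config) {
  return `${config.protocol}://${config.host}:${config.port}/json`;
}

const config = loadDelugeConfig(process.env);
console.log(delugeUrl(config));
```

With this in place the app could be pointed at another host by running it as, for example, DELUGE_HOST=192.168.1.20 node deluge2influx.js.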

Once these updates are made, run the app using:

node deluge2influx.js

Here is the screenshot of the terminal once the app is running.

7. This app reads the status of the data being downloaded by the Deluge torrent client and writes it into the specified InfluxDB database. The screenshot below shows this happening through regular POST requests that write data into InfluxDB.

8. Once the app is running, we can check in InfluxDB whether data is actually arriving in the database from Deluge.

The above screenshot shows that data is arriving in the deluge database and is stored in the measurements (InfluxDB's equivalent of tables) named new_torrent and new_torrents.
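
The same check can be done over HTTP, for example by sending SHOW MEASUREMENTS to the /query endpoint. The sketch below only handles the JSON that comes back: it flattens the results/series/columns/values layout returned by InfluxDB 1.x into plain row objects. The sample response is hand-written for illustration and the function name is hypothetical.

```javascript
// Flatten an InfluxDB 1.x /query JSON response into row objects,
// one object per value row, keyed by column name.
function seriesToRows(response) {
  const rows = [];
  for (const result of response.results || []) {
    for (const series of result.series || []) {
      for (const values of series.values || []) {
        const row = { measurement: series.name };
        series.columns.forEach((column, i) => {
          row[column] = values[i];
        });
        rows.push(row);
      }
    }
  }
  return rows;
}

// Hand-written sample of what SHOW MEASUREMENTS might return here.
const sampleResponse = {
  results: [
    {
      statement_id: 0,
      series: [
        {
          name: 'measurements',
          columns: ['name'],
          values: [['new_torrent'], ['new_torrents']],
        },
      ],
    },
  ],
};

console.log(seriesToRows(sampleResponse).map((row) => row.name));
```

An empty result (a statement with no series key) simply yields an empty array, which is a quick sign that no data has arrived yet.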

9. Setup Grafana:

Now let us visualize the data that is being written into InfluxDB at regular intervals. Download Grafana and set it up as described on its website: http://docs.grafana.org/installation/

To run Grafana as a service (on macOS with Homebrew), run this command:

brew services start grafana

You should have Grafana up and running at http://localhost:3000/login

10. We need to create a new dashboard and add a new user to it.

11. We first need to install the InfluxDB plugin in Grafana.

12. We then need to add an SQL-style (InfluxQL) query to read data from InfluxDB at regular intervals. The format of the query is shown in the screenshot below.

13. Then update the data source details to match the InfluxDB database name, as shown in the screenshot below.

Enter the login credentials for InfluxDB in the dashboard.

14. We then need to specify the query in the Metrics tab of the dashboard.

Choose ‘distinct’ as the aggregation for the field ‘downloaded’.

15. Also specify the data source created earlier.

You may want to check the settings of the data source to be sure they match what we entered in the dashboard.
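
For reference, the query editor turns these choices into an InfluxQL query. The sketch below builds a query of the form Grafana typically generates; $timeFilter and $__interval are Grafana template variables that it fills in from the dashboard time range (older Grafana versions use $interval instead). The measurement name new_torrents is an assumption about where the ‘downloaded’ field is stored.

```javascript
// Build the kind of InfluxQL query that Grafana's query editor generates.
// $timeFilter and $__interval are replaced by Grafana at query time.
function grafanaInfluxQuery(measurement, field, aggregation = 'distinct') {
  return (
    `SELECT ${aggregation}("${field}") FROM "${measurement}" ` +
    'WHERE $timeFilter GROUP BY time($__interval) fill(null)'
  );
}

// Assumed measurement name; the field comes from the step above.
console.log(grafanaInfluxQuery('new_torrents', 'downloaded'));
```

Switching the editor to raw query mode shows the generated text, which is a convenient way to compare it with a sketch like this one.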

16. Once everything is set up properly, we should see the visualization appear in the dashboard as shown in the screenshot below.

These are some screenshots of what the visualized data looks like. The dashboard keeps pulling data out of InfluxDB at a regular interval of 1 minute, as specified while building the query in the dashboard.

Hope this tutorial helps you get a better understanding of InfluxDB, Grafana and Time Series data.