How To Use Elasticsearch and Kibana to Visualize Data

Published in Qbox Search as a Service, Jan 25, 2019

Elasticsearch mappings allow storing your data in formats that can be easily translated into meaningful visualizations capturing multiple complex relationships in your data. In this tutorial, we’ll show how to create data visualizations with Kibana, a part of ELK stack that makes it easy to search, view, and interact with data stored in Elasticsearch indices. We’ll walk you through basic data visualization types including line charts, area charts, pie charts, and time series, after which you’ll be ready to design a custom visualization of any complexity.

Get Data

For this tutorial, we’ll be using data supplied by Metricbeat, a lightweight shipper that you install on your server to periodically collect metrics from the OS and from services running on it. Metricbeat takes the metrics and sends them to the output you specify (in our case, a Qbox-hosted Elasticsearch cluster). Metricbeat currently supports system statistics and a wide variety of metrics from popular software like MongoDB, Apache, Redis, MySQL, and many more. Data from these services includes diverse fields and parameters, which makes Metricbeat a great tool for illustrating the power of Kibana data visualization.

To start using Metricbeat data, you need to install and configure the following software:

Elasticsearch for storage and indexing of data. Follow this installation guide to install Elasticsearch. For this tutorial, we’re using a Qbox-hosted Elasticsearch cluster.
Kibana. A Qbox-hosted Elasticsearch cluster ships with Kibana. When provisioning your cluster, just specify that you want to install it.

To install Metricbeat from a deb package on a Debian-based Linux system, run the following commands:

curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-6.2.3-amd64.deb 
sudo dpkg -i metricbeat-6.2.3-amd64.deb

Before using Metricbeat, configure the shipper in the metricbeat.yml file, usually located in the /etc/metricbeat/ folder on Linux distributions. At a minimum, the configuration file needs to specify the Kibana and Elasticsearch hosts to which you want to send data, and to enable the modules from which Metricbeat should collect it. See the Metricbeat documentation for more details about configuration.
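To make the configuration step more concrete, here is a minimal sketch that renders a bare-bones metricbeat.yml. The host names are placeholders, and the selected system metricsets and the 10-second period are assumptions made for this illustration rather than the values from our setup. The helper name render_metricbeat_config is ours, not part of Metricbeat.

```python
def render_metricbeat_config(es_host, kibana_host, metricsets, period="10s"):
    """Return a minimal metricbeat.yml as text (a sketch, not a complete config)."""
    lines = [
        "metricbeat.modules:",
        "- module: system",
        f"  period: {period}",
        "  metricsets:",
    ]
    # One YAML list entry per metricset we want the system module to collect.
    lines += [f"    - {name}" for name in metricsets]
    lines += [
        "",
        "setup.kibana:",
        f'  host: "{kibana_host}"',
        "",
        "output.elasticsearch:",
        f'  hosts: ["{es_host}"]',
    ]
    return "\n".join(lines) + "\n"


# Placeholder hosts: replace them with your own cluster endpoints.
config_text = render_metricbeat_config(
    es_host="es.example.com:9200",
    kibana_host="kibana.example.com:5601",
    metricsets=["cpu", "load", "process"],
)
print(config_text)
```

The cpu, load, and process metricsets are the ones that feed the charts later in this tutorial. A real configuration will usually also need credentials and other options described in the Metricbeat documentation.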

Once all configuration edits are made, start the Metricbeat service with the following command:

sudo service metricbeat start

Metricbeat will start periodically collecting and shipping data about your system and services to Elasticsearch. You can now use Kibana to display this data, but first you must add a metricbeat-* index pattern in the Kibana management panel.
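If you are unsure what the wildcard in the index pattern covers, the sketch below approximates the matching with Python’s fnmatch module. The index names are hypothetical and assume Metricbeat’s default daily naming scheme (metricbeat-<version>-<date>). Kibana’s own matching is done by Elasticsearch, so treat this only as an illustration.

```python
from fnmatch import fnmatchcase

INDEX_PATTERN = "metricbeat-*"

# Hypothetical index names, for illustration only.
index_names = [
    "metricbeat-6.2.3-2019.01.24",
    "metricbeat-6.2.3-2019.01.25",
    "filebeat-6.2.3-2019.01.25",
    ".kibana",
]

# Keep only the indices that the wildcard pattern would cover.
matched = [name for name in index_names if fnmatchcase(name, INDEX_PATTERN)]
print(matched)
```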

After this is done, you’ll see the index pattern along with a list of the fields that Metricbeat sends to your Elasticsearch instance.

That’s it! You can now visualize Metricbeat data using Kibana’s rich visualization features.

Kibana’s Visualize Module

Kibana visualizations use Elasticsearch documents and their respective fields as inputs, and Elasticsearch aggregations and metrics as utility functions to extract and process that data. Kibana supports numerous visualization types, including time series with Timelion and Visual Builder, various basic charts (e.g., area charts, heat maps, horizontal bar charts, line charts, and pie charts), tables, gauges, coordinate and region maps, and tag clouds, to name a few. All available visualization types can be accessed under the Visualize section of the Kibana dashboard.

Although the steps needed to create a visualization might differ depending on the visualization you want to produce, you should know basic definitions, metrics, and aggregations applied in most visualization types.

The first step in creating a standard Kibana visualization like a line chart or bar chart is to select a metric that defines the value axis (usually the Y-axis). Kibana supports a number of Elasticsearch aggregations to represent your data on this axis:

Count: Returns a count of the elements in the selected index pattern
Average: Returns the average of a numeric field selected in the drop-down
Min: Returns the minimum value of a numeric field selected in the drop-down
Max: Returns the maximum value of a numeric field selected in the drop-down
Unique Count: This cardinality aggregation returns the number of unique values of a field
Standard Deviation: Returns the standard deviation of data in a numeric field

These are just a few of the metric aggregations available. For more metrics and aggregations, consult the Kibana documentation.
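For readers who want to see what these metrics look like at the Elasticsearch level, the following sketch maps each Kibana metric name from the list above to an approximate aggregation body. The helper name metric_agg is ours, and the exact request Kibana generates may differ in detail.

```python
import json


def metric_agg(kind, field=None):
    """Map a Kibana metric name to an approximate Elasticsearch aggregation body."""
    es_types = {
        "Average": "avg",
        "Min": "min",
        "Max": "max",
        "Unique Count": "cardinality",
        "Standard Deviation": "extended_stats",
    }
    if kind == "Count":
        # Count needs no aggregation: every bucket already reports doc_count.
        return None
    if field is None:
        raise ValueError(f"{kind} requires a field")
    return {es_types[kind]: {"field": field}}


print(json.dumps(metric_agg("Average", "system.load.1")))
print(json.dumps(metric_agg("Unique Count", "system.process.name")))
```

Note that Standard Deviation is served by the extended_stats aggregation, which returns the standard deviation alongside other statistics.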

Kibana also supports bucket aggregations, which group the documents in your index into buckets based on certain criteria (e.g., a range). The buckets are usually displayed along the X-axis of your chart, which is normally the buckets axis. The X-axis supports the following aggregations, for which you can find additional information in the Elasticsearch documentation:

Date Histogram: This aggregation is built from a date field and is organized by date. You can choose a preset interval for your historical data or define a custom one.
Histogram: Builds a standard histogram from a numeric field. You need to specify an integer interval for this field.
Range: Range aggregation allows specifying ranges of values for a numeric field.
IPv4 Range: Allows specifying ranges of IPv4 addresses.
Terms: With a terms aggregation, you can specify the top or bottom n elements of a field to display, ordered by count or any other custom metric.
Filters: Kibana supports filters to specify rules for querying your Elasticsearch documents.
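As a rough guide, the bucket aggregations listed above correspond to Elasticsearch request bodies like the ones in this sketch. The intervals, ranges, and the client_ip field are hypothetical values chosen for illustration. The date histogram uses the interval parameter from the Elasticsearch 6.x line this tutorial is based on; newer versions use fixed_interval or calendar_interval instead.

```python
import json

# Approximate Elasticsearch bodies for each Kibana bucket type.
bucket_aggs = {
    "Date Histogram": {"date_histogram": {"field": "@timestamp", "interval": "1m"}},
    "Histogram": {"histogram": {"field": "system.load.1", "interval": 1}},
    "Range": {
        "range": {
            "field": "system.cpu.user.pct",
            "ranges": [{"to": 0.5}, {"from": 0.5}],
        }
    },
    "IPv4 Range": {
        # client_ip is a hypothetical field name.
        "ip_range": {
            "field": "client_ip",
            "ranges": [{"from": "10.0.0.0", "to": "10.0.0.255"}],
        }
    },
    "Terms": {
        "terms": {"field": "system.process.name", "size": 5, "order": {"_count": "desc"}}
    },
    "Filters": {
        "filters": {"filters": {"high_load": {"range": {"system.load.1": {"gte": 1}}}}}
    },
}


def search_body(bucket_name):
    """Wrap one bucket aggregation in a search body that returns no hits."""
    return {"size": 0, "aggs": {"buckets": bucket_aggs[bucket_name]}}


print(json.dumps(search_body("Terms"), indent=2))
```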

After you specify aggregations for the X-axis, you can add sub-aggregations that refine the visualization. With this option, you can create charts with multiple buckets and aggregations of data. After all metrics and aggregations are defined, you can also customize the chart using custom labels, colors, and other useful features.

Making Time Series with Timelion

Timelion is the time series composer for Kibana that allows combining totally independent data sources in a single visualization using chainable functions. Timelion uses a simple expression language that allows retrieving time series data, making complex calculations and chaining additional visualizations.

In the example below, we combined a time series of the average CPU time spent in kernel space (system.cpu.system.pct) during the specified period of time with the same metric taken with a 20-minute offset. The expression below chains two .es() functions that define the ES index from which to retrieve data, a time field to use for your time series, a field to which to apply your metric (system.cpu.system.pct), and an offset value. Chaining these two functions allows visualizing dynamics of the CPU usage over time.

.es(index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.system.pct'), .es(offset=-20m,index=metricbeat-*, timefield='@timestamp', metric='avg:system.cpu.system.pct')

We suggest that you experiment with Timelion by making similar comparisons for the percentage of CPU time spent in user space, on low-priority processes, or idle, as well as for the many other metrics shipped by your Metricbeat instance.
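To make such experiments less error-prone, you could generate the Timelion expressions instead of typing them by hand. The sketch below is one way to do that. The helper names es_series and compare_with_offset are ours, and the output is meant to be pasted into the Timelion expression bar.

```python
def es_series(metric, index="metricbeat-*", timefield="@timestamp", offset=None):
    """Build a single Timelion .es() call as a string."""
    args = []
    if offset is not None:
        args.append(f"offset={offset}")
    args += [f"index={index}", f"timefield='{timefield}'", f"metric='{metric}'"]
    return ".es(" + ", ".join(args) + ")"


def compare_with_offset(field, offset="-20m"):
    """Pair the average of a field with the same series shifted by an offset."""
    metric = f"avg:{field}"
    return ", ".join([es_series(metric), es_series(metric, offset=offset)])


# Metricbeat CPU fields for user space, low-priority (nice) processes, and idle time.
for field in ("system.cpu.user.pct", "system.cpu.nice.pct", "system.cpu.idle.pct"):
    print(compare_with_offset(field))
```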

Visual Builder

A powerful alternative to Timelion for building time series visualizations is the Visual Builder, recently added to Kibana as a native module. Like Timelion, Time Series Visual Builder enables you to combine multiple aggregations and pipeline them to display complex data in a meaningful way. With Visual Builder, however, you define metrics and aggregations in a simple UI instead of chaining functions manually as in Timelion.

In the example below, we combine six time series that display the CPU usage in various spaces including user space, kernel space, CPU time spent on low-priority processes, time spent on handling hardware and software interrupts, and percentage of time spent in wait (on disk). To produce time series for each parameter, we define a metric that includes an aggregation type (e.g., average) and the field name (e.g., system.cpu.user.pct) for that parameter. For each metric, we can also specify a label to make our time series visualization more readable. With the Visual Builder, you can even create annotations that will attach additional data sources like system messages emitted at specific intervals to our Time Series visualization.
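For comparison, here is a sketch of a single Elasticsearch request that would return roughly the same six series. The mapping of series to Metricbeat fields is our reading of the description above, the one-minute interval is an assumption, and Visual Builder’s actual requests are structured differently.

```python
import json

# Aggregation name -> Metricbeat field (our assumed mapping of the six series).
CPU_FIELDS = {
    "user": "system.cpu.user.pct",
    "kernel": "system.cpu.system.pct",
    "nice": "system.cpu.nice.pct",
    "irq": "system.cpu.irq.pct",
    "softirq": "system.cpu.softirq.pct",
    "iowait": "system.cpu.iowait.pct",
}


def cpu_breakdown_body(interval="1m"):
    """One date histogram with an average sub-aggregation per CPU field."""
    series = {name: {"avg": {"field": field}} for name, field in CPU_FIELDS.items()}
    return {
        "size": 0,
        "aggs": {
            "over_time": {
                # "interval" is the Elasticsearch 6.x parameter name.
                "date_histogram": {"field": "@timestamp", "interval": interval},
                "aggs": series,
            }
        },
    }


print(json.dumps(cpu_breakdown_body(), indent=2))
```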

In addition to time series visualizations, Visual Builder supports other visualization types such as Metric, Top N, Gauge, and Markdown, which automatically convert our data into their respective visualization formats. For example, in the image below we’ve created a simple Top N visualization that displays the spaces where our CPU spends the most time.

In sum, Visual Builder is a great sandbox for experimenting with your data, and you can use it to produce time series, gauges, metrics, and Top N lists.

Creating a Line Chart

A line chart is a basic type of chart that represents data as a series of data points connected by straight line segments. In the image below, you can see a line chart of the system load over a 15-minute time span. To create this chart, we used an average aggregation on the Y-axis for the system.load.1 field, which holds the system load average over the last minute. You are not limited to the average aggregation, however, because Kibana supports a number of other Elasticsearch aggregations including median, standard deviation, min, max, and percentiles, to name a few. You can experiment with them to see which ones suit the data you want to visualize.

After defining the metric for the Y-axis, specify the parameters for the X-axis. In this example, we use a date histogram aggregation with the default @timestamp field as the source of timestamps. After entering the parameters, click the 'play' button to generate the line chart visualization with all axes and labels automatically added. That's it! Now save the line chart to the dashboard by clicking the 'Save' link in the top menu.
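The line chart settings above translate into an Elasticsearch request along the lines of the sketch below. The 30-second interval, the aggregation names x_axis and y_axis, and the sample response values are assumptions made for illustration. Kibana builds and runs the real request for you.

```python
def line_chart_body(field="system.load.1", window="15m", interval="30s"):
    """Average of a field per time bucket over a recent time window."""
    return {
        "size": 0,
        "query": {"range": {"@timestamp": {"gte": f"now-{window}", "lte": "now"}}},
        "aggs": {
            "x_axis": {
                "date_histogram": {"field": "@timestamp", "interval": interval},
                "aggs": {"y_axis": {"avg": {"field": field}}},
            }
        },
    }


def to_points(response):
    """Turn the aggregation response into (timestamp, value) pairs for plotting."""
    buckets = response["aggregations"]["x_axis"]["buckets"]
    return [(b["key_as_string"], b["y_axis"]["value"]) for b in buckets]


# Made-up response fragment with the shape Elasticsearch returns.
sample_response = {
    "aggregations": {
        "x_axis": {
            "buckets": [
                {"key_as_string": "2019-01-25T10:00:00.000Z", "y_axis": {"value": 0.5}},
                {"key_as_string": "2019-01-25T10:00:30.000Z", "y_axis": {"value": 0.75}},
            ]
        }
    }
}
print(to_points(sample_response))
```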

Creating a Pie Chart

A pie chart or a circle chart is a visualization type that is divided into different slices to illustrate numerical proportion.

In this example, we’ll be using a split slices chart to visualize the CPU time usage by the processes running on our system. The first step in creating our pie chart is to select a metric that defines how a slice’s size is determined. Kibana pie chart visualizations provide three options for this metric: count, sum, and unique count aggregations (count and unique count are discussed above). For our goal, we are interested in the sum aggregation for the system.process.cpu.total.pct field, which describes the percentage of CPU time spent by the process since the last update. After you specify the metric, you can also create a custom label for this value (e.g., Total CPU usage by the process).

The next step is to define the buckets. We will use a split slices chart, which is a convenient way to visualize how parts make up a meaningful whole. For our buckets, we need to select a Terms aggregation that specifies the top or bottom n elements of a given field to display, ordered by some metric. In our case, we’ll display the top 7 processes running on our system (the system.process.name field) in terms of CPU time usage. The metric used to order our Terms aggregation will be the sum of the total CPU time usage by an individual process, as defined above. That's it!
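Expressed as an Elasticsearch request, the metric and bucket settings for this pie chart look roughly like the sketch below. The aggregation names slices and slice_size are ours, and the request Kibana actually sends may differ in detail.

```python
import json


def pie_chart_body(top_n=7):
    """Top processes by summed CPU usage: one terms bucket per pie slice."""
    return {
        "size": 0,
        "aggs": {
            "slices": {
                "terms": {
                    "field": "system.process.name",
                    "size": top_n,
                    # Order the buckets by the sum metric defined below.
                    "order": {"slice_size": "desc"},
                },
                "aggs": {
                    "slice_size": {"sum": {"field": "system.process.cpu.total.pct"}}
                },
            }
        },
    }


print(json.dumps(pie_chart_body(), indent=2))
```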

Now, as always, click play to see the resulting pie chart.

As you see, Kibana automatically produced seven slices for the top seven processes in terms of CPU time usage. The size of each slice represents this value, which is the highest for supergiant and chrome processes in our case. We can now save the created pie chart to the dashboard visualizations for later access.

Note: when creating pie charts, remember that pie slices should sum up to a meaningful whole. In our case, this rule is followed: the whole is the sum of the CPU time usage by the top seven processes running on our system.
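The arithmetic behind the slices is simple: each slice is a bucket’s sum divided by the total across all displayed buckets. Here is a small sketch with made-up sums for three processes (the numbers are not taken from our cluster).

```python
def slice_shares(sums):
    """Convert per-process sums into fractions of the displayed whole."""
    total = sum(sums.values())
    if total == 0:
        # Avoid dividing by zero when no process reported any CPU time.
        return {name: 0.0 for name in sums}
    return {name: value / total for name, value in sums.items()}


# Hypothetical summed CPU usage per process, for illustration only.
sample_sums = {"supergiant": 4.0, "chrome": 4.0, "metricbeat": 2.0}
shares = slice_shares(sample_sums)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Because the shares are computed only over the displayed buckets, they always add up to 100 percent of the chart, not of the machine’s total CPU time.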

Making an Area Chart

Area charts are just like line charts in that they represent the change in one or more quantities over time. The difference is, however, that area charts have the area between the X-axis and the line filled with color or shading.

In Kibana, the area chart’s Y-axis is the metrics axis. It supports a number of aggregation types such as count, average, sum, min, max, percentile, and more. In the example below, we drew an area chart that displays the percentage of CPU time usage by individual processes running on our system.

The metric defined for the Y-axis is the average of the system.process.cpu.total.pct field, which can exceed 100 percent if your computer has a multi-core processor. The next step is to specify the X-axis aggregation and create individual buckets. For the X-axis, we use a Date Histogram aggregation on the @timestamp field with the auto interval, which came out to 30 seconds in our case. Alternatively, you can select intervals ranging from milliseconds to years, or define your own interval.

Once we’ve specified the Y-axis and X-axis aggregations, we can now define sub-aggregations to refine the visualization. For this example, we’ve selected split series, a convenient way to represent the quantity change over time. Now, in order to represent the individual process, we define the “Terms” sub-aggregation on the field system.process.name ordered by the previously-defined CPU usage metric. In this bucket, we can also select the number of processes to display. That's it! Now we can save our area chart visualization of the CPU usage by an individual process to the dashboard.
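The nesting of bucket and metric aggregations in this chart can be sketched as an Elasticsearch request like the one below. This is one possible arrangement, with the Terms bucket outermost so that it can be ordered by the CPU metric. The aggregation names, the top_n value, and the 30-second interval are assumptions, and Kibana’s generated request may nest things differently.

```python
import json


def avg_cpu():
    """Average CPU usage metric (a fresh dict on each call)."""
    return {"avg": {"field": "system.process.cpu.total.pct"}}


def area_chart_body(top_n=5, interval="30s"):
    """Per-process CPU usage over time: terms -> date_histogram -> avg."""
    return {
        "size": 0,
        "aggs": {
            "per_process": {
                "terms": {
                    "field": "system.process.name",
                    "size": top_n,
                    # Pick the top processes by their overall average CPU usage.
                    "order": {"cpu": "desc"},
                },
                "aggs": {
                    "cpu": avg_cpu(),
                    "over_time": {
                        "date_histogram": {"field": "@timestamp", "interval": interval},
                        "aggs": {"cpu": avg_cpu()},
                    },
                },
            }
        },
    }


print(json.dumps(area_chart_body(), indent=2))
```

Each terms bucket becomes one series in the chart, and each date histogram bucket inside it becomes one point of that series.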

Conclusion

Kibana, backed by Elasticsearch, makes data visualization genuinely fun. Recent Kibana versions ship with a number of convenient templates and visualization types as well as the native Visual Builder. With these features, you can construct anything from a line chart to a tag cloud, leveraging Elasticsearch’s rich aggregation types and metrics.

In upcoming tutorials, we will discuss more visualization options in Kibana, including coordinate and region maps and tag clouds.

Originally published at qbox.io.