Data Visualization with the ELK Stack

Matthijs Mali
Nov 6, 2017 · 7 min read

This tutorial is a follow-up to another tutorial. It does not cover installing or configuring the ELK stack; instead, it teaches you how to get a CSV file into Logstash. I hope this guide is helpful. If you have any questions, feel free to reach out to me.

If you know what a terminal is, what the ls command does, and have seen at least a little software development, you are the target audience :)

Time to finish this tutorial: approx. 20 to 30 minutes.


Throwing logfiles into Elasticsearch is one thing, but how about using Kibana to create dashboards that explore other types of datasets? This short tutorial covers how to do that. Try to read through and understand what is happening; do not just lazily copy-paste.

Interesting datasets can be found all over the internet; Kaggle, for example, is a good place to start looking.

When looking for an interesting open dataset, keep in mind that you have to import it into Elasticsearch. For me, that means a few constraints: I want CSV files with a total of fewer than 20k rows, since I am running a small Elasticsearch instance in Docker.

Dataset 1: Fifa17 Players ⚽

I found a nicely prepared single-file CSV on Kaggle. For the purposes of this tutorial, a friend of mine hosted the CSV file on his GitHub.

Downloading the CSV

First, let’s open up a shell inside our ELK Docker container. If you don’t have this container, please see my other tutorial on Docker and ELK.

docker exec -it elkx /bin/bash -l

Create a new directory where we will store the CSV file, and hop into that directory.

mkdir /var/csv/
cd /var/csv/

Next, use curl to download the CSV.

curl -O https://raw.githubusercontent.com/jwitteveen/parkeerplaats/master/FullData.csv

Note the -O flag: it saves the file into the current directory under its remote name. If you leave it out, 17k lines of Fifa players will come swooping through your terminal ;)
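Before changing anything, it can be useful to take a quick look at what was just downloaded. A couple of plain shell commands to sanity-check the row count and peek at the header row we are about to strip:

wc -l FullData.csv     # should report roughly 17k rows
head -1 FullData.csv   # prints the header row we are about to remove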

For our CSV to work, we need to remove the header row. Use Vim to remove the first line of the file.

vim FullData.csv

With Vim open, make sure the cursor is on the first line (which it is by default) and press Shift+D. This deletes the contents of the line, leaving an empty line behind. Press i to enter insert mode, press delete once to remove the empty line, then press Escape to leave insert mode. Now you can :wq out of Vim.
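If you would rather not open an editor for this, the same result can be achieved non-interactively. A small sketch using standard shell tools (the second variant writes to a new file; the name FullData_noheader.csv is just an example):

sed -i '1d' FullData.csv                          # delete the first line in place
tail -n +2 FullData.csv > FullData_noheader.csv   # or: copy everything except the first line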


Preparing Logstash

Now that we have a 17k-line CSV file, it’s time to prepare Logstash. Create a new configuration file in the Logstash conf.d directory.

vim /etc/logstash/conf.d/01-fifaplayers.conf

Step by step, we will fill this file with the right content, starting with the input.

input {
  file {
    path => "/var/csv/FullData.csv"
    start_position => "beginning"
  }
}

This tells Logstash where to get its input: from the file located at path, indexed from the beginning. For more information, see the Logstash file input plugin documentation.
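One caveat worth knowing about the file input: it remembers how far it has read in a so-called sincedb file, so after a failed first attempt Logstash may decide it has already processed the CSV and silently skip it on restart. While experimenting, a common trick (not strictly required for this tutorial) is to point the sincedb at /dev/null so the file is re-read every time:

input {
  file {
    path => "/var/csv/FullData.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"   # forget read positions between restarts, handy while testing
  }
}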

Next, we need to filter the data. Luckily, Logstash has a great csv filter plugin.

filter {
  csv {
    columns => ["Name","Nationality","National_Position","National_Kit","Club","Club_Position","Club_Kit","Club_Joining","Contract_Expiry","Rating","Height","Weight","Preffered_Foot","Birth_Date","Age","Preffered_Position","Work_Rate","Weak_foot","Skill_Moves","Ball_Control","Dribbling","Marking","Sliding_Tackle","Standing_Tackle","Aggression","Reactions","Attacking_Position","Interceptions","Vision","Composure","Crossing","Short_Pass","Long_Pass","Acceleration","Speed","Stamina","Strength","Balance","Agility","Jumping","Heading","Shot_Power","Finishing","Long_Shots","Curve","Freekick_Accuracy","Penalties","Volleys","GK_Positioning","GK_Diving","GK_Kicking","GK_Handling","GK_Reflexes"]
    separator => ","
  }
  # more stuff is coming here...
}

So, we tell Logstash to use the csv filter and pass it all the column names plus the separator character.

The # more stuff is coming here... part is going to be quite a long strip of code. In short, the mutate plugin allows you to change the datatypes of certain fields. This is important because, by default, all data from the CSV will be stored as strings, which makes it impossible to do calculations (averages and the like) in Kibana.

Okay, let’s build the mutate block.

mutate {
  convert => {"Contract_Expiry" => "float"}
  convert => {"Rating" => "integer"}
  # more to come...
}

Quite simple, right? The mutate plugin uses the convert option: you give it the original field name together with the datatype you want it to become. This has to be done for all the columns.

Quick tip: if you ever need to do this again for a lot of columns, use the search-and-replace feature of Notepad++, Sublime or Atom on the column list. Search for "," and replace it with " => "integer"} convert => {", then add the line breaks, fix the first and last entries, and adjust the few fields that are not integers.
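If you kept a copy of the download that still has its header row (the file we edited above no longer does; FullData_withheader.csv below is just an example name), you can also generate the convert lines straight from the header. A rough one-liner, after which you would still correct the handful of non-integer fields by hand:

head -1 FullData_withheader.csv | tr ',' '\n' | sed 's/.*/convert => {"&" => "integer"}/'
# prints one convert => {"Column_Name" => "integer"} line per column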

In the end, it should look something like this:

filter {
  csv {
    columns => ["Name","Nationality","National_Position","National_Kit","Club","Club_Position","Club_Kit","Club_Joining","Contract_Expiry","Rating","Height","Weight","Preffered_Foot","Birth_Date","Age","Preffered_Position","Work_Rate","Weak_foot","Skill_Moves","Ball_Control","Dribbling","Marking","Sliding_Tackle","Standing_Tackle","Aggression","Reactions","Attacking_Position","Interceptions","Vision","Composure","Crossing","Short_Pass","Long_Pass","Acceleration","Speed","Stamina","Strength","Balance","Agility","Jumping","Heading","Shot_Power","Finishing","Long_Shots","Curve","Freekick_Accuracy","Penalties","Volleys","GK_Positioning","GK_Diving","GK_Kicking","GK_Handling","GK_Reflexes"]
    separator => ","
  }
  mutate {
    convert => {"Contract_Expiry" => "float"}
    convert => {"Rating" => "integer"}
    convert => {"Height" => "integer"}
    convert => {"Weight" => "integer"}
    convert => {"Age" => "integer"}
    convert => {"Weak_foot" => "integer"}
    convert => {"Skill_Moves" => "integer"}
    convert => {"Ball_Control" => "integer"}
    convert => {"Dribbling" => "integer"}
    convert => {"Marking" => "integer"}
    convert => {"Sliding_Tackle" => "integer"}
    convert => {"Standing_Tackle" => "integer"}
    convert => {"Aggression" => "integer"}
    convert => {"Reactions" => "integer"}
    convert => {"Attacking_Position" => "integer"}
    convert => {"Interceptions" => "integer"}
    convert => {"Vision" => "integer"}
    convert => {"Composure" => "integer"}
    convert => {"Crossing" => "integer"}
    convert => {"Short_Pass" => "integer"}
    convert => {"Long_Pass" => "integer"}
    convert => {"Acceleration" => "integer"}
    convert => {"Speed" => "integer"}
    convert => {"Stamina" => "integer"}
    convert => {"Strength" => "integer"}
    convert => {"Balance" => "integer"}
    convert => {"Agility" => "integer"}
    convert => {"Jumping" => "integer"}
    convert => {"Heading" => "integer"}
    convert => {"Shot_Power" => "integer"}
    convert => {"Finishing" => "integer"}
    convert => {"Long_Shots" => "integer"}
    convert => {"Curve" => "integer"}
    convert => {"Freekick_Accuracy" => "integer"}
    convert => {"Penalties" => "integer"}
    convert => {"Volleys" => "integer"}
    convert => {"GK_Positioning" => "integer"}
    convert => {"GK_Diving" => "integer"}
    convert => {"GK_Kicking" => "integer"}
    convert => {"GK_Handling" => "integer"}
    convert => {"GK_Reflexes" => "integer"}
  }
}

Note that the mutate block still lives inside the filter section.

So, that part is finished. Now you have a configuration with an input and a filter, but we still need to define the output. If we don’t, Logstash will fall through to the next configuration file in line and output the data in whatever way is described there, which is not what we want.

At the end of our configuration file, add the following lines:

output {
  elasticsearch {
    action => "index"
    hosts => ["localhost"]
    manage_template => false
    index => "fifaplayers"
    document_type => "%{[@metadata][type]}"
    user => "elastic"
    password => "changeme"
  }
}

Here, we use the elasticsearch output plugin. We tell it to index the items on localhost (as Elasticsearch runs in the same Docker container). A template is not required, so manage_template is set to false. The index is named fifaplayers, which makes the data easy to find from Kibana.

Lastly, the default elastic username and password are given.
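Before moving on, it doesn’t hurt to confirm that Elasticsearch is reachable from inside the container with exactly these credentials (assuming it listens on its default port, 9200):

curl -u elastic:changeme http://localhost:9200/
# should return a small JSON blob with the cluster name and version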

The final file will look like this:

input {
  file {
    path => "/var/csv/FullData.csv"
    start_position => "beginning"
  }
}
filter {
  csv {
    columns => ["Name","Nationality","National_Position","National_Kit","Club","Club_Position","Club_Kit","Club_Joining","Contract_Expiry","Rating","Height","Weight","Preffered_Foot","Birth_Date","Age","Preffered_Position","Work_Rate","Weak_foot","Skill_Moves","Ball_Control","Dribbling","Marking","Sliding_Tackle","Standing_Tackle","Aggression","Reactions","Attacking_Position","Interceptions","Vision","Composure","Crossing","Short_Pass","Long_Pass","Acceleration","Speed","Stamina","Strength","Balance","Agility","Jumping","Heading","Shot_Power","Finishing","Long_Shots","Curve","Freekick_Accuracy","Penalties","Volleys","GK_Positioning","GK_Diving","GK_Kicking","GK_Handling","GK_Reflexes"]
    separator => ","
  }
  mutate {
    convert => {"Contract_Expiry" => "float"}
    convert => {"Rating" => "integer"}
    convert => {"Height" => "integer"}
    convert => {"Weight" => "integer"}
    convert => {"Age" => "integer"}
    convert => {"Weak_foot" => "integer"}
    convert => {"Skill_Moves" => "integer"}
    convert => {"Ball_Control" => "integer"}
    convert => {"Dribbling" => "integer"}
    convert => {"Marking" => "integer"}
    convert => {"Sliding_Tackle" => "integer"}
    convert => {"Standing_Tackle" => "integer"}
    convert => {"Aggression" => "integer"}
    convert => {"Reactions" => "integer"}
    convert => {"Attacking_Position" => "integer"}
    convert => {"Interceptions" => "integer"}
    convert => {"Vision" => "integer"}
    convert => {"Composure" => "integer"}
    convert => {"Crossing" => "integer"}
    convert => {"Short_Pass" => "integer"}
    convert => {"Long_Pass" => "integer"}
    convert => {"Acceleration" => "integer"}
    convert => {"Speed" => "integer"}
    convert => {"Stamina" => "integer"}
    convert => {"Strength" => "integer"}
    convert => {"Balance" => "integer"}
    convert => {"Agility" => "integer"}
    convert => {"Jumping" => "integer"}
    convert => {"Heading" => "integer"}
    convert => {"Shot_Power" => "integer"}
    convert => {"Finishing" => "integer"}
    convert => {"Long_Shots" => "integer"}
    convert => {"Curve" => "integer"}
    convert => {"Freekick_Accuracy" => "integer"}
    convert => {"Penalties" => "integer"}
    convert => {"Volleys" => "integer"}
    convert => {"GK_Positioning" => "integer"}
    convert => {"GK_Diving" => "integer"}
    convert => {"GK_Kicking" => "integer"}
    convert => {"GK_Handling" => "integer"}
    convert => {"GK_Reflexes" => "integer"}
  }
}
output {
  elasticsearch {
    action => "index"
    hosts => ["localhost"]
    manage_template => false
    index => "fifaplayers"
    document_type => "%{[@metadata][type]}"
    user => "elastic"
    password => "changeme"
  }
}

Save the file.
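Before restarting the service, you can let Logstash itself check the configuration for syntax errors. A sketch, assuming the binary sits in its usual location inside the container (adjust the path if your image differs):

/usr/share/logstash/bin/logstash --path.settings /etc/logstash --config.test_and_exit -f /etc/logstash/conf.d/01-fifaplayers.conf
# reports whether the configuration is valid and then exits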


Getting the data into Elasticsearch

Restart Logstash using the service logstash restart command. Your system should slowly start making a bit more noise, depending on the amount of available memory.

To be sure that there are no mistakes in the configuration file we wrote, keep an eye on the logs from Logstash itself. You can easily use tail to do so, but keep in mind that the terminal in which you issue this command will be occupied by it.

tail -f /var/log/logstash/logstash-plain.log

If there is an error in the configuration file, something like this might pop up: Cannot create pipeline {:reason=>"Expected one of #, {, } at line 61, column 15 (byte 2552) after output {\n elasticsearch {\n action => \"index\"\n hosts => \"[\""}. Like most stack traces, it indicates the location of the error.

The elasticsearch logs might also be of interest. These can be found in /var/log/elasticsearch/.

Congratulations, you’ve done the least fun part. The data should now be flowing into Elasticsearch, making it available to Kibana.
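To verify that documents are really arriving, you can ask Elasticsearch for the document count of the new index; it should climb towards roughly 17k as Logstash works through the file:

curl -u elastic:changeme 'http://localhost:9200/fifaplayers/_count?pretty'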


Visualizing the dataset in Kibana

Browse to http://localhost:5601/ and login with elastic / changeme.

In the menu, choose Management and click Create Index Pattern at the top left of the page. Fill out the form as in the following image, using fifaplayers as the index pattern name.

Creating the index pattern

Using the visualization tool, you can now create visualizations on this data. Feeling uninspired? Try recreating the dashboard I made.

Inspirational Dashboard for the Fifa Players dataset

Thanks!

Did you finish the entire setup? Give a quick clap and let me know what your further plans are; that way I know whether I should make more of these!

Matthijs Mali

Written by

UX Consultant with love for data and technology.
