User Monitoring with the ELK Stack

Matthijs Mali
Nov 4, 2017 · 13 min read

My name is Matthijs Mali, UX Consultant. I have experience in both design and technology. Frustrations with Google Analytics made me dive into the ELK stack. I hope this guide is helpful. If you have any questions, feel free to reach out to me.

If you know what a terminal is, know what the ls command does, and have seen (even a little) software development, you are the target audience :)

Time to finish tutorial: approx. 1.5 to 2 hours.

Getting started

This tutorial gets you started with gathering user metrics from your application into Kibana. In a few hours it will cover the basic installation of the ELK stack in a Docker container, the configuration of Elasticsearch, Logstash and Kibana, and the capturing of log entries from a log file.

The final situation will be as in the following diagram. Think of the Application Container as the application server currently hosting the application you work on, and of the ELKX Container as a new service that could be running within your organization. Whenever you are ready, let’s get the preparations done.

Diagram of the situation we will produce

Preparations

Install Docker

Choose your platform on https://www.docker.com/community-edition and download Docker.

This manual does not cover Docker installation instructions. If you cannot get it to work, refer to the Docker community.

When using the ELK stack, your Docker environment needs enough memory. Typically 3 GB should work, but to be on the safe side we are going to require 4 GB. To get everything running properly on Linux and Windows, some additional settings have to be made as well.
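To see how much memory your Docker daemon actually has available, you can query it directly. A sketch, assuming a Docker version whose info template exposes the MemTotal field:

```shell
# Total memory available to the Docker daemon, in bytes
docker info --format '{{.MemTotal}}'
```

If the number is below roughly 4294967296 (4 GB), raise the limit in your Docker settings before continuing.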

Linux users

Linux users, please see: https://docs.docker.com/engine/installation/linux/linux-postinstall/#manage-docker-as-a-non-root-user

Also execute the following command:

sudo sysctl vm.max_map_count

If this returns < 262144 then please execute:

sudo sysctl -w vm.max_map_count=262144
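This raises the limit only until the next reboot. To make it permanent, you can append the setting to /etc/sysctl.conf; treat the exact path as an assumption, since some distros prefer a drop-in file under /etc/sysctl.d/ instead:

```shell
# Persist the setting across reboots (path assumed; some distros
# use a drop-in file under /etc/sysctl.d/ instead)
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```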

You can now run the installation.

Windows users

If you are on Windows and Docker complains about Hyper-V / virtualization, check whether virtualization is enabled. Open the “Turn Windows features on or off” panel and deselect the Hyper-V checkbox. Restart. Open the panel again and check the Hyper-V checkbox. Restart again. Everything should be working now.

Open the Docker settings panel and look for the advanced settings. Be sure to raise the amount of memory that Docker can use to more than 3 GB. If you keep it at the default 2 GB, Elasticsearch will crash and this tutorial cannot be finished.

When you are done installing and configuring Docker, you can advance to the next step.

Get the ELKX image

Elasticsearch, Logstash and Kibana can be quite a pain to install. Fortunately there are well-prepared Docker images that get you up and running in no time. There are many ELK images available; for this article, we will be using the ELKX Docker image created by Sébastien Pujadas.

sebp/elkx contains Elasticsearch, Logstash, Kibana and X-pack. sebp/elk does not contain X-pack.

Open up a terminal and type the following:

Note that sudo is not included in these commands. Conclude for yourself whether you need it :)

docker pull sebp/elkx

The docker image is now being pulled to your system.


Image & Container configuration

Creating a new Docker Image

For our setup, we are going to need a separate container. This container will pretend to be our application, so we can learn how to connect an application to our ELKX instance.

Download the following file in a new clean directory of your choice.

com.alten.workshop-1.0-SNAPSHOT.jar ( Right click > Save as )

Thanks to Joel Witteveen for creating this java tool that creates random logging and setting up the dockerfile below.

Next, create a new text file named Dockerfile in the same directory and give it the following contents:

FROM debian:jessie

# add webupd8 repository
RUN \
  echo "===> add webupd8 repository..." && \
  echo "deb http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" | tee /etc/apt/sources.list.d/webupd8team-java.list && \
  echo "deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" | tee -a /etc/apt/sources.list.d/webupd8team-java.list && \
  apt-key adv --keyserver keyserver.ubuntu.com --recv-keys EEA14886 && \
  apt-get update && \
  \
  echo "===> install Java" && \
  echo debconf shared/accepted-oracle-license-v1-1 select true | debconf-set-selections && \
  echo debconf shared/accepted-oracle-license-v1-1 seen true | debconf-set-selections && \
  DEBIAN_FRONTEND=noninteractive apt-get install -y --force-yes oracle-java8-installer oracle-java8-set-default && \
  \
  echo "===> clean up..." && \
  rm -rf /var/cache/oracle-jdk8-installer && \
  apt-get clean && \
  rm -rf /var/lib/apt/lists/*

RUN apt-get update
RUN apt-get install curl -y
RUN apt-get install net-tools -y

# filebeat
RUN \
  curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.6.3-amd64.deb && \
  dpkg -i filebeat-5.6.3-amd64.deb

WORKDIR /app
ADD . /app
CMD ["java", "-jar", "com.alten.workshop-1.0-SNAPSHOT.jar"]

What is this? See the Docker documentation for more information on Dockerfiles, or check out some examples.

The directory should now look like this:

randomlogger-image-files
- com.alten.workshop-1.0-SNAPSHOT.jar
- Dockerfile

From that directory type:

docker build -t randomlogger .

Docker docs on the build command

The -t flag is shorthand for --tag and allows you to name the image we are creating. In this case, the name will be “randomlogger”, but feel free to use anything you want.

Docker will now build the image and go through the steps defined in the Dockerfile you just created. First it updates the repositories, installs Java, performs a little cleanup and then installs some tools. Note that it also installs Filebeat, a tool that watches log files and sends them to the ELK stack.

Awesome, you just created your (first?) docker image!

You can run a new container based on our image with the command:

docker run -d --name randomlogger randomlogger tail -f /dev/null

Docker docs on the run command

The -d flag is shorthand for --detach and means the container will run in the background.

The --name flag allows you to name the container, so it is easily identifiable.

The second randomlogger string is the name of the image we are using to create our container.

Lastly, the tail -f /dev/null ensures that the container keeps running. If you issue the command without that last part, the container will stop almost immediately after you press enter.

Connecting to the Randomlogger container

The randomlogger container is running in the background, so from the same terminal we can connect into that container.

docker exec -it randomlogger /bin/bash -l

Docker docs on the exec command

You now have a terminal within the docker container, from where we have to figure out the current IP address. Type:

ifconfig

And the interface configuration should pop up. The first IP address listed is the IP address of this Docker container. It might be a good idea to write it down.
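If ifconfig feels clumsy, you can also ask the Docker daemon directly. A sketch, run from the host (not from inside the container):

```shell
# Print the container's IP address on each of its networks
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' randomlogger
```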


Running the ELKX container

Let’s run the ELKX image you pulled from the Docker repository.

Open up a new terminal and type the following command:

docker run -p 5601:5601 -p 9200:9200 -p 5044:5044 -p 9300:9300 -it --name elkx sebp/elkx

Docker docs on the run command

The -p flag is shorthand for --publish and publishes the container’s ports to the host.

The terminal will respond with messages that things are starting up, waiting for Elasticsearch to come online. After a few seconds it should start, and log messages from Elasticsearch, Logstash and Kibana should become visible.

The ELKX image is configured in such a way that starting it will always show these messages. You can leave this terminal if you want, or keep it open for later log reference. Don’t worry. The ELK stack will keep running when you quit the terminal.

Connecting to the ELKX container

Just like the Randomlogger, we are going to get the IP from the ELKX container. First, exec into the container:

docker exec -it elkx /bin/bash -l

Now, run the interface configuration again.

ifconfig

If ifconfig does not work, install it first via the apt-get install net-tools command.

Write down the IP for the ELKX container.


Configuring Filebeat

Now that we have two containers running, we can start the filebeat configuration.

Open up a terminal on the Randomlogger container and install VIM.

Docker containers come very bare-bones: only a few tools are installed. For the purpose of learning, Vim is not available yet in this image. If you look at the Dockerfile again, you’ll notice that we install net-tools and curl; you could add vim there as well.

apt-get install vim

When Vim is installed, run the following command to edit the Filebeat configuration:

vim /etc/filebeat/filebeat.yml

Vim can be quite a pain when using it for the first time. Please take a few minutes to read through this guide.

This file is heavily commented. I recommend leaving the comments in, as they give you a good guideline on the basic settings that can be made.

In the Prospectors part, add the following lines:

# More lines above...
filebeat.prospectors:
# Here are some comments
- input_type: log
  paths:
    - /tmp/testlogfiles/request/*
  fields: {log_type: request}
  # - /var/log/*.log
# More lines below here...

In the Outputs part, place a comment pound (#) in front of the elasticsearch output, uncomment the logstash output and replace the hostname with the IP of the ELKX container that you wrote down. It should look something like this:

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  # hosts: ["localhost:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["IP_ADDRESS_OF_THE_ELKX_CONTAINER:5044"]

Save the document and quit using the VIM command:

:wq <enter>
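Before starting Filebeat, you can have it check the file you just edited. A sketch, assuming the -configtest flag of Filebeat 5.x (run inside the randomlogger container):

```shell
# Validate /etc/filebeat/filebeat.yml without actually shipping anything
filebeat.sh -configtest -e
```

If the configuration is valid, it exits quietly; a YAML mistake is reported with the offending line.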

Configuring Logstash

Now we can configure Logstash in the ELKX container. The configuration allows Filebeat to send log entries to Logstash, and we are going to tell Logstash how to transform the data it receives.

Logstash basically does the following:

Open up a terminal and exec into the elkx container. Navigate to the /etc/logstash/conf.d/ directory and type ls to display the files in there. Note that these files are used as a pipeline by Logstash, in alphabetical order.

Firstly, edit the 02-beats-input.conf file by opening it in Vim. Set SSL to false and save the file.

Now create a new file with the name 12-randomlog.conf and open it with VIM.

vim 12-randomlog.conf

Put the following configuration in this file:

filter {
  if ([fields][log_type] == "request") {
    grok {
      match => { "message" => "%{IP:client} \[%{TIMESTAMP_ISO8601:timestamp}\] - %{WORD:username} - %{URIPATHPARAM:request} %{WORD:method} %{NUMBER:response}" }
    }
  }
}

Copy-pasting this? Make sure the quotations are correct. They should not be curly. Unsure? Type the configuration yourself.
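To see what this pattern expects, here is a hypothetical log line in the matching shape, together with a rough regex translation of the grok pattern as a sanity check. The sample values are made up; the real randomlogger output may differ in detail:

```shell
# A made-up line in the shape the grok pattern above describes:
# IP [ISO8601 timestamp] - username - /uri METHOD status
line='10.0.2.15 [2017-11-04T12:34:56] - Sydney - /api/clients/all GET 200'

# Rough extended-regex equivalent of the grok pattern, to check the field order
echo "$line" | grep -Eq '^[0-9.]+ \[[0-9T:.-]+\] - [A-Za-z]+ - /[^ ]+ [A-Z]+ [0-9]+$' \
  && echo "line matches the expected shape"
```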

Save the file :wq and restart the logstash service by issuing the service restart command:

service logstash restart

Next, we tail the Logstash logfile, to make sure everything is running correctly when the stream from Filebeat is started.
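A sketch of that tail, assuming the default Logstash 5.x log location in this image (check /var/log/logstash/ if the file name differs):

```shell
# Follow the Logstash log; watch for pipeline start-up messages and grok errors
tail -f /var/log/logstash/logstash-plain.log
```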


Start logging! 🌳

Now, we can start logging. From the Randomlogger container, run filebeat.sh in debug mode.

filebeat.sh -e

Open one more terminal in the Randomlogger container and run the randomlogger.

java -jar com.alten.workshop-1.0-SNAPSHOT.jar

When you look at the filebeat debug terminal, it will show that it found some new files and is trying to send them to Logstash.

Are you seeing connection errors? Be sure that you went through all the steps above. Check if SSL is off and the IP is correct.


Exploring Data 📡

So, now we have two containers running and some random log files being generated. These logfiles are monitored by Filebeat and pushed into Logstash, where the Grok filter you made transforms a message of the type request into something that Elasticsearch can index.

Now we can hop into Kibana and start visualizing this data.

Logging into Kibana

Open a browser and visit http://localhost:5601.

Seeing a big red box below the login screen? Make sure that Elasticsearch is running. Unfortunately, you cannot continue before that works. Visit the ELK community if the problem persists.

Login with elastic / changeme to access the interface.

Changing the login details is recommended, but for the ease of this tutorial we keep using the elastic/changeme combination.
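If you do want to change it, X-Pack 5.x exposes a change-password API. A sketch (the endpoint comes from the X-Pack security API; the new password is of course up to you):

```shell
# Change the password of the built-in elastic user via the X-Pack API
curl -XPUT 'http://localhost:9200/_xpack/security/user/elastic/_password' \
  -H 'Content-Type: application/json' \
  -u elastic:changeme \
  -d '{ "password": "your-new-password" }'
```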

Creating an Index Pattern

As soon as we are logged in, we have to define an index pattern. This index pattern roughly says: take every index whose name starts with filebeat- and include everything you find after that word; * is a wildcard. Then select the right timestamp field. This is what will show up in our dashboard as the time filter. Selecting the wrong timestamp makes it impossible to filter data in the right way, and when you make this mistake you will soon enough figure out what is going on.

Choosing the Index Pattern

Click the Create button that appears. Kibana loads a page with an overview of the index patterns. This was the first one ever. So it’s still quite empty. For now, click the Discover menu item on the top left side.


Data Discovery 🔎

Oh yeah! You just connected a logfile, through Filebeat, Logstash and Elasticsearch, to the Kibana front-end. Congrats! You will see a screen which roughly looks like the one below, without those awful red boxes of course. Let me briefly explain:

Screen overview of the Discover Page

A. Simple toolbar
Took me a while to find on some occasions, but you’ll find the save features here, also for other pages. Another important thing here is the date selector. Sometimes, when I thought no data was coming into Elasticsearch, it was because the date selector was on the wrong setting.

B. Query box
Handy when searching for specific output. In this case, I searched for username, a field defined in the Logstash filter, with the value Sydney. When you press enter, it searches and displays the results. Searching for strings with spaces can be done using quotation marks, and you can combine queries. Like so: username:"Sydney Smith" method:"DELETE"
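A few more query shapes that work in this Lucene-style query box. The field names are the ones our grok filter produces; the example values are made up:

```
username:Sydney AND method:DELETE
response:[400 TO 599]
request:"/api/clients/all"
```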

C. Fields overview
This shows all the fields available in this index pattern. Note that the selected index pattern (filebeat-*) is in the dark gray area. Click a field to get a quick overview of the distribution. Try clicking the field method.

D. Timeline
This shows the occurrences over the selected time span. Simply click and drag to zoom in on a specific area. Note that the date range in the toolbar also changes.

E. Logline viewer
This area shows the individual loglines. Click on the small arrow left of the date to expand this line.

That’s about it for the discover page.

Now feel free to browse around on this page for a little while, and try to click through all of the described page components. Something to look for: the moment when I (Matthijs Mali; I am also in these random logs, with the username Mali) accessed the request “/api/clients/all”. Good luck :)


Creating visualizations 📊

Let’s create some visualizations that we can use on a dashboard. Imagine we want to see how many unique users we have and see who the most active users are within a certain point in time.

Click the Visualize menu item. A new screen tells us that there are no visualizations yet. Press the button to create one.

Top 10 Active Users

For our first visualization, let’s make a simple Horizontal bar. Choose the filebeat-* index to work from.

Most of the stuff we do here, I figured out by fiddling around with the visualization tool. Feel free to follow the steps, but keep in mind to try a lot and see what happens. No destructive action on your data is possible, so you can always start over by creating a new visualization and removing the old ones.

Press the button labeled X-Axis, then choose Terms in the Aggregation dropdown. In the field that appears, choose username.keyword. Now press the blue/white play button at the top of this panel. The graph will change and show the 5 most active users. Cool huh!?

Adjust the size textbox, and set it to 10. Press the play button again (or press enter, when still in the textbox). The result should be something like this:

Creating the Top 10 Active Users chart

Now, at the top of the screen, choose Save. Give your visualization a name (“Top 10 Active Users”) and click the Save button that appears.

Now click the Visualize menu item again. The Top 10 Active Users visualization is now visible. Click the plus button to add another one.

Unique Users

Choose Metric and pick the filebeat-* index pattern again. A big number is shown. Let’s change this into something more interesting, as it now just shows the total number of log entries in the selected time range.

Click the small arrow next to Metric. In the Aggregation dropdown, choose Unique Count, and in the Field dropdown choose username.keyword. Press the play button. Bam, there we have a count of the number of unique users.

Go through the same save principle as before.


Creating a dashboard

Now that we have some visualizations, let’s create a dashboard to show them on. This way we can interact with all graphs at the same time.

Go to the Dashboard menu item and click the Create a dashboard button. A text shows you that the dashboard is still empty. Click the Add button at the top of the screen (next to the time filter). A box is shown with your visualizations.

Click them both to see them appear on the dashboard. Click the add button again to close the list of visualizations.

In the toolbar, choose Save. Pick a name for the dashboard (I chose “Users”) and save it.

Now you can play a bit with the time filter and see the visualizations change. Or, even better, click a bar next to a username and see the big number change.

Also notice that when you click a user, it appears in the filter bar. Like so:

The Filter bar while searching for Nicolas

Hover your mouse over the filter to see some options, like temporarily disabling this filter or saving it.


Have some fun

So, now that you have got some of the Kibana basics down, why not try to recreate the following dashboard? Good luck.

A sample dashboard to recreate

Congratulations 🎉

You finished the main part of this tutorial. I’m currently working on expanding it a bit, teaching you how to put a large CSV file into Logstash, which can be very useful for exploring large datasets. You can already peek at the new page, but keep in mind that it is not yet finished.

Thanks!

Did you finish the entire setup? Do a quick clap and let me know what your further plans are. This way I know whether I should do more of these!

Matthijs Mali

Written by

UX Consultant with love for data and technology.
