Python: Async Logging to an ELK Stack

Mike Taylor
Jun 27, 2020

Ever wanted to set up an ELK stack and get Python logging flowing into it with minimal fuss? Well, this is the article for you. We’ll be using a dockerized ELK stack, so if you want to follow along, make sure you install Docker and docker-compose. We’ll be using a Jupyter notebook to do our logging testing. I assume you’re familiar with installing and using Python virtual environments, but if you’re not, please feel free to brush up on the documentation!

Overview

  1. Make sure docker and docker-compose are installed
  2. Set up a virtual environment
  3. Install the dependencies
  4. Configure Logstash
  5. Start your ELK stack
  6. Start up your jupyter notebook
  7. Set up your logger and send a log entry or two
  8. Set up your logging indices
  9. Check out your logs in your ELK stack!

Install docker and docker-compose

We aren’t going to walk through this, but the official documentation covers it: https://docs.docker.com/get-docker/ for Docker and https://docs.docker.com/compose/install/ for docker-compose.

Set up a virtual environment

Just to make sure we’re not going crazy and installing things that you don’t want to have sitting around your system forever, we’ll start by setting up a virtual environment.

mkdir python-logging-elk-stack
cd .\python-logging-elk-stack
python -m venv venv

Once you have your virtual environment created, activate it:

PS C:\Users\bubth\Development\medium\python-logging-elk-stack> .\venv\Scripts\Activate.ps1
(venv) PS C:\Users\bubth\Development\medium\python-logging-elk-stack>
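If you’re on macOS or Linux rather than Windows, the equivalent activation command is:

source venv/bin/activate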

Install the dependencies

With your virtual environment activated, you can install two libraries — one is required, and one is just for this tutorial. For our logging, we’re going to stick with newer features and use an async logging handler called python-logstash-async.

pip install python-logstash-async

That’s the only *true* dependency for the tutorial, but for those of you who aren’t familiar with Jupyter Lab, that’s what we’re going to use to test our ELK setup.

pip install jupyterlab

Last but not least, we’re going to clone the docker-elk repository that we’ll be using for our dockerized ELK deployment. If you already have an existing ELK stack, or want to set one up manually, feel free to reference the ELK stack documentation, and you can add a new pipeline just as easily. Here’s the clone command!

git clone https://github.com/deviantony/docker-elk.git
Cloning into 'docker-elk'...
remote: Enumerating objects: 1768, done.
Receiving objects: 100% (1768/1768), 420.15 KiB | 1.38 MiB/s, done.
Resolving deltas: 100% (720/720), done.

If you’re following line for line, you should have a virtual environment directory, and a docker-elk directory:

(venv) PS C:\Users\bubth\Development\medium\python-logging-elk-stack> ls

    Directory: C:\Users\bubth\Development\medium\python-logging-elk-stack

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
d-----        6/26/2020  11:06 PM                docker-elk
d-----        6/26/2020  10:57 PM                venv

Configure Logstash

Open up the docker-elk\logstash\pipeline\logstash.conf file that’s in your new docker-elk folder in your favorite editor and let’s take a look at that pipeline configuration:

input {
    tcp {
        port => 5000
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
        user => "elastic"
        password => "changeme"
    }
}

We’re going to make one small change here and make sure the input codec used is json so that it parses the logs properly:

input {
    tcp {
        port => 5000
        codec => json
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
        user => "elastic"
        password => "changeme"
    }
}

Please change your password in production!
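A quick note on why the codec matters: python-logstash-async ships each log record to Logstash as a single JSON document, roughly like this (illustrative only; the exact fields come from the library’s default formatter):

{"@timestamp": "2020-06-27T05:12:43.371Z", "@version": "1", "message": "This is an info message at 1593230000.0", "level": "INFO", "logger_name": "python-logstash-logger", "host": "my-machine"}

Without codec => json, Logstash would store that whole document as one opaque message string; with it, every key becomes its own searchable field in Elasticsearch.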

Start your ELK stack

Now that we have our configuration set up to parse our Python logs properly, let’s start our ELK stack with the docker-compose command!

cd .\docker-elk\
docker-compose up

This may take a minute or two, depending on your system and how you have Docker configured. Once it’s up and running you should be able to log in to the ELK stack at http://localhost:5601/ and be greeted with a login page.

The default login, if you haven’t changed it, is elastic for the username and changeme for the password. Congratulations: you have an ELK stack to play with!
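If you’d like to sanity-check Elasticsearch itself from Python before sending any logs, here’s a minimal sketch using only the standard library (it assumes the stack’s default elastic/changeme credentials):

import base64
import urllib.request

# Query the cluster health endpoint of the dockerized Elasticsearch
req = urllib.request.Request("http://localhost:9200/_cluster/health")
credentials = base64.b64encode(b"elastic:changeme").decode()
req.add_header("Authorization", "Basic " + credentials)

with urllib.request.urlopen(req) as response:
    # Prints a JSON blob that includes the cluster status (green/yellow/red)
    print(response.read().decode())

A yellow status is perfectly normal for a single-node Docker setup.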

Start up your jupyter notebook

Back in python land, we’ll start up our Jupyter notebook with the jupyter lab command. That should open up to http://localhost:8888/lab, where you’ll be presented with some fun options.

If you haven’t installed fancy things, you should have the option to start a Notebook with a Python 3 kernel.

If you’re still using Python 2 for this tutorial, shame on you: start over and install Python 3, because you need to get your shit together.

Set up your logger and send a log entry or two

The ELK stack won’t have any “indices” for us to configure for logging, so we have to start by creating some data. Our first cell will be the following code:

import logging
from logstash_async.handler import AsynchronousLogstashHandler

host = 'localhost'
port = 5000

# Get you a test logger
test_logger = logging.getLogger('python-logstash-logger')
# Set it to whatever level you want - a fresh logger inherits from the
# root logger, whose default level is WARNING
test_logger.setLevel(logging.DEBUG)
# Create a handler for it - database_path=None disables the SQLite cache
# the handler would otherwise use to buffer events on disk
async_handler = AsynchronousLogstashHandler(host, port, database_path=None)
# Add the handler to the logger
test_logger.addHandler(async_handler)

Yes, you can set up loggers with dict configs, file configs, etc — if you’re smart enough to know what those are, you should be able to follow the logging file config setup example that’s provided by the author of the python-logstash-async package.
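As a quick illustration (a sketch based on the arguments we passed above, not copied from the package docs), the same logger could be wired up with a dict config:

import logging
import logging.config

LOGGING = {
    'version': 1,
    'handlers': {
        'logstash': {
            # Same handler class we instantiated by hand above
            'class': 'logstash_async.handler.AsynchronousLogstashHandler',
            'host': 'localhost',
            'port': 5000,
            'database_path': None,
        },
    },
    'loggers': {
        'python-logstash-logger': {
            'handlers': ['logstash'],
            'level': 'DEBUG',
        },
    },
}

logging.config.dictConfig(LOGGING)
test_logger = logging.getLogger('python-logstash-logger')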

In your second cell, you’ll enter the following code to create some log entries:

import time

while True:
    test_logger.info("This is an info message at %s", time.time())
    time.sleep(0.5)

These two “cells” can be executed by holding shift and pressing enter, or by pressing the run button at the top of the screen.
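One optional trick before moving on: the handler’s formatter also picks up the standard extra argument on log calls, so you can attach structured fields to an event and search on them in Kibana later. The field names here are purely hypothetical:

# Hypothetical fields - each key travels with the event to Logstash
test_logger.info(
    "Order processed at %s", time.time(),
    extra={'order_id': 1234, 'duration_ms': 87},
)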

Set up your logging indices

Now that you’re generating some data for your Logstash instance to hand off to Elasticsearch, you’ll have an index to configure! Back in your http://localhost:5601/ tab, you’ll see a little settings icon on the bottom left to get into the management configs.


In the management menu, the Kibana section has an Index Patterns option where you can create a new index pattern. Index patterns are the sets of data that Kibana will use as its source data.

If you had navigated here while playing around with your ELK stack before we started logging to it, there would have been nothing in this view. But now that we’ve populated Logstash with some logs, you’ll see a logstash-<date> index. Since we want all of those Logstash indices in our logs, we’ll use the wildcard index pattern logstash* to match anything Logstash sends our way.

Once you hit next, you’ll be given the option to set a time filter. You definitely want this, since it’s how you can filter for events before and after given timeframes. Set @timestamp as the time filter field and then create the index.

Now that you have a logstash index, you can edit the fields that are available and add new ones to the log lines — for now, leave these things alone and move on.

Navigate to the logs section by clicking the logs icon on the left hand side, and then click the settings tab at the top — that will bring you to the logging configuration section.

In the settings configuration you’ll want to add a logging configuration. We’re going to name ours python-logs, and we want it to include all of our Logstash logs, so we’ll use the logstash* index match again. The recommended value is a great idea if you want to set up Python logging to files and use Filebeat to ship those files, but since we want to get up and running and log over TCP, we’ll keep our shit simple.

NOTE: This is probably not the right option for a larger setup unless you know the advantages and disadvantages of TCP logging, but it will get you started with logging and monitoring for a while until you can hire a devops guy to worry about all that.

Once we’ve defined our log indices and name, we can add columns to our logs if we want. Don’t remove event.dataset, because doing so will stop you from playing around with some of the neat machine learning options for analytics later on.

Click the apply button and you’re ready to go!

Check out your logs in your ELK stack!

Navigate back to the stream view in the logs section, and you should see the logs from your Python loop!

Thanks for reading! If you want more details on other ways to configure things (like file configs and Filebeat), leave a comment and I’ll take some time to put together another article for it!
