Nginx log monitoring using the ELK stack & Docker.

Hello there, in this short article I’ll show a common way of monitoring Nginx logs with the ELK stack (Elasticsearch, Logstash, and Kibana). This tutorial will be useful for small and medium web projects.

In my opinion, you can never have enough logs if you really want deep control over your projects. Collecting and analyzing Nginx logs gives you additional flexibility and shortens your reaction time.

Sometimes I come across the opinion that setting up and maintaining an ELK stack is too complicated for beginners and people without any ops or sysadmin experience. In fact, if you don’t need a big distributed Elasticsearch cluster and only want to collect logs from a few Nginx servers, the task becomes much easier. Docker makes it easier still.

For this example we need two servers: a target server running Nginx, from which we’ll collect logs, and a second server for the ELK installation.

The ELK server specification will depend on the volume of logs you want to gather; for small to medium web servers, you can use a VM with:

  • 1–2 CPU cores
  • 2–3 GB of RAM
  • 30–50 GB of disk space (also depends on how long you keep logs)

You also need to install Docker & Docker Compose on the ELK server first; please use the official documentation for this.
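As a quick sketch for Ubuntu/Debian (using Docker’s convenience script and a pinned Compose release; check the official docs and the releases page for the currently recommended way and version):

elk~# curl -fsSL https://get.docker.com | sh
elk~# curl -L "https://github.com/docker/compose/releases/download/1.23.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
elk~# chmod +x /usr/local/bin/docker-compose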

Once Docker & Docker Compose are installed, we can start creating the ELK installation and a local Nginx with a basic authentication scheme:

elk~# apt-get update && apt-get install nginx
elk~# mkdir /home/docker-elk && cd /home/docker-elk

Now we need to create directories for Elasticsearch, Logstash, and Kibana, with their configs and Dockerfiles:

elk:/home/docker-elk# mkdir -p elasticsearch/config
elk:/home/docker-elk# vi ./elasticsearch/config/elasticsearch.yml
---
# Default Elasticsearch configuration.
cluster.name: "docker-cluster"
network.host: 0.0.0.0
# minimum_master_nodes, set to 1 to allow single node clusters
# Details: https://github.com/elastic/elasticsearch/pull/17288
discovery.zen.minimum_master_nodes: 1
# Use single node discovery in order to disable production mode
# and avoid bootstrap checks
discovery.type: single-node
elk:/home/docker-elk# vi ./elasticsearch/Dockerfile
# https://github.com/elastic/elasticsearch-docker
FROM docker.elastic.co/elasticsearch/elasticsearch-oss:6.6.0
elk:/home/docker-elk# mkdir -p logstash/{config,pipeline}
elk:/home/docker-elk# vi ./logstash/config/logstash.yml
---
## Default Logstash configuration.
http.host: "0.0.0.0"
path.config: /usr/share/logstash/pipeline
elk:/home/docker-elk# vi ./logstash/Dockerfile
# https://github.com/elastic/logstash-docker
FROM docker.elastic.co/logstash/logstash-oss:6.6.0
elk:/home/docker-elk# mkdir -p kibana/config
elk:/home/docker-elk# vi ./kibana/config/kibana.yml
---
## Default Kibana configuration from kibana-docker.
server.name: kibana
server.host: "0"
elasticsearch.url: http://elasticsearch:9200
elk:/home/docker-elk# vi ./kibana/Dockerfile
# https://github.com/elastic/kibana-docker
FROM docker.elastic.co/kibana/kibana-oss:6.6.0

After that, your docker-elk directory should look like this:

/elasticsearch
|__ Dockerfile
|__ /config
|   |__ elasticsearch.yml
/kibana
|__ Dockerfile
|__ /config
|   |__ kibana.yml
/logstash
|__ Dockerfile
|__ /config
|   |__ logstash.yml
|__ /pipeline

Also, as we run our ELK stack in containers, we need a persistent volume to store the Elasticsearch data, so let’s create it now:

elk:/home/docker-elk# docker volume create elasticsearch
elk:/home/docker-elk# docker volume list
DRIVER              VOLUME NAME
local               elasticsearch
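You can also inspect the volume to see where Docker keeps its data on the host (with the local driver the Mountpoint will be somewhere under /var/lib/docker/volumes):

elk:/home/docker-elk# docker volume inspect elasticsearch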

Now it’s time to create a docker-compose file for running all our containers together:

elk:/home/docker-elk# vi ./docker-compose.yml
version: '2'

services:

  elasticsearch:
    build:
      context: elasticsearch/
    volumes:
      - ./elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro
      - elasticsearch:/usr/share/elasticsearch/data/:rw
    ports:
      - "127.0.0.1:9200:9200"
      - "127.0.0.1:9300:9300"
    environment:
      ES_JAVA_OPTS: "-Xmx1g -Xms1g"
    networks:
      - elk

  logstash:
    build:
      context: logstash/
    volumes:
      - ./logstash/config/logstash.yml:/usr/share/logstash/config/logstash.yml:ro
      - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
      - /home/proxy_logs:/home/proxy_logs:ro
    ports:
      - "1025:1025/udp"
      - "127.0.0.1:5000:5000"
      - "127.0.0.1:9600:9600"
    environment:
      LS_JAVA_OPTS: "-Xmx1g -Xms1g"
    networks:
      - elk
    depends_on:
      - elasticsearch

  kibana:
    build:
      context: kibana/
    volumes:
      - ./kibana/config/:/usr/share/kibana/config:ro
    ports:
      - "127.0.0.1:5601:5601"
    networks:
      - elk
    depends_on:
      - elasticsearch

volumes:
  elasticsearch:
    driver: local

networks:
  elk:
    driver: bridge

A few words about this compose file: you don’t need to expose all ports on a public IP, so don’t forget to put 127.0.0.1 in the ports section. In my config, only port 1025 from Logstash is open on the public IP, for collecting incoming data over UDP.
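Once the containers are up (we’ll start the stack in a moment), you can double-check which ports listen on which interfaces:

elk:/home/docker-elk# ss -lntu | grep -E '9200|9300|5601|1025'

Everything except the Logstash UDP port should be bound to 127.0.0.1 only.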

One more important thing in this config is the ES_JAVA_OPTS and LS_JAVA_OPTS environment variables for the Elasticsearch & Logstash containers: they set the JVM memory limits, and you should tune them according to your system memory and the amount of logs you want to process.
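To see how much memory the containers actually consume once they run, you can use docker stats (the container names below are the defaults that docker-compose derives from the directory name, as you’ll see in the startup output later):

elk:/home/docker-elk# docker stats --no-stream dockerelk_elasticsearch_1 dockerelk_logstash_1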

Now that we have finished with the main configs and preparation, let’s create the Logstash pipeline config file, where we’ll put the input, output, and grok filter parameters:

elk:/home/docker-elk# vi ./logstash/pipeline/logstash.conf
input {
  udp {
    port => 1025
    queue_size => 50000
    type => "nginx"
  }
}

filter {
  grok {
    match => {
      "message" => "%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\]\"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]}\"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\"\"%{NUMBER:[nginx][access][request_time]}\" \"%{NUMBER:[nginx][access][upstream_connect_time]}\""
    }
    remove_field => "message"
  }
  mutate {
    add_field => { "read_timestamp" => "%{@timestamp}" }
  }
  date {
    match => [ "[nginx][access][time]", "dd/MMM/YYYY:H:m:s Z" ]
    remove_field => "[nginx][access][time]"
  }
  useragent {
    source => "[nginx][access][agent]"
    target => "[nginx][access][user_agent]"
    remove_field => "[nginx][access][agent]"
  }
  geoip {
    source => "[nginx][access][remote_ip]"
    target => "[nginx][access][geoip]"
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "nginx"
  }
}

In this config we specify a few things: first, the input section with UDP port 1025 for receiving logs from the source Nginx server; then the filter section with a grok pattern that will parse the incoming logs; and finally the output section with the Elasticsearch parameters.

For my example I used a slightly different log format than the standard Nginx one, because I added some additional fields to it. To work out your own grok pattern you can use https://grokdebug.herokuapp.com/ . Just take one line from your source Nginx log and check that the grok pattern parses it OK.
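For illustration, here is a made-up log line in this custom format that the pattern above should parse (some fields are intentionally not separated by spaces, because my log_format concatenates those strings without one, as you’ll see when we configure the source server):

203.0.113.7 - - [10/Feb/2019:13:55:36 +0000]"GET /index.html HTTP/1.1" 200 612"-" "Mozilla/5.0 (X11; Linux x86_64)""0.002" "0.001"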

At this stage we can try to start our Docker containers and check that they are running OK; after that, we’ll create an Elasticsearch index.

elk:/home/docker-elk# docker-compose up -d
.... a lot of output here on the first run ...
Creating network "dockerelk_elk" with driver "bridge"
Creating dockerelk_elasticsearch_1
Creating dockerelk_logstash_1
Creating dockerelk_kibana_1
elk:/home/docker-elk# docker ps

Check that all containers started with an OK status. If something goes wrong, use the docker logs container_id command to debug the issue, or run docker-compose up without the -d option to watch the output in the foreground.
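You can also ask Elasticsearch directly whether it’s healthy:

elk:/home/docker-elk# curl 'localhost:9200/_cluster/health?pretty'

For a single-node setup, a green or yellow status is fine (yellow only means some replicas are unassigned, which is expected with one node).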

Next, let's create an index in Elasticsearch for our Nginx logs:

elk:/home/docker-elk# vi nginx_template.json
{
  "index_patterns": ["nginx"],
  "mappings": {
    "doc": {
      "properties": {
        "nginx.access.body_sent.bytes": { "type": "integer" },
        "nginx.access.remote_ip": { "type": "ip" },
        "nginx.access.response_code": { "type": "keyword" },
        "nginx.access.method": { "type": "keyword" },
        "nginx.access.url": { "type": "keyword" },
        "nginx.access.request_time": { "type": "float" },
        "nginx.access.upstream_connect_time": { "type": "float" }
      }
    }
  }
}
elk:/home/docker-elk# curl --header "Content-Type: application/json" -XPUT http://localhost:9200/nginx -d "$(cat nginx_template.json)"
{"acknowledged":true,"shards_acknowledged":true,"index":"nginx"}

elk:/home/docker-elk# curl 'localhost:9200/_cat/indices?v'
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_1 Z1zQB7QuSE-AYreXqw-gCg   1   0          0            0       261b           261b
yellow open   nginx     KohI6bU9TOKYzeyInpvVZQ   5   1          0            0      1.1kb          1.1kb

Everything looks OK: the nginx index was created, and we are almost ready to receive logs.

Just for information: if you need to delete an index from Elasticsearch, you can use this command:

curl -X DELETE "localhost:9200/index_name"

By now we should have a running ELK stack on the server; next we need to configure the previously installed Nginx as a reverse proxy with basic authentication, for a minimum level of security. Running Kibana on a public IP without at least minimal protection is a bad idea, which is why we started Kibana on localhost.

elk:/home/docker-elk# apt-get update && apt-get install apache2-utils
## Create an admin user for accessing Kibana through Nginx auth.
elk:/home/docker-elk# htpasswd -c /etc/nginx/.htpasswd admin
elk:/home/docker-elk# rm /etc/nginx/sites-enabled/default
elk:/home/docker-elk# vi /etc/nginx/conf.d/elastic.conf
server {
  listen *:80;
  server_name _;

  location / {
    proxy_pass http://localhost:5601;
    auth_basic "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;
  }
}
elk:/home/docker-elk# nginx -t 
elk:/home/docker-elk# systemctl restart nginx
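You can verify the proxy and the basic auth from the shell before opening a browser (admin and your_password are the credentials you set with htpasswd above):

elk:/home/docker-elk# curl -I http://localhost/                        ## expect HTTP 401 without credentials
elk:/home/docker-elk# curl -I -u admin:your_password http://localhost/ ## expect a non-401 response from Kibana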

At this point we have the ELK stack configured with Nginx as a reverse proxy. Open http://elk-server-ip-address in your browser, and after you enter the login/password you should see the Kibana main page.

Great, we are ready to receive logs from the source Nginx server, and then we can create some nice dashboards in Kibana.

As you can see in the Logstash pipeline config, we use UDP for receiving logs from the remote server: our target Nginx will send its logs as if to an external syslog server.

Using UDP means some part of the logs may be lost, but it also makes configuring the target server easier, as we don’t need any additional software on our production Nginx server. In my experience, the data loss is about 1–2%, even when you send logs from a server in Europe to the USA, for example. For me this is OK, and I prefer this way. But if you run into unacceptable data loss, you can always use Filebeat (part of the Elastic stack) to send logs from the target server, as the canonical ELK documentation suggests.
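For reference, a minimal filebeat.yml sketch in Filebeat 6.x syntax could look like this; note that it assumes you also add a corresponding beats input (e.g. on port 5044) to the Logstash pipeline, which is not part of the setup above:

filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log
output.logstash:
  hosts: ["elk-server-ip:5044"]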

OK, let’s login into our “production” Nginx server from where we wanna get logs and change the logging section in the nginx.conf, to add new custom log format and Logstash server IP:PORT:

##
# Logging Settings
##

# Enabling request time
log_format custom '$remote_addr - $remote_user [$time_local]'
                  '"$request" $status $body_bytes_sent'
                  '"$http_referer" "$http_user_agent"'
                  '"$request_time" "$upstream_connect_time"';

access_log /var/log/nginx/access.log custom;
access_log syslog:server=elk-server-ip:1025 custom;
error_log /var/log/nginx/error.log;
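To apply the change safely, test the config first and then reload (the prompt below is the production Nginx server, not the ELK one):

nginx-srv~# nginx -t
nginx-srv~# systemctl reload nginx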

Once Nginx is reloaded, wait until the first logs are received by Logstash, then create an index pattern with the name nginx in Kibana.
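You can also confirm from the shell that documents are actually arriving:

elk:/home/docker-elk# curl 'localhost:9200/nginx/_count?pretty'

The count value should start growing as requests hit the source server.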

Once documents start arriving, we have our first log data, and we can find it in the Discover section of Kibana.

Congratulations, we have just configured our own ELK stack, and now we can create a lot of nice dashboards for our needs, including ones with geo data and so on.

I hope this example was useful and will help you take your first steps with ELK. And as I said at the beginning, it was not so hard at all. Well, I hope you feel the same after completing this example ;).

Good luck.