Using Elasticsearch Rollover to manage indices

Pavol Loffay
Jul 24, 2019 · 4 min read

In this article you will learn how to configure and use the Elasticsearch rollover feature in Jaeger. This feature was introduced in Jaeger 1.10.0.

Jaeger uses an index-per-day pattern to store its data in Elasticsearch. It creates a new index for each day based on the span’s timestamp. These indices have to be removed periodically by the jaeger-es-index-cleaner cron job. Users typically keep between one week and one month of data, which results in 7 to 30 indices for spans alone.

The index-per-day pattern can be inefficient if data or resources are not evenly distributed. For example, indices that contain no data still allocate shards, and a single index might contain significantly more data than the others. Both cases result in sub-optimal use of resources.

How rollover works

In contrast to index-per-day management, rollover uses an index alias that can roll over to a new index based on configured conditions. In practice, this means there are two aliases: one for reading and one for writing. The read alias points to a set of read-only indices and the write alias points to a single write index.

These are the conditions used for rolling over to a new index:

  • max_age — the maximum age of the index
  • max_docs — the maximum number of documents the index should contain
  • max_size — the maximum estimated size of primary shards (since Elasticsearch 6.x)

The rollover REST API has to be called periodically: the conditions are evaluated only during the REST call and are not stored in Elasticsearch. A cron job therefore has to be configured to make use of this feature.
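To make the mechanics concrete, here is a minimal sketch of a direct call to the Elasticsearch rollover API. The write alias name jaeger-span-write is an assumption (inferred from the read alias names used later in this article), as are the ES_URL default and the helper names. Unlike the jaeger-es-rollover image described in the Configuration section, this sketch does not add the new index to the read alias, so treat it as an illustration rather than a replacement for that tool.

```shell
#!/usr/bin/env bash
# Sketch only: ES_URL and WRITE_ALIAS are assumptions for illustration.
ES_URL="${ES_URL:-http://localhost:9200}"
WRITE_ALIAS="${WRITE_ALIAS:-jaeger-span-write}"

# Wrap a JSON object of conditions in the request body the rollover API expects.
rollover_body() {
  printf '{"conditions": %s}' "$1"
}

# Print the curl command by default; set DRY_RUN=0 to actually send the request.
rollover_once() {
  local body
  body="$(rollover_body "$1")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "curl -s -X POST -H 'Content-Type: application/json' ${ES_URL}/${WRITE_ALIAS}/_rollover -d '${body}'"
  else
    curl -s -X POST -H 'Content-Type: application/json' \
      "${ES_URL}/${WRITE_ALIAS}/_rollover" -d "${body}"
  fi
}

# Roll over if the write index is older than one day or holds over a million documents.
rollover_once '{"max_age": "1d", "max_docs": 1000000}'
```

Because the conditions travel with each request, nothing happens between calls; the index only rolls over when a call finds at least one condition satisfied.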

Configuration

The Jaeger configuration of rollover consists of several steps.

Before deploying Jaeger, the feature must be initialized by running the jaeger-es-rollover Docker image (the examples assume Elasticsearch runs on localhost). This command creates the read and write aliases and the write indices.

docker run -it --rm --net=host jaegertracing/jaeger-es-rollover:latest init http://localhost:9200

Run the command again with -e ARCHIVE=true if you also want to use this feature with the archive index.
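After initialization it can be useful to confirm that the aliases exist. The following sketch uses the Elasticsearch _cat/aliases endpoint; the ES_URL default and the function names are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: ES_URL is an assumed default matching the examples in this article.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the _cat/aliases URL for every alias whose name starts with "jaeger".
jaeger_aliases_url() {
  printf '%s/_cat/aliases/jaeger*?v\n' "$ES_URL"
}

# Fetch the alias table; this needs a reachable Elasticsearch instance.
list_jaeger_aliases() {
  curl -s "$(jaeger_aliases_url)"
}
```

Calling list_jaeger_aliases should list the read and write aliases together with the indices they point to.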

Once this is done, Jaeger can be deployed with rollover enabled by passing --es.use-aliases=true and, for archive storage, --es-archive.use-aliases=true:

docker run -it --rm --net=host -e SPAN_STORAGE_TYPE=elasticsearch jaegertracing/all-in-one:latest --es.use-aliases=true --es-archive.enabled=true --es-archive.use-aliases=true

Jaeger is now deployed and writes data to the write alias. The next step is to periodically execute the rollover API, which rolls the write alias over to a new index based on the supplied conditions. The command also adds the new index to the read alias to make new data available for search. The following command rolls the alias over to a new index if the current write index is older than 1 second. We use this short interval only for demo purposes, to make sure the rollover takes place.

docker run -it --rm --net=host -e CONDITIONS='{"max_age": "1s"}' jaegertracing/jaeger-es-rollover:latest rollover http://localhost:9200

Run the same command with -e ARCHIVE=true for the archive storage.

The next step is to remove old indices from the read aliases, which means old data will no longer be available for search. This imitates the behavior of the --es.max-span-age flag used in the default index-per-day deployment. This step is optional: old indices can simply be removed by the index cleaner in the next step.

docker run -it --rm --net=host -e UNIT=seconds -e UNIT_COUNT=1 jaegertracing/jaeger-es-rollover:latest lookback http://localhost:9200

Run the same command with -e ARCHIVE=true for the archive storage.

Old data has to be removed from Elasticsearch periodically by deleting old indices. Here we run the index cleaner with the parameter 0, which removes all indices, including the ones created today.

docker run -it --rm --net=host -e ROLLOVER=true jaegertracing/jaeger-es-index-cleaner:latest 0 http://localhost:9200

Run the same command with -e ARCHIVE=true for the archive storage.

Note that this functionality is currently broken; a fix will be available in the next release (1.13.1/1.14.0). The issue is tracked in #1681.
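Since rollover, lookback, and index cleaning all have to run periodically, they can be chained in one script that a cron job invokes. The sketch below is one possible arrangement, not an official script: the condition, lookback window, and retention values are placeholders, and the run and maintain helpers are hypothetical names. It omits -it because a cron job has no terminal attached, and it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only: all values below are placeholders to adapt to your retention needs.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run rollover, lookback, and index cleanup once.
# Pass "archive" as the first argument to target the archive storage.
maintain() {
  local archive_env=()
  if [ "${1:-}" = "archive" ]; then
    archive_env=(-e ARCHIVE=true)
  fi

  # 1. Roll the write alias over if the write index is older than one day.
  run docker run --rm --net=host "${archive_env[@]}" \
    -e CONDITIONS='{"max_age": "1d"}' \
    jaegertracing/jaeger-es-rollover:latest rollover "$ES_URL"

  # 2. Drop indices older than seven days from the read alias.
  run docker run --rm --net=host "${archive_env[@]}" \
    -e UNIT=days -e UNIT_COUNT=7 \
    jaegertracing/jaeger-es-rollover:latest lookback "$ES_URL"

  # 3. Delete indices older than fourteen days.
  run docker run --rm --net=host "${archive_env[@]}" \
    -e ROLLOVER=true \
    jaegertracing/jaeger-es-index-cleaner:latest 14 "$ES_URL"
}

maintain main
maintain archive
```

Keeping the lookback window shorter than the cleaner's retention means data disappears from search before it is deleted, which mirrors the order of the steps above.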

Rollover in Jaeger Operator

The rollover feature can be enabled in jaeger-operator by specifying --es.use-aliases=true in the storage flags. The operator will automatically initialize the aliases and configure the rollover cron jobs. The rollover conditions can be specified in the esRollover section under storage. Please refer to the operator documentation for more details.
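As a rough sketch of what such a custom resource might look like, the function below prints a manifest that can be piped to kubectl. The resource name, the Elasticsearch address, and all values are hypothetical, and the apiVersion and the field names under esRollover (conditions, readTTL, schedule) are given as recalled from the operator documentation, so verify them against the operator version you run.

```shell
#!/usr/bin/env bash
# Sketch only: names, addresses, and values are illustrative assumptions.
render_jaeger_cr() {
  cat <<'EOF'
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-rollover-example   # hypothetical name
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200   # assumed address
        use-aliases: true
    esRollover:
      conditions: "{\"max_age\": \"2d\"}"   # same JSON as the CONDITIONS variable
      readTTL: 168h                          # how long indices stay in the read alias
      schedule: "55 23 * * *"                # cron schedule for the rollover job
EOF
}
```

The manifest could then be applied with render_jaeger_cr | kubectl apply -f -.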

Migration from daily indices to rollover

Migration from the default index-per-day deployment is possible by manually adding the old indices to the read aliases.

curl -ivX POST -H "Content-Type: application/json" localhost:9200/_aliases -d '{
"actions" : [
{ "add" : { "index" : "jaeger-span-*-*-*", "alias" : "jaeger-span-read" } },
{ "add" : { "index" : "jaeger-service-*-*-*", "alias" : "jaeger-service-read" } }
]
}'

And similarly for the archive index:

curl -ivX POST -H "Content-Type: application/json" localhost:9200/_aliases -d '{
"actions" : [
{ "add" : { "index" : "jaeger-span-archive", "alias" : "jaeger-span-archive-read" } }
]
}'
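The wildcard jaeger-span-*-*-* in the first request is meant to select only date-suffixed daily indices. The sketch below reproduces that matching with a shell glob, which treats * the same way for this simple pattern. The rollover index name jaeger-span-000001 is an assumption about how the rollover indices are named, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Return success if an index name matches the wildcard used in the alias request.
is_daily_span_index() {
  case "$1" in
    jaeger-span-*-*-*) return 0 ;;
    *) return 1 ;;
  esac
}

# Compare a daily index, an assumed rollover index, and the archive index.
for name in jaeger-span-2019-07-24 jaeger-span-000001 jaeger-span-archive; do
  if is_daily_span_index "$name"; then
    echo "$name: would be added to jaeger-span-read"
  else
    echo "$name: skipped"
  fi
done
```

To preview the real matches before changing any alias, the same pattern can be passed to the _cat/indices endpoint, for example curl "localhost:9200/_cat/indices/jaeger-span-*-*-*?v".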


Thanks to Gary Brown

Pavol Loffay

Written by

Software engineer working in observability space. Working on Hypertrace, OpenTelemetry, Jaeger, OpenTracing, MicroProfile projects.

JaegerTracing

Open source distributed tracing platform at Cloud Native Computing Foundation (graduated). https://jaegertracing.io
