How to schedule maintenance tasks for a self-hosted Elasticsearch/OpenSearch cluster
If you run your own deployment of Elasticsearch/OpenSearch, you might need to schedule maintenance scripts: taking snapshots/backups of indexes, migrating to another Elasticsearch cluster, transferring Elasticsearch data to S3, importing data into Elasticsearch, deleting/moving/archiving old indexes, reindexing, rebalancing uneven shard distribution, exporting report queries to S3, and so on.
For this purpose, a small open-source tool, Elasticsearch-Workspace, might come in handy. This article demonstrates how to launch a local Elasticsearch cluster with Kibana and Elasticsearch-Workspace, load test data, export an Elasticsearch index to S3, and schedule periodic backups.
Create a docker-compose.yaml file to launch a local 3-node Elasticsearch cluster with Kibana:
version: '2.2'
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.16.3
    container_name: es01
    environment:
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic
  es02:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.16.3
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data02:/usr/share/elasticsearch/data
    networks:
      - elastic
  es03:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.16.3
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data03:/usr/share/elasticsearch/data
    networks:
      - elastic
  kib01:
    image: docker.elastic.co/kibana/kibana:7.16.3
    container_name: kib01
    ports:
      - 5601:5601
    environment:
      ELASTICSEARCH_URL: http://es01:9200
      ELASTICSEARCH_HOSTS: '["http://es01:9200","http://es02:9200","http://es03:9200"]'
    networks:
      - elastic
  workspace:
    image: alnoda/elasticsearch-workspace
    container_name: workspace
    ports:
      - 8020-8030:8020-8030
    networks:
      - elastic
volumes:
  data01:
    driver: local
  data02:
    driver: local
  data03:
    driver: local
networks:
  elastic:
    driver: bridge
Start it with docker-compose up. Wait until the cluster is fully ready, then open Kibana at http://localhost:5601 and import all sample datasets.
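To check from the host that the cluster is ready, one option is to poll the Elasticsearch cluster health API on the published port 9200. The following is only a sketch: the function names, the retry count and the 5-second pause are assumptions made for illustration, and it presumes curl is installed on the host.

```shell
#!/usr/bin/env bash
# Sketch: block until the cluster reports the wanted health status.
# health_url and wait_for_cluster are illustrative names, not part of any tool.

# Build the cluster health URL. wait_for_status makes Elasticsearch hold the
# request until the status is reached or the timeout expires.
health_url() {
  local es_url="$1" status="${2:-green}" timeout="${3:-60s}"
  printf '%s/_cluster/health?wait_for_status=%s&timeout=%s' "$es_url" "$status" "$timeout"
}

# Retry until the API answers and the health request did not time out.
wait_for_cluster() {
  local es_url="${1:-http://localhost:9200}" tries="${2:-30}" body i
  for i in $(seq "$tries"); do
    if body="$(curl -fsS "$(health_url "$es_url")")" \
        && [[ "$body" == *'"timed_out":false'* ]]; then
      echo "cluster is ready"
      return 0
    fi
    sleep 5
  done
  echo "cluster did not become ready" >&2
  return 1
}
```

After sourcing the file, calling wait_for_cluster http://localhost:9200 returns once all three nodes have joined and the status is green.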
Open the workspace UI at http://localhost:8020/ for quick access to all the workspace tools.
Open the browser-based terminal at http://localhost:8026/ and check the cluster nodes and shards:
vulcanizer --host es01 nodes
vulcanizer --host es01 shards
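Roughly the same information is available from Elasticsearch's own _cat APIs. A minimal sketch, assuming curl is available in the terminal; the helper name cat_api is made up for illustration:

```shell
# Hypothetical helper: print a _cat endpoint as a table with column headers.
# The ?v parameter asks Elasticsearch to include the header row.
cat_api() {
  local host="$1" endpoint="$2"
  curl -fsS "http://${host}:9200/_cat/${endpoint}?v"
}
```

For example, cat_api es01 nodes lists the nodes and cat_api es01 shards lists the shards with the node each one is allocated to.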
Use elasticdump to export the index kibana_sample_data_ecommerce (from the eCommerce sample dataset) to S3. Replace the placeholders with your S3 access key, secret and bucket name:
elasticdump \
--s3AccessKeyId "${access_key_id}" \
--s3SecretAccessKey "${access_key_secret}" \
--input=http://es01:9200/kibana_sample_data_ecommerce \
--output "s3://${bucket_name}/kibana_sample_data_ecommerce.json"
Check that the exported index has appeared in your S3 bucket.
Now let’s schedule the export on a periodic basis. Open the browser-based IDE from the workspace UI (http://localhost:8020/) and create the file /home/project/export.sh containing the script that exports data to S3. Make it executable with chmod +x /home/project/export.sh.
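The article does not show the contents of export.sh, so here is one possible sketch. It reuses the elasticdump flags from the manual export; the ES_URL and INDEX variables, the dated file name and the "run" argument are assumptions made for illustration, not something the workspace requires.

```shell
#!/usr/bin/env bash
# Sketch of /home/project/export.sh. Variable names and the dated file name
# are assumptions; adjust them to your setup.

ES_URL="${ES_URL:-http://es01:9200}"
INDEX="${INDEX:-kibana_sample_data_ecommerce}"

# Build a dated S3 object URI so each run keeps its own backup file.
output_uri() {
  local bucket="$1" index="$2" stamp="$3"
  printf 's3://%s/%s-%s.json' "$bucket" "$index" "$stamp"
}

# Dump the index to S3 with the same elasticdump flags as the manual export.
export_index() {
  # Stop with a clear message when a credential or the bucket name is missing.
  : "${access_key_id:?set access_key_id}"
  : "${access_key_secret:?set access_key_secret}"
  : "${bucket_name:?set bucket_name}"
  elasticdump \
    --s3AccessKeyId "${access_key_id}" \
    --s3SecretAccessKey "${access_key_secret}" \
    --input="${ES_URL}/${INDEX}" \
    --output "$(output_uri "${bucket_name}" "${INDEX}" "$(date +%F)")"
}

# Do nothing unless called as "export.sh run", so the file is safe to source.
if [[ "${1:-}" == "run" ]]; then
  export_index
fi
```

With access_key_id, access_key_secret and bucket_name set in the environment, /home/project/export.sh run performs one export. The date suffix means a weekly schedule produces one object per run instead of overwriting the previous backup.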
Open the browser-based scheduler, Cronicle, from the workspace UI (user/pass: admin/admin) and schedule the script, for example weekly. Select the category “general” and the plugin “Shell Script”.
The Cronicle dashboard will show the log of executions.
Disclaimer: I am the creator of elasticsearch-workspace (and the other workspaces in that repo). I use them for my own development and am happy to share them with the community. I hope you find it useful!