HBase — Incremental backup and restore strategy
Introduction to the problem
If you have worked with HBase, you might have stumbled across the problem of backing up your data incrementally and recovering it in the event of a disaster. The enterprise distributions of HBase from vendors like IBM, Cloudera and Hortonworks come with an incremental backup feature, but they also come with an enterprise cost. The stock versions of HBase miss this feature, except for version 3.0, which is still in its alpha release. We, at Myntra, follow a methodical strategy to incrementally back up our data and recover it in case of catastrophic failure. The strategy helps us achieve the following:
- Avoid running out of disk space for our HBase cluster as the data grows.
- Backup the data incrementally in our cloud storage.
- Restore the data depending on the time granularity of our choice.
The incremental backup and restore strategy explained below has been carried out for Apache HBase 2.1.4 and Hadoop 2.7.3.
Incremental backup
We keep the past 30 days of data in our cluster. Older data is backed up in cloud storage, compressed and ready to be restored on demand. The 'time to live' (TTL) for each record is 30 days (2592000 seconds), and any record older than that perishes automatically.
> create 'my_table', {NAME => 'my_col_fam', TTL => 2592000}
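If the table already exists, the TTL can be inspected or adjusted from the HBase shell with the standard describe and alter commands:
> describe 'my_table'
> alter 'my_table', {NAME => 'my_col_fam', TTL => 2592000}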
Every day, a backup job (cron) runs against yesterday's data. The job executes the following steps; a consolidated sketch of the whole job follows after step 5.
1. Chunking and Exporting
Millions of rows are stored every day, so trying to back up a whole day's data in one shot has a high chance of hitting a timeout error, and restarting the process would be expensive with no certainty of success. As a remedy, we chunk a day's data into hourly slices. (Note: we use one hour as the backup granularity, but any unit and quantity of time would work.) We back up the previous day's data one hour at a time: first 00:00:00 to 01:00:00 (in HH:MM:SS format), then the second hour, the third, and so on. For ease of understanding, let us refer to each hour's data as a 'chunk'. The following steps are carried out for each chunk. First, a chunk of data is copied to a temporary location in HDFS.
> hbase org.apache.hadoop.hbase.coprocessor.Export 'my_table' hdfs://hdfs_master_ip:9000/backup/2021-01-02-07-00-00TO2021-01-02-08-00-00 5 1609551000000 1609554600000
This backs up the data between two timestamps and stores it in an HDFS directory. The first argument is the table name, the second is the destination HDFS directory, the third is the maximum number of versions to export, and the fourth and fifth are the start and end of the time range, in epoch milliseconds.
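The two epoch values can be derived on the command line. A minimal sketch, assuming GNU date; the timezone is pinned to Asia/Kolkata so the output matches the example values above:
> START_MS=$(( $(TZ=Asia/Kolkata date -d '2021-01-02 07:00:00' +%s) * 1000 ))
> END_MS=$(( $(TZ=Asia/Kolkata date -d '2021-01-02 08:00:00' +%s) * 1000 ))
> echo "$START_MS $END_MS" # 1609551000000 1609554600000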
2. Hadoop to Local
The exported data is copied from HDFS to a local directory on a machine.
> hadoop fs -copyToLocal hdfs://hdfs_master_ip:9000/backup/2021-01-02-07-00-00TO2021-01-02-08-00-00 /tmp
3. Compression
The chunk is compressed. It is recommended to compress the data for efficient use of your blob storage.
> cd /tmp && tar -zcvf 2021-01-02-07-00-00TO2021-01-02-08-00-00.tar.gz 2021-01-02-07-00-00TO2021-01-02-08-00-00
4. Upload
The compressed chunk is then uploaded to the blob storage. Any blob storage can be used, so the exact upload command is left to the reader. For now, let us keep the storage location of the chunk as:
/my-team-hbase-dumps/my-table/2021-01-02/2021-01-02-07-00-00TO2021-01-02-08-00-00.tar.gz
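Purely as an illustration, with AWS S3 and the aws CLI (any blob store and its client would do, and the bucket name here is hypothetical), the upload would look like:
> aws s3 cp /tmp/2021-01-02-07-00-00TO2021-01-02-08-00-00.tar.gz s3://my-team-hbase-dumps/my-table/2021-01-02/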
5. Clean
One intention of any backup strategy is to avoid disk exhaustion, and that would be defeated if we skipped cleaning up the temporary and intermediate files generated in the steps above.
> hadoop fs -rm -R hdfs://hdfs_master_ip:9000/backup/2021-01-02-07-00-00TO2021-01-02-08-00-00
> rm -rf /tmp/2021-01-02-07-00-00TO2021-01-02-08-00-00.tar.gz
> rm -rf /tmp/2021-01-02-07-00-00TO2021-01-02-08-00-00
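Putting the five steps together, the daily cron job can be a small shell script. The sketch below is illustrative rather than our exact production job: host names and paths are placeholders, and upload_to_blob_store is a hypothetical stand-in for whatever client your storage exposes.
File: hourly_backup.sh
#!/usr/bin/env bash
# Back up yesterday's data for one table, one hour-chunk at a time.
set -euo pipefail
TABLE='my_table'
HDFS='hdfs://hdfs_master_ip:9000/backup'
DAY=$(date -d 'yesterday' +%F) # e.g. 2021-01-02
for HOUR in $(seq 0 23); do
  FROM=$(printf '%s %02d:00:00' "$DAY" "$HOUR")
  TO=$(date -d "$FROM 1 hour" '+%F %T')
  CHUNK="$(date -d "$FROM" +%F-%H-%M-%S)TO$(date -d "$TO" +%F-%H-%M-%S)"
  START_MS=$(( $(date -d "$FROM" +%s) * 1000 ))
  END_MS=$(( $(date -d "$TO" +%s) * 1000 ))
  # 1. chunk and export to a temporary HDFS location
  hbase org.apache.hadoop.hbase.coprocessor.Export "$TABLE" "$HDFS/$CHUNK" 5 "$START_MS" "$END_MS"
  # 2. copy from HDFS to local disk
  hadoop fs -copyToLocal "$HDFS/$CHUNK" /tmp
  # 3. compress
  tar -C /tmp -zcvf "/tmp/$CHUNK.tar.gz" "$CHUNK"
  # 4. upload (upload_to_blob_store is a hypothetical helper)
  upload_to_blob_store "/tmp/$CHUNK.tar.gz" "/my-team-hbase-dumps/my-table/$DAY/"
  # 5. clean
  hadoop fs -rm -R "$HDFS/$CHUNK"
  rm -rf "/tmp/$CHUNK" "/tmp/$CHUNK.tar.gz"
done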
On-demand restore
In short, restore is just the backup process run in reverse! We take a chunk from our cloud storage and restore it into our HBase table. The restore flow is rare and usually executed on demand. Let's quickly walk through it:
1. Download
The chunk is downloaded from the blob store to a local machine.
> cd /tmp && wget {cloud_storage_end_point}/my-team-hbase-dumps/my-table/2021-01-02/2021-01-02-07-00-00TO2021-01-02-08-00-00.tar.gz
2. Decompression
The downloaded chunk is now decompressed.
> cd /tmp && tar -zxvf 2021-01-02-07-00-00TO2021-01-02-08-00-00.tar.gz
3. Local to Hadoop
The chunk is copied from the local directory to an HDFS directory.
> hadoop fs -copyFromLocal /tmp/2021-01-02-07-00-00TO2021-01-02-08-00-00 hdfs://hdfs_master_ip:9000/backup
4. Import
The chunk is finally ingested into the HBase table.
> hbase org.apache.hadoop.hbase.mapreduce.Import 'my_table' hdfs://hdfs_master_ip:9000/backup/2021-01-02-07-00-00TO2021-01-02-08-00-00
5. Clean
The temporary files are cleaned.
> hadoop fs -rm -R hdfs://hdfs_master_ip:9000/backup/2021-01-02-07-00-00TO2021-01-02-08-00-00
> rm -rf /tmp/2021-01-02-07-00-00TO2021-01-02-08-00-00.tar.gz /tmp/2021-01-02-07-00-00TO2021-01-02-08-00-00
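The restore flow can likewise be wrapped into a small on-demand script. Again an illustrative sketch, not our exact tooling: the endpoint variable and host names are placeholders.
File: restore_chunk.sh
#!/usr/bin/env bash
# Usage: ./restore_chunk.sh 2021-01-02 2021-01-02-07-00-00TO2021-01-02-08-00-00
set -euo pipefail
DAY=$1
CHUNK=$2
cd /tmp
# 1. download and 2. decompress
wget "${CLOUD_STORAGE_END_POINT}/my-team-hbase-dumps/my-table/$DAY/$CHUNK.tar.gz"
tar -zxvf "$CHUNK.tar.gz"
# 3. copy to HDFS and 4. import into the table
hadoop fs -copyFromLocal "/tmp/$CHUNK" hdfs://hdfs_master_ip:9000/backup
hbase org.apache.hadoop.hbase.mapreduce.Import 'my_table' "hdfs://hdfs_master_ip:9000/backup/$CHUNK"
# 5. clean
hadoop fs -rm -R "hdfs://hdfs_master_ip:9000/backup/$CHUNK"
rm -rf "/tmp/$CHUNK" "/tmp/$CHUNK.tar.gz"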
Prerequisites
For the above backup and restore process to run smoothly, the Hadoop and HBase cluster is expected to have several configurations in place. Some clusters might already have them, while others will need to add them based on their setup.
1. Setting up YARN (Yet Another Resource Negotiator)
The 'import' step needs YARN to be set up in the cluster. The yarn-site.xml below shows a working configuration, which needs to be set on all the master and slave nodes of HDFS; all nodes must carry the same configuration.
File: yarn-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>yarn.acl.enable</name>
    <value>0</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop_master_ip</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>9216</value>
    <description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>27648</value>
    <description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>27648</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>9216</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx7372m</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>9216</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx7372m</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>9216</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx7372m</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>3686</value>
  </property>
</configuration>
File: mapred-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
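For the new settings to take effect, the YARN daemons need a restart. With a stock Hadoop 2.7.x installation the bundled scripts can be used, and listing the nodes confirms the NodeManagers have re-registered:
> $HADOOP_HOME/sbin/stop-yarn.sh && $HADOOP_HOME/sbin/start-yarn.sh
> yarn node -list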
2. Setting up the coprocessor for Export
For the above-mentioned export command to run, HBase must have the coprocessor class configured in hbase-site.xml. Since Export is a region coprocessor, the setting should be present on the region server nodes as well as the master.
File: hbase-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>hbase.coprocessor.region.classes</name>
    <value>org.apache.hadoop.hbase.coprocessor.Export</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>2400000</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.tickTime</name>
    <value>120000</value>
  </property>
  <property>
    <name>hbase.rpc.timeout</name>
    <value>2400000</value>
  </property>
  <property>
    <name>hbase.client.operation.timeout</name>
    <value>2400000</value>
  </property>
  <property>
    <name>hbase.rpc.read.timeout</name>
    <value>2400000</value>
  </property>
  <property>
    <name>hbase.rpc.write.timeout</name>
    <value>2400000</value>
  </property>
  <property>
    <name>hbase.client.meta.operation.timeout</name>
    <value>2400000</value>
  </property>
  <property>
    <name>hbase.client.scanner.timeout.period</name>
    <value>2400000</value>
  </property>
</configuration>
Note the unusually high value kept for 'hbase.zookeeper.property.tickTime'. Exporting a large amount of data takes a long time; if it exceeds 'zookeeper.session.timeout', the process is cut short and the export fails. To avoid this, we keep 'zookeeper.session.timeout' considerably higher than usual. However, we cannot pick an arbitrary value: ZooKeeper bounds the session timeout between 2x and 20x of 'hbase.zookeeper.property.tickTime', which is why the tickTime is raised as well. With a tickTime of 120000 ms, our session timeout of 2400000 ms (40 minutes) sits exactly at the 20x upper bound.
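As with the YARN settings, the hbase-site.xml changes only apply after HBase is restarted; with the scripts bundled in a stock installation:
> $HBASE_HOME/bin/stop-hbase.sh && $HBASE_HOME/bin/start-hbase.sh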
Verdict
Our HBase cluster eventually stopped running out of space and remained stable, with every day's data backed up securely in the cloud and ready to be reinstated when needed! Your cluster might still need some tweaking to make things run as per your expectations. It is a challenge to scale and tune any distributed system, but isn't that what every passionate engineer longs for... a challenge?