Arkiv: Backup and archiving

Amaury Bouchard · Published in Skriv Blog · Aug 11, 2017

Everybody knows that data must be backed up and archived frequently. Data loss is no fun. And it’s obviously even more important in a professional context, when you are dealing with other people’s data.

As I am creating Skriv, I have to be sure that the platform will be trustworthy. That means three things: uptime measurement (detecting service outages), a Service-Level Agreement (a contractual commitment to keep the service running) and a Disaster Recovery Plan (being prepared to face a real problem). I will write about the first two later.

Planning against all kinds of disaster is a huge topic. For now, let’s focus on the backup and archiving part of it. To be able to rebuild a platform, you must have recent and usable backups of the data. So you need a good way to back up data and archive it in a safe place. Even the concept of a “safe place” isn’t self-evident: when something bad happens, you’ll want to restore the most recent version of your data, so it has to be quickly available; but sometimes you need to get an older version, even if it takes more time to fetch it.

That’s the reason why I created Arkiv, a simple tool to manage backup and archiving. It is open source and freely available on GitHub: https://github.com/Amaury/Arkiv

The key features are:

  • Backs up files and MySQL databases (a minimal sketch of this step follows the list).
  • Can run as often as you want (every two days, every day, three times a day, every hour, …).
  • Stores backups locally, with optional archiving on Amazon S3 and Amazon Glacier.
  • Purges data (locally and on Amazon S3) after configurable delays.
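To make this concrete, here is a minimal sketch of what a single backup run boils down to: a compressed tar archive per watched directory, plus a mysqldump of each database, dropped into a timestamped directory. This is not Arkiv’s actual code; the paths and the database name are placeholder assumptions.

```bash
#!/usr/bin/env bash
# Minimal sketch of one backup run (illustrative, not Arkiv's real code).
# Assumes /var/archives exists and ~/.my.cnf provides MySQL credentials.
set -e

# One timestamped directory per run, e.g. /var/archives/2017-08-11-14:00:00
NOW=$(date +"%Y-%m-%d-%H:%M:%S")
DEST="/var/archives/$NOW"
mkdir -p "$DEST"

# Files: one compressed tar archive per watched path.
tar czf "$DEST/etc.tar.gz" /etc
tar czf "$DEST/www.tar.gz" /var/www

# Database: dump and compress (placeholder database name "skriv").
mysqldump skriv | gzip > "$DEST/skriv.sql.gz"

# Checksums allow later integrity checks.
( cd "$DEST" && sha256sum *.gz > sha256sums.txt )
```

Scheduling such a script as often as you want is then just a crontab entry; for hourly runs, something like "0 * * * * /usr/local/bin/backup.sh" (the script path being a placeholder).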

You may already know Amazon S3: a well-known data storage service, launched 11 years ago and used by a lot of web companies. It’s like an unlimited hard disk drive where your data is just a few milliseconds away, for a very reasonable price.
Amazon Glacier is a more recent service (3 years old). It’s also a data storage service, but designed for long-term archiving: stored files are not available in real time (retrieving one takes from 5 minutes to 12 hours) in exchange for a very low price.
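As an illustration, this is roughly what pushing an archive to each service looks like with the standard AWS command-line client (the bucket and vault names are placeholders; a tool like Arkiv automates calls of this kind):

```bash
# Copy a backup to S3: the object is immediately retrievable.
aws s3 cp /var/archives/2017-08-11-14:00:00/www.tar.gz \
    s3://my-backup-bucket/2017-08-11-14:00:00/www.tar.gz

# Upload the same file to a Glacier vault: much cheaper,
# but retrieval is asynchronous (minutes to hours).
aws glacier upload-archive --account-id - --vault-name my-backup-vault \
    --body /var/archives/2017-08-11-14:00:00/www.tar.gz
```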

A real use case could look like this:

  • Files and databases are backed up every hour, stored locally and on Amazon S3 and Amazon Glacier.
  • All backups are kept locally for 3 days; then four backups per day are kept for the next 3 days, then one backup per day for a week. After that, backups are deleted.
  • All backups are kept on Amazon S3 for 2 weeks; then six backups per day are kept for the next 2 weeks, then one backup per day for another month. After that, backups are deleted.
  • Backups are stored on Amazon Glacier forever.
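The local side of such a purge policy can be sketched in a few lines of shell. This is a deliberate simplification (it only applies the final cut-off, not the tiered thinning described above), and the archive path is a placeholder:

```bash
# Delete local backup directories older than 13 days (3 + 3 + 7).
# A real implementation, like the policy above, thins backups in
# tiers first; this sketch only shows the final deletion step.
find /var/archives -mindepth 1 -maxdepth 1 -type d -mtime +13 \
    -exec rm -rf {} +
```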

Arkiv is written in plain Bash. It should be compatible with any Unix/Linux system.

The configuration process is handled by a command-line tool: you just have to answer a list of questions, most of them with pre-filled default answers.
Here is an example of this process:

[Screenshot: the interactive configuration process]
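As an aside, the “question with a default answer” pattern used by such configurators is easy to implement in shell. A minimal sketch (not taken from Arkiv’s code):

```bash
# Ask a question with a default answer; pressing Enter keeps the default.
ask() {
    local question="$1" default="$2" answer
    read -r -p "$question [$default] " answer
    echo "${answer:-$default}"
}

BACKUP_PATH=$(ask "Where should backups be stored?" "/var/archives")
echo "Backups will be stored in: $BACKUP_PATH"
```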

Log files are as user-friendly as possible. Here is an example:

[Screenshot: an Arkiv log file]

I hope it will be useful to other people. Feel free to test it, and tell me if you use it.
