Arkiv: Backup and archiving

Amaury Bouchard · Published in Skriv Blog · Aug 11, 2017

Everybody knows that data must be backed up and archived frequently. Data loss is no fun. And it’s obviously even more important in a professional context, when you are dealing with other people’s data.

As I am creating Skriv, I have to be sure that the platform will be trustworthy. That means three things: uptime measurement (detecting service outages), a Service-Level Agreement (a contractual commitment to keep the service running) and a Disaster Recovery Plan (being prepared to face a real problem). I will write about the first two later.

Planning against all kinds of disaster is a huge topic. For now, let’s focus on the backup and archiving part of it. To be able to rebuild a platform, you must have recent and usable backups of the data. So you need a good way to back up data and archive it in a safe place. Even the concept of a “safe place” isn’t self-evident: when something bad happens, you’ll want to restore the most recent version of your data, so it has to be quickly available; but sometimes you need to get an older version, even if it takes more time to fetch it.

That’s the reason why I created Arkiv, a simple tool to manage backup and archiving. It is open source and freely available on GitHub: https://github.com/Amaury/Arkiv

The key features are:

  • Backs up files and MySQL databases (a minimal sketch of this step follows the list).
  • Can run as often as you want (every two days, every day, three times a day, every hour, …).
  • Stores backups locally, with optional archiving on Amazon S3 and Amazon Glacier.
  • Purges data (locally and on Amazon S3) after configurable delays.
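To make this concrete, here is a minimal sketch of what a single backup run boils down to: a compressed tar archive per watched directory, plus a mysqldump of each database, dropped into a timestamped directory. This is not Arkiv’s actual code; the paths and the database name are placeholder assumptions.

```bash
#!/usr/bin/env bash
# Minimal sketch of one backup run (illustrative, not Arkiv's real code).
# Assumes /var/archives exists and ~/.my.cnf provides MySQL credentials.
set -e

# One timestamped directory per run, e.g. /var/archives/2017-08-11-14:00:00
NOW=$(date +"%Y-%m-%d-%H:%M:%S")
DEST="/var/archives/$NOW"
mkdir -p "$DEST"

# Files: one compressed tar archive per watched path.
tar czf "$DEST/etc.tar.gz" /etc
tar czf "$DEST/www.tar.gz" /var/www

# Database: dump and compress (placeholder database name "skriv").
mysqldump skriv | gzip > "$DEST/skriv.sql.gz"

# Checksums allow later integrity checks.
( cd "$DEST" && sha256sum *.gz > sha256sums.txt )
```

Scheduling such a script as often as you want is then just a crontab entry; for hourly runs, something like "0 * * * * /usr/local/bin/backup.sh" (the script path being a placeholder).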

You may already know Amazon S3: a well-known data storage service, launched 11 years ago and used by a lot of web companies. It’s like an unlimited hard disk drive where your data is just a few milliseconds away, for a very reasonable price.
Amazon Glacier is a more recent service (3 years old). It’s also a data storage service, but designed for long-term archiving: stored files are not available in real time (retrieving one takes from 5 minutes to 12 hours) in exchange for a very low price.
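As an illustration, this is roughly what pushing an archive to each service looks like with the standard AWS command-line client (the bucket and vault names are placeholders; a tool like Arkiv automates calls of this kind):

```bash
# Copy a backup to S3: the object is immediately retrievable.
aws s3 cp /var/archives/2017-08-11-14:00:00/www.tar.gz \
    s3://my-backup-bucket/2017-08-11-14:00:00/www.tar.gz

# Upload the same file to a Glacier vault: much cheaper,
# but retrieval is asynchronous (minutes to hours).
aws glacier upload-archive --account-id - --vault-name my-backup-vault \
    --body /var/archives/2017-08-11-14:00:00/www.tar.gz
```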

A real use case could look like this:

  • Files and databases are backed up every hour, stored locally and on Amazon S3 and Amazon Glacier.
  • All backups are kept locally for 3 days; then four backups per day are kept for the next 3 days, then one backup per day for a week. After that, backups are deleted.
  • All backups are kept on Amazon S3 for 2 weeks; then six backups per day are kept for the next 2 weeks, then one backup per day for another month. After that, backups are deleted.
  • Backups are stored on Amazon Glacier forever.
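The local side of such a purge policy can be sketched in a few lines of shell. This is a deliberate simplification (it only applies the final cut-off, not the tiered thinning described above), and the archive path is a placeholder:

```bash
# Delete local backup directories older than 13 days (3 + 3 + 7).
# A real implementation, like the policy above, thins backups in
# tiers first; this sketch only shows the final deletion step.
find /var/archives -mindepth 1 -maxdepth 1 -type d -mtime +13 \
    -exec rm -rf {} +
```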

Arkiv is written in plain Bash. It should be compatible with any Unix/Linux system.

The configuration process is handled by a command-line tool: you just have to answer a list of questions, most of them with pre-filled default answers.
Here is an example of this process:

[Screenshot: the interactive configuration process]
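As an aside, the “question with a default answer” pattern used by such configurators is easy to implement in shell. A minimal sketch (not taken from Arkiv’s code):

```bash
# Ask a question with a default answer; pressing Enter keeps the default.
ask() {
    local question="$1" default="$2" answer
    read -r -p "$question [$default] " answer
    echo "${answer:-$default}"
}

BACKUP_PATH=$(ask "Where should backups be stored?" "/var/archives")
echo "Backups will be stored in: $BACKUP_PATH"
```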

Log files are as user-friendly as possible. Here is an example:

[Screenshot: an Arkiv log file]

I hope it will be useful to other people. Feel free to test it, and tell me if you use it.
