Managing PostgreSQL backups and replication for very large databases — Part 2

leboncoin
Mar 7, 2019 · 8 min read

By Flavio Gurgel (Database Administrator)

Back to this subject after the first part of this series, let's look at some practical examples of how we handle our backups at leboncoin.

Restore tests

Let’s assume that you have defined your backup strategy, frequency, retention policies and backup jobs are in place. Everything seems safe on a disk or tape somewhere.

An incident strikes hard and you just lost a database. You look at your handy backups and one of these may happen to you:

  • the person that wrote the backup job didn’t write a restore job

I won't stress this much more: everybody needs to test restores. It's mandatory. It's good for your health; not having tests can be genuinely life-threatening. Restore tests should use exactly the same automated job or written procedure that would be used in case of an incident. These exercises can be long, tedious or expensive, and you need a spare server to run them.

Beyond the risk mitigation that you achieve with restore testing (which may already be enough to justify it), you can actually profit from restore tests:

  • restore on staging servers so you can test your ready-for-production code, schema migrations, etc., on fresh data

The frequency of restore testing depends on your needs and possibilities; a minimum viable option would be to test every time you update your database binary versions or change the database schema.

At leboncoin, the most complicated database restore is tested every week. The job is fully automatic. The restored database is used for staging and Quality Assurance purposes.
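As an illustration only, here is a minimal sketch of what such an automated restore test could look like in Python. The host name, dump path, database name and the sanity check are hypothetical, and a real job would also handle decryption, monitoring and cleanup.

    # Sketch of a weekly automated restore test (host, paths and check are hypothetical).
    import subprocess
    import sys

    STAGING_HOST = "staging-db.internal"             # hypothetical staging server
    DUMP_FILE = "/backups/dumps/maindb-latest.dump"  # hypothetical custom-format dump

    def restore_latest_dump():
        # Recreate the target database from scratch, then restore the dump into it.
        subprocess.run(["dropdb", "--if-exists", "-h", STAGING_HOST, "maindb_staging"], check=True)
        subprocess.run(["createdb", "-h", STAGING_HOST, "maindb_staging"], check=True)
        subprocess.run(["pg_restore", "-h", STAGING_HOST, "-d", "maindb_staging",
                        "--jobs", "4", DUMP_FILE], check=True)

    def sanity_check():
        # A restore test is only useful if it fails loudly when the result looks wrong.
        count = subprocess.run(
            ["psql", "-h", STAGING_HOST, "-d", "maindb_staging", "-At",
             "-c", "SELECT count(*) FROM pg_stat_user_tables"],
            check=True, capture_output=True, text=True).stdout.strip()
        if int(count) == 0:
            sys.exit("restore test failed: no user tables found")

    if __name__ == "__main__":
        restore_latest_dump()
        sanity_check()
        print("restore test OK")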

How dumps are taken at leboncoin

We have a “dump server” setup that can connect to all production databases and is authorized to read them. Daily, the server dumps all transactional databases in a scheduled fashion, encrypts them on the fly so we don’t have clear text data at rest, and the encrypted files are then stored on disk and sent to a cloud provider.

Having dumps on local disks and in the cloud is important because we have fast access to the latest dumps (cached on disk) and offsite storage of the oldest (cloud) in the very rare case of a major catastrophe like the complete destruction of the two geographically separated datacenters.

This dump server is well protected from unauthorized access. It is also capable of restoring individual database dumps anywhere in the infrastructure.
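To make the pipeline concrete, here is a minimal sketch of a dump-and-encrypt job of that kind, assuming custom-format dumps and GPG encryption. The database names, the encryption key and the paths are hypothetical, and the cloud upload is only stubbed out.

    # Sketch: dump each database, encrypt the stream on the fly, keep the file on disk.
    # Database names, GPG recipient and paths are hypothetical; the cloud upload is stubbed.
    import subprocess
    from datetime import date

    DATABASES = ["ads", "accounts"]        # hypothetical
    GPG_RECIPIENT = "backups@example.com"  # hypothetical encryption key
    DUMP_DIR = "/backups/dumps"            # hypothetical local cache directory

    def dump_encrypted(dbname):
        target = f"{DUMP_DIR}/{dbname}-{date.today():%Y%m%d}.dump.gpg"
        with open(target, "wb") as out:
            # pg_dump streams a custom-format dump to stdout...
            dump = subprocess.Popen(["pg_dump", "-Fc", dbname], stdout=subprocess.PIPE)
            # ...and gpg encrypts it in transit, so clear-text data never touches the disk.
            gpg = subprocess.Popen(["gpg", "--encrypt", "-r", GPG_RECIPIENT],
                                   stdin=dump.stdout, stdout=out)
            dump.stdout.close()
            if gpg.wait() != 0 or dump.wait() != 0:
                raise RuntimeError(f"dump of {dbname} failed")
        return target

    if __name__ == "__main__":
        for db in DATABASES:
            encrypted = dump_encrypted(db)
            # upload_to_cloud(encrypted)  # offsite copy, deliberately left as a stub
            print("dumped", encrypted)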

Logical backup retention

Dumps are human-readable text (or can easily be transformed into it); they are convenient for later auditing and can be restored even after several years into newer versions of PostgreSQL, or even into another database system with a few manual changes. So dumps are the way to go for long-term retention. In leboncoin's case, the retention period is defined by GDPR rules.

Every business situation and regulatory context is different. Some examples:

  • banking transactions may need 5 years of dump retention for transaction auditing

Discuss dump retention within your company to define your scenario. Decide your dump frequency and your storage needs (disk, tape, optical media, cloud) for the short and the long term.
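Purely as an illustration, once such a policy is agreed, it can be enforced with something as small as the sketch below, assuming one encrypted dump file per database per day in a single directory (the path and the retention period are hypothetical).

    # Sketch: delete dump files older than the agreed retention (path and period are hypothetical).
    import pathlib
    import time

    DUMP_DIR = pathlib.Path("/backups/dumps")  # hypothetical
    RETENTION_DAYS = 5 * 365                   # e.g. a 5-year regulatory retention

    def prune_old_dumps():
        cutoff = time.time() - RETENTION_DAYS * 86400
        for dump in DUMP_DIR.glob("*.dump.gpg"):
            # Uses the file's modification time; a real job would parse the date in the name.
            if dump.stat().st_mtime < cutoff:
                dump.unlink()
                print("pruned", dump)

    if __name__ == "__main__":
        prune_old_dumps()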

How physical backups are taken at leboncoin

Physical backups are made with the barman tool. Physical backups need a lot of disk space but not many CPU cycles.

To calculate the disk space needed for physical backups, we need to know:

  • the size of the database on disk

Of all the numbers above, the database size on disk is the easiest to know. The number of basebackups to keep is a decision based on factors like the estimated time to restore, which in turn depends on how much WAL is generated. Heavily modified databases (lots of write operations: INSERT, DELETE, UPDATE) generate more WAL, and PostgreSQL takes time to replay all that WAL at restore. Let's look at some examples:

  • Case 1 — A database of 100 GB that generates 10 GB of WAL per day: 10 days of WAL add up to roughly the size of the database itself, so restoring the basebackup plus 10 days of generated WAL takes approximately twice as long as restoring the basebackup alone.

With that in mind, at leboncoin, some databases have basebackups taken every day, others once a week, and we even have some very large databases with very few changes per day for which a basebackup is taken only once every 15 days.

Be aware that it's possible to basebackup more than once a day in the case of a very small database with a lot of changes. That can greatly reduce your time to restore.
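To make the trade-off concrete, here is a rough back-of-the-envelope helper; the figures are assumptions for the sake of the example, not our real numbers. It estimates backup disk usage and the worst-case volume to restore for a few basebackup intervals.

    # Back-of-the-envelope sizing for physical backups; all figures are example assumptions.
    DB_SIZE_GB = 100     # size of the database on disk
    WAL_PER_DAY_GB = 10  # WAL generated per day
    RETENTION_DAYS = 14  # how far back we keep basebackups and WAL

    def estimate(basebackup_interval_days):
        n_basebackups = max(1, RETENTION_DAYS // basebackup_interval_days)
        disk_needed_gb = n_basebackups * DB_SIZE_GB + RETENTION_DAYS * WAL_PER_DAY_GB
        # Worst case: the incident happens just before the next basebackup, so almost a
        # full interval of WAL has to be replayed on top of the basebackup.
        worst_restore_gb = DB_SIZE_GB + basebackup_interval_days * WAL_PER_DAY_GB
        print(f"basebackup every {basebackup_interval_days:>2} days: "
              f"~{disk_needed_gb} GB on the backup server, "
              f"worst case ~{worst_restore_gb} GB to restore")

    if __name__ == "__main__":
        for interval in (1, 7, 14):
            estimate(interval)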

Another interesting point about physical backups: since we have two datacenters and data is replicated from one to the other, we need backups on both to be geographically safe. Our barman servers work in pairs; each server has a peer in the other datacenter, both receive the complete WAL stream from the same origin (sometimes in a cascaded fashion), but the basebackups are taken alternately.

A minimal two barman/two database servers strategy for geographically protected production environments — each barman receives the full WAL stream, even if cascaded, but basebackups are taken in alternate schedules to avoid double pressure on production. Note that this architecture can be different depending on network capacity and other production constraints.

Since we have several database servers in production, we also have several barman servers to cope with the disk space and CPU needs: while receiving WAL, some CPU is consumed by the stream receiver (pg_receivewal under the hood for barman) and by compression. Each barman pair is shared amongst a couple of database clusters.
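As a toy illustration of that alternation (not our actual scheduler), two barman peers could simply decide whose turn it is from the ISO week number; the host names below are hypothetical.

    # Toy sketch: two barman peers alternate basebackups by ISO week parity (illustrative only).
    import datetime
    import socket

    PEERS = ("barman-dc1", "barman-dc2")  # hypothetical host names, one per datacenter

    def my_turn_this_week():
        week = datetime.date.today().isocalendar()[1]
        # Even weeks belong to the first peer, odd weeks to the second one.
        return PEERS[week % 2] == socket.gethostname()

    if __name__ == "__main__":
        if my_turn_this_week():
            print("taking the basebackup this week")
        else:
            print("the peer datacenter takes it this week; we only keep receiving WAL")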

Physical backups retention

Physical backups are useful mainly, if not only, for disaster recovery. It doesn't make sense to keep several months or years of basebackups plus the WAL stream on disk.

Dumps are better and cheaper to store for the long term as we saw above.

Physical backup retention should be discussed in terms of how to recover from a disaster as quickly as possible and how far into the past we should be able to restore in case of Point In Time Recovery (possible only with physical backups, not dumps).

Some business cases with long-term Point In Time Recovery needs do exist. Fortunately, these cases are rare. It's doable, but it needs lots of disk/tape space to keep all the basebackups and the corresponding WAL stream.

Usually, a couple of days or weeks of physical backup retention is enough. It will depend only on your business rules and disk space.

At a minimum, we'd say that two basebackups and the corresponding WAL need to be kept on disk. This way, you can restore your database in case of an incident while a new basebackup is being taken. The maximum will usually be set by your backup disk limits.

Avoid overly slow storage for physical backups, like tapes or cloud object storage: it can leave your production stopped for too long in case of an incident.

Estimating time to basebackup

With physical backups, the time to basebackup is usually a matter of hardware capacity, namely disk speed and network bandwidth; the slower of the two defines the basebackup time:

  • with disks capable of a 10 MB/s transfer rate (including controller capacity), a basebackup of a 10 GB database will take around 17 minutes to finish.
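The arithmetic above is easy to sketch: whichever of disk and network throughput is slower sets the pace.

    # Estimate basebackup duration: the slower of disk and network sets the pace.
    def basebackup_minutes(db_size_gb, disk_mb_s, net_mb_s):
        throughput = min(disk_mb_s, net_mb_s)
        return db_size_gb * 1024 / throughput / 60  # GB -> MB, then seconds -> minutes

    if __name__ == "__main__":
        # The example above: a 10 GB database at 10 MB/s is roughly 17 minutes.
        print(f"{basebackup_minutes(10, 10, 100):.0f} min")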

What if a basebackup takes too long?

It happens, deal with it.

We have some databases of several terabytes that will need a couple of days to basebackup. Just plan your disk capacity and network bandwidth for that and you should be ok. PostgreSQL keeps working normally while taking a basebackup.

If a very large database has few modifications and generates little WAL compared to its size on disk, you'll be fine taking basebackups at large intervals.

Estimating time to restore

Similar to the basebackup, the time to restore depends on your hardware capacity in the same way, but the total time is higher than for the basebackup because we also have to replay the WAL stream.

Caution: the calculation below is less accurate because WAL replay needs some processing and the writes are random; use this example only as a rough estimation.

  • a 100 GB basebackup + 50 GB worth of WAL to restore, considering 10 MB/s disk bandwidth and similar network capacity: just copying those 150 GB takes around 4 hours and 15 minutes, and replaying the WAL adds time on top of that.
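Extending the same arithmetic into a tiny helper: the WAL replay slowdown factor below is an assumption that you would have to measure on your own hardware.

    # Rough restore-time estimate: copy the basebackup, then replay the WAL.
    # The replay slowdown factor is a guess and should be benchmarked on your hardware.
    def restore_minutes(basebackup_gb, wal_gb, throughput_mb_s, wal_replay_slowdown=2.0):
        copy_min = basebackup_gb * 1024 / throughput_mb_s / 60
        replay_min = wal_gb * 1024 / throughput_mb_s / 60 * wal_replay_slowdown
        return copy_min + replay_min

    if __name__ == "__main__":
        # The example above: 100 GB basebackup + 50 GB of WAL at 10 MB/s.
        print(f"~{restore_minutes(100, 50, 10):.0f} min")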

Estimating time for pg_dump or restore from a dump

It's very difficult, almost impossible, to estimate this, because it all depends on the type of dump (compression ratio, parallelism) and the database schema (lots of indexes can be fast to dump but very slow to restore).

You have only one option: do it. Dump and test restore procedures on your real production databases. Try different configurations and benchmark for yourself. Every database is different, so you'll only know by testing all of them.
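For example, a minimal sketch of such a benchmark could time pg_dump and pg_restore at different parallelism levels; the database names and the parallelism values are placeholders to adapt to your environment.

    # Sketch: time pg_dump and pg_restore at different parallelism levels (names are examples).
    import subprocess
    import time

    DBNAME = "maindb"                  # hypothetical database to benchmark
    SCRATCH_DB = "maindb_restore_test"

    def timed(cmd):
        start = time.monotonic()
        subprocess.run(cmd, check=True)
        return time.monotonic() - start

    if __name__ == "__main__":
        for jobs in (1, 2, 4, 8):
            dump_dir = f"/tmp/{DBNAME}_j{jobs}"
            # Directory format (-Fd) is the pg_dump format that supports parallel jobs (-j).
            t_dump = timed(["pg_dump", "-Fd", "-j", str(jobs), "-f", dump_dir, DBNAME])
            subprocess.run(["dropdb", "--if-exists", SCRATCH_DB], check=True)
            subprocess.run(["createdb", SCRATCH_DB], check=True)
            t_restore = timed(["pg_restore", "-j", str(jobs), "-d", SCRATCH_DB, dump_dir])
            print(f"jobs={jobs}: dump {t_dump:.0f}s, restore {t_restore:.0f}s")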

Not finished yet — we need a third part

Why? Because we have to talk about replication. Replication has something to do with backups, because a physical backup is the start point of replication.

Ready? Stay tuned ;)
