Backing up your Postgres Database

Gary Sieling
Linode Cube
Published in
4 min readAug 10, 2017

If you run a software application, backing up your database is crucial. Postgres provides several well-documented backup methods, supporting a variety of use cases. In this essay I discuss how and when to use each, including caveats to consider from my experience administering Postgres.

If you use Linode, you could conceivably rely on virtual machine backups for a simple, hassle-free option. However, if your business depends on the database, you should use one of the Postgres options so that the backups are stored in a separate data center.

Postgres backups are lightweight, and can be used to provision new machines — building a new database server is a good way to test the backup process. This will also give you a way to fail over to another data center in the event of an outage in the home data-center, and it will allow you to use your production data for testing purposes.

The simplest way to clone a Postgres database is to back up the filesystem. While easy, you must first shut down the database to run the backup, so it is primarily useful for transferring ownership of an application or moving a database to a larger virtual machine.

tar -cf backup.tar /usr/local/pgsql/data

While the database is in use, data will be continuously written to the file system. For a running system, you must use a backup method that handles database transactions. Postgres supports two options — you can choose to get data as it was at the start of the backup, or to continuously copy new transactions to your backup as they occur.

To use the first option, use the pg_dump utility, which creates a SQL file that reconstructs your database (the second option is fairly complex, and beyond the scope of this article)

PGUSER=postgres PGPASSWORD=postgres pg_dump > $(echo backup-`date +%Y-%m-%d-%H-%M`.sql)

A nice feature of this technique is that it produces searchable text files, which can be great for developers who want to audit the database contents.

To restore this backup, do this:

PGUSER=postgres PGPASSWORD=postgres psql < backup-2017–07–24–01–17.sql

I’ve include a timestamp in the backup file name, so that you’ll know which file is most recent when it comes time to do a restore.

If your database is large, you may wish to run the restore process in parallel, to save some time. This will switch the file format to a binary format, which prevents editing and searching the backup files.

PGUSER=postgres PGPASSWORD=postgres pg_dump -j 4 -Fd -f db-`date +%Y-%m-%d-%H-%M`

To restore this backup, do this:

PGUSER=postgres PGPASSWORD=postgres pg_restore db-2017–07–24–01–22

Note that for performance, all of these commands will run in the shortest time if you connect directly to the Postgres server, or if that isn’t possible, from another machine in the same datacenter. Otherwise, you will spend significant time waiting for data to transfer over a network.

As a final option, if you are doing database backups to support an automated test suite, there is a fantastic Python utility called stellar. This is not a backup solution at all — it simply makes a snapshot of your database within Postgres, so that you can revert instantly (internally, it does an ‘ALTER DATABASE RENAME…’ which is nearly instantaneous).

Once you choose a backup method, you’ll want to consider backup frequency, testing the data restoration, and storing backups securely. If you are using Postgres to support a web-based tool with built-in database backup functionality, you should be careful about where these are stored — often web tools store these files where they can be downloaded from the web.

If you run a database, it’s important to regularly test that your backups work — without this, the entire effort is wasted. Once you set this up for the first time, you should create a calendar reminder to test your backups. Depending on how much data you can afford to lose, you may wish to test quarterly, weekly, or even an automated process that runs nightly.

Please feel free to share below any comments, questions or insights about your experience with backing up a Postgres database. And if you found this blog useful, consider sharing it through social media.

Gary Sieling is a Software Architect in Philadelphia, Pa. He runs a search engine for conference talks and historic speeches, called https://www.findlectures.com. You can find him on Twitter at Gary Sieling.

--

--