PostgreSQL Tutorial: Restoring a backup taken using pg_basebackup

October 16, 2023

Summary: In the backing up a database cluster tutorial, we discussed how pg_basebackup can be used to take a physical backup of the PostgreSQL cluster. We may restore this backup in various situations, such as creating a standby server in replication or recovering a database to a point in time. In this tutorial, we shall see the steps required to restore a backup taken using pg_basebackup.

Getting ready

In order to restore a backup taken using pg_basebackup, there should be a new directory that has available storage equivalent to the original data directory. Additionally, the backup should be copied over the network, if required, to the target server where it needs to be restored.

The backup needs to be extracted to the targeted data directory as a Postgres user. So, it is important to have access to the Postgres OS user to perform the restore of the database cluster.

How to do it

The following are the steps involved in restoring a backup to the /pgdata directory and starting PostgreSQL using the restored backup:

1. Create a directory to which the backup needs to be restored. This directory needs to be empty before we proceed to step 2:

$ sudo mkdir -p /pgdata

2. Extract the base.tar.gz file to the target directory:

$ tar xzf /backup_location/base.tar.gz -C /pgdata

3. If the database server contains one or more tablespaces, then the individual tablespaces should also be extracted to different directories. As is visible in the following log, we see a tar file (named with the oid of the tablespace) for each tablespace when a backup is completed:

$ ls -l /backup_dir/20201027/
total 16816
-rw-------. 1 postgres postgres 1006685 Oct 27 11:48 16575.tar.gz
-rw-------. 1 postgres postgres 1006657 Oct 27 11:48 16576.tar.gz
-rw-------. 1 postgres postgres 15183012 Oct 27 11:48 base.tar.gz
-rw-------. 1 postgres postgres 17094 Oct 27 11:48 pg_wal.tar.gz

A tablespace_map file exists in the data directory that is extracted using base.tar.gz after step 2:

$ cat /pgdata/tablespace_map
16575 /data_tblspc
16576 /index_tblspc

So, each of these tablespaces must be extracted to the directories mentioned in the tablespace_map file:

$ tar xzf 16575.tar.gz -C /data_tblspc
$ tar xzf 16576.tar.gz -C /index_tblspc

4. Extract the WAL segments generated during the backup to the pg_wal directory:

$ tar xzf pg_wal.tar.gz -C /pgdata/pg_wal

5. Start PostgreSQL using the data directory restored using the backup:

$ pg_ctl -D /pgdata start

How it works

In Step 1, we are creating a directory that needs to be considered a data directory or a target directory for the restore task.

And then proceed to Step 2 by extracting the base.tar.gz file to the target directory.

As seen in step 3, we use the tablespace mapping to extract the tablespace contents to the same locations. If the locations need to be modified, it can be done by just modifying the tablespace_map file manually and adding the new locations for each tablespace.

And we then proceed to step 4 to extract the WAL segments to the pg_wal directory of the target directory. Once all four steps are successfully completed, we can proceed to start the database cluster using the backup as seen in step 5.