PostgreSQL Tutorial: Using pg_rewind to re-synchronize a demoted master

July 30, 2024

Summary: In this tutorial, we shall discuss how the old master that has diverged from the replication cluster can be added back to the cluster as a standby, without rebuilding it from scratch.

Table of Contents

Getting ready

This works for a replication cluster on which a failover has been performed. Additionally, it only works when either the parameter wal_log_hints or data_checksums is enabled. Setting both these parameters may add some performance penalty, however, with wal_log_hints incurring less overhead.

wal_log_hints: When this parameter is active, the PostgreSQL server writes the entire content of each disk page to WAL during the first modification of that page after a checkpoint, even for non-critical modifications of so-called hint bits. Changing this parameter requires RESTART of the PostgreSQL cluster. The following must be performed on all the PostgreSQL servers in replication:

$ psql -c "ALTER SYSTEM SET wal_log_hints TO 'ON'"
$ pg_ctl -D $PGDATA restart

Additionally, all the WAL segments since failover must be available for the rewind to complete.

How to do it…

We will do this using the following steps:

1. Shut down PostgreSQL on the old master manually, if required:

$ pg_ctl -D $PGDATA stop

2. Run pg_rewind:

$ pg_rewind -D /var/lib/pgsql/12/data --sourceserver="host=<new_master_IP> port=5432 user=postgres dbname=postgres password=secret"
The servers diverged at WAL POSITION 0/9BC9E268 ON timeline 1.
could NOT OPEN file "/var/lib/pgsql/12/data/pg_wal/00000001000000000000009B": No such file OR directory

could NOT find previous WAL record at 0/9BC9E268
Failure, exiting

3. Here, there are only a few WAL segments left on the master, so now we need to copy some back:

$ ls -l /var/lib/pgsql/12/data/pg_wal/
-rw------- 1 postgres postgres 16777216 Apr  4 13:51 0000000100000000000000B5
-rw------- 1 postgres postgres 16777216 Apr  4 13:51 0000000100000000000000B6
-rw------- 1 postgres postgres       42 Apr  4 13:49 00000002.history
drwx------ 2 postgres postgres     4096 Apr  4 13:51 archive_status/

So, we just need to copy all the WAL segments from 9B to B5:

$ cp /backup/archive/00000001000000000000009[BCDEF] /var/lib/pgsql/12/data/pg_wal/
$ cp /backup/archive/0000000100000000000000A? /var/lib/pgsql/12/data/pg_wal/
$ cp /backup/archive/0000000100000000000000B[01234] /var/lib/pgsql/12/data/pg_wal/

4. Run pg_rewind again:

$ pg_rewind -D /var/lib/pgsql/12/data --sourceserver="host=<new_master_IP> port=5432 user=postgres dbname=postgres password=secret"
The servers diverged at WAL position 0/9BC9E268 on timeline 1.
Rewinding from last common checkpoint at 0/9B8B20D0 on timeline 1
Done!

5. Add the primary_conninfo setting and the standby.signal file:

$ echo "primary_conninfo = 'user=replicator password=secret host=<new_master_IP> port=5432 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any'" >> $PGDATA/postgresql.auto.conf
$ echo "standby_mode = 'ON'" >> $PGDATA/postgresql.auto.conf
$ touch $PGDATA/standby.signal

6. Start the standby (M1):

$ pg_ctl -D $PGDATA start

How it works…

Once a failover has been performed, the timeline on the new master changes and is not the same as the old master. At this point, we could use pg_rewind to resynchronize the old master using the new master. This way, we could add the old master as a standby to the new master without rebuilding it. In order to achieve that, we make sure that the old master is not accepting writes and is shut down upon a failover. Otherwise, we could use the command used in Step 1 to manually shut down Postgres running on the old master (M1).

Once the old master is shut down, we could use pg_rewind as seen in Step 2 to rewind the master to the point from where the replication should start from the new master. In the command, the only change required is the host, which needs to be either the IP or the hostname of the new master. Additionally, you should be modifying the user and passwords appropriately.

The time taken to perform pg_rewind varies. It may be faster or slower depending on the changes that happened upon the failover. After finishing Step 2, we should have the appropriate parameters and files in a place that says that the old master is a standby. We could refer to Step 3, which tells how the two parameters primary_conninfo and standby_mode can be set in the postgresql.auto.conf file of the old master (M1). Upon adding the parameters, we could create an empty file, standby.signal, as seen in Step 3. And then, we could simply start the PostgreSQL cluster on server M1.