PostgreSQL Tutorial: High Availability with Pacemaker and Corosync

November 5, 2024

Summary: In this tutorial, you will go through the process of setting up a high-availability cluster using Pacemaker and Corosync on Linux, ensuring that your services remain available with minimal downtime.

Understanding high availability clustering

High availability is crucial for critical systems that should remain accessible even in the face of hardware or software failures. Pacemaker and Corosync are open-source tools that allow you to create a high-availability cluster on your Linux servers.

Before diving into the configuration, let’s briefly understand what high availability clustering is:

High availability clustering involves grouping multiple servers (nodes) together to provide redundancy for critical services. If one node fails, another takes over seamlessly, ensuring continuous service availability.

Setting up Pacemaker and Corosync

Step 1: Install Pacemaker and Corosync. On each node, install the Pacemaker and Corosync packages, along with crmsh, which provides the crm command used in later steps:

sudo apt update
sudo apt install pacemaker corosync crmsh

Step 2: Configure Corosync. Edit the Corosync configuration file on each node:

sudo nano /etc/corosync/corosync.conf

Here’s a basic configuration example for a two-node cluster:

totem {
    version: 2
    secauth: off
    cluster_name: my_cluster
    transport: udpu
}

nodelist {
    node {
        ring0_addr: node1_IP
        nodeid: 1
    }
    node {
        ring0_addr: node2_IP
        nodeid: 2
    }
}
quorum {
    provider: corosync_votequorum
    # Two-node mode: the surviving node keeps quorum when its peer fails
    two_node: 1
}

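Replace node1_IP and node2_IP with the actual IP addresses of your nodes. Note that secauth: off disables authentication and encryption of cluster traffic, so use this setting only on a trusted network.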
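Typing the node entries by hand on every node invites copy errors. As a minimal sketch, the helper below prints a nodelist section for any number of addresses, numbering the nodes in argument order. The function name render_nodelist and the example addresses are hypothetical and not part of Corosync; review the output before pasting it into corosync.conf.

```shell
#!/bin/bash
# Hypothetical helper: print a corosync nodelist section for the given
# addresses, assigning nodeid 1, 2, ... in argument order.
render_nodelist() {
    local id=1 addr
    echo "nodelist {"
    for addr in "$@"; do
        printf '    node {\n        ring0_addr: %s\n        nodeid: %d\n    }\n' "$addr" "$id"
        id=$((id + 1))
    done
    echo "}"
}

# Example (addresses are placeholders):
#   render_nodelist 192.0.2.11 192.0.2.12
```

Because the node IDs come from the argument order, pass the addresses in the same order on every node so that the generated sections are identical.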

Step 3: Start Corosync. Start the Corosync service on each node:

sudo systemctl start corosync

Step 4: Enable Corosync at Boot. Ensure Corosync starts automatically at boot:

sudo systemctl enable corosync

Configuring Pacemaker

Step 5: Start Pacemaker. Start the Pacemaker service on each node:

sudo systemctl start pacemaker

Step 6: Enable Pacemaker at Boot. Enable Pacemaker to start automatically at boot:

sudo systemctl enable pacemaker
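Before moving on, it can help to confirm that both services are running and enabled on each node. The following is a small sketch that relies only on systemctl is-active and systemctl is-enabled; the function name check_cluster_services is an assumption made for illustration.

```shell
#!/bin/bash
# Report whether each cluster service is running and enabled at boot.
# Returns non-zero if any service fails either check.
check_cluster_services() {
    local svc rc=0
    for svc in corosync pacemaker; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            rc=1
        fi
        if systemctl is-enabled --quiet "$svc"; then
            echo "$svc: enabled at boot"
        else
            echo "$svc: NOT enabled at boot"
            rc=1
        fi
    done
    return "$rc"
}

# Example:
#   check_cluster_services || echo "fix the services listed above"
```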

Creating a virtual IP resource

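Step 7: Create a Resource Agent. Pacemaker manages resources through resource agents. The script below only illustrates what a virtual IP (VIP) agent does when it brings an address up; it is not an OCF resource agent, and the resource created in Step 8 uses the standard ocf:heartbeat:IPaddr2 agent instead. To try it, create a file such as vip.sh: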

sudo nano /usr/local/bin/vip.sh

Add the following content:

#!/bin/bash
# Illustration only: add the address passed as $1 to eth0 with a /24 netmask
ip addr add "$1/24" dev eth0 label eth0:0

Then make the script executable:

sudo chmod +x /usr/local/bin/vip.sh

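Step 8: Create a Resource. Now, create a Pacemaker resource for the VIP. Pacemaker enables fencing (STONITH) by default and does not start resources until a fencing device is configured or the stonith-enabled property is set to false, which is acceptable only on a test cluster. On one of the nodes, run: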

sudo crm configure primitive vip ocf:heartbeat:IPaddr2 params ip="VIP_IP" nic="eth0" cidr_netmask="24" op monitor interval="10s"

Replace VIP_IP with the virtual IP address you want to use.
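Once the resource has started, the address should appear on exactly one node. As a rough sketch, the filter below reads the one-line-per-address output of ip -o -4 addr show and reports whether a given address is present. The function name has_address is hypothetical, and VIP_IP remains a placeholder.

```shell
#!/bin/bash
# Succeed if the IPv4 address in $1 is assigned to a local interface.
# Reads `ip -o -4 addr show` style lines on stdin, where the third field
# is "inet" and the fourth field is ADDRESS/PREFIX.
has_address() {
    awk -v want="$1" '
        $3 == "inet" { split($4, a, "/"); if (a[1] == want) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example (run on each node; VIP_IP is a placeholder):
#   ip -o -4 addr show | has_address VIP_IP && echo "VIP is up on this node"
```

The comparison is an exact match on the address part, so 192.0.2.10 does not match a line that carries 192.0.2.100.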

Step 9: Create a Resource Group. Create a resource group that includes the VIP resource:

sudo crm configure group vip_group vip

Testing failover

Step 10: Simulate Node Failure. To test the cluster, simulate a node failure by stopping the Corosync service on the node that currently holds the VIP:

sudo systemctl stop corosync

Check the status of the cluster on the remaining node:

sudo crm status

You should see that the VIP has moved to the surviving node.
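Failover is not instantaneous, so a single check right after stopping the service can be misleading. One way to measure the gap from a client machine is to retry a probe until it succeeds. The sketch below is a generic retry loop; the name wait_for and the ping probe in the example are illustrative choices, not part of Pacemaker.

```shell
#!/bin/bash
# Retry a command once per second until it succeeds or TIMEOUT seconds
# have elapsed. Usage: wait_for TIMEOUT COMMAND [ARGS...]
wait_for() {
    local timeout="$1" waited=0
    shift
    until "$@" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example (VIP_IP is a placeholder):
#   wait_for 30 ping -c 1 -W 1 VIP_IP && echo "VIP reachable again"
```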

PostgreSQL configuration

Before adding a Pacemaker pgsql resource to manage the PostgreSQL service, you need to install the PostgreSQL package and initialize the PostgreSQL database on each node.

After installation, configure the address that PostgreSQL listens on. Set listen_addresses to * so that PostgreSQL binds to the wildcard address and accepts connections on every address assigned to the node, including addresses added after the server has started. This allows PostgreSQL to accept connections on the VIP as soon as it moves to the node during a failover. A change to listen_addresses takes effect only after PostgreSQL is restarted.

echo "listen_addresses = '*'" >> /db/pgsql/data/postgresql.conf
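Appending with >> works because PostgreSQL uses the last value when a parameter appears more than once in the file, but running the command repeatedly leaves duplicate lines behind. As a sketch of an idempotent alternative, the hypothetical function set_listen_all below removes any active listen_addresses line before appending the new one; commented-out lines are left untouched.

```shell
#!/bin/bash
# Set listen_addresses = '*' in the given postgresql.conf exactly once.
# Usage: set_listen_all /path/to/postgresql.conf
set_listen_all() {
    local conf="$1" tmp
    tmp="$(mktemp)" || return 1
    # Keep every line except active (uncommented) listen_addresses settings.
    grep -Ev "^[[:space:]]*listen_addresses[[:space:]]*=" "$conf" > "$tmp" || true
    # Append the single setting we want, then write back in place so the
    # file keeps its owner and permissions.
    echo "listen_addresses = '*'" >> "$tmp"
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example, using the path from this tutorial:
#   set_listen_all /db/pgsql/data/postgresql.conf
```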

Additional configuration

To configure more resources, fencing, or complex constraints, refer to the Pacemaker documentation and tutorials. Pacemaker and Corosync offer a wide range of features for building highly available systems.

Congratulations! You have set up a basic high-availability cluster on Linux using Pacemaker and Corosync. The VIP now moves to the surviving node when the node that holds it fails, which keeps downtime for clients of your critical applications to a minimum.