November 5, 2024
Summary: In this tutorial, you will set up a high-availability cluster with Pacemaker and Corosync on Linux so that your services stay available with minimal downtime.
Understanding high availability clustering
High availability is crucial for critical systems that should remain accessible even in the face of hardware or software failures. Pacemaker and Corosync are open-source tools that allow you to create a high-availability cluster on your Linux servers.
Before diving into the configuration, let’s briefly understand what high availability clustering is:
High availability clustering involves grouping multiple servers (nodes) together to provide redundancy for critical services. If one node fails, another takes over seamlessly, ensuring continuous service availability.
Setting up Pacemaker and Corosync
Step 1: Install Pacemaker and Corosync. On each node, install the Pacemaker and Corosync packages, along with crmsh, which provides the crm command used in the later steps:
sudo apt update
sudo apt install pacemaker corosync crmsh
Step 2: Configure Corosync. Edit the Corosync configuration file on each node:
sudo nano /etc/corosync/corosync.conf
Here’s a basic configuration example for a two-node cluster:
totem {
    version: 2
    secauth: off
    cluster_name: my_cluster
    transport: udpu
}
nodelist {
    node {
        ring0_addr: node1_IP
        nodeid: 1
    }
    node {
        ring0_addr: node2_IP
        nodeid: 2
    }
}
quorum {
    provider: corosync_votequorum
    # Lets a two-node cluster keep quorum when one node is lost
    two_node: 1
}
Replace node1_IP and node2_IP with the actual IP addresses of your nodes.
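Because the same file has to exist on both nodes, it can help to generate it rather than edit it by hand twice. The following is a minimal sketch, not an official tool: the function name render_corosync_conf is hypothetical, the addresses in the usage note are documentation placeholders, and the two_node: 1 line is this sketch's own choice for a two-node cluster.

```shell
# Hypothetical helper: print a two-node corosync.conf for the two
# addresses given as arguments. It only writes to stdout.
render_corosync_conf() {
  if [ "$#" -ne 2 ]; then
    echo "usage: render_corosync_conf NODE1_IP NODE2_IP" >&2
    return 1
  fi
  local node1_ip="$1" node2_ip="$2"
  cat <<EOF
totem {
    version: 2
    secauth: off
    cluster_name: my_cluster
    transport: udpu
}
nodelist {
    node {
        ring0_addr: ${node1_ip}
        nodeid: 1
    }
    node {
        ring0_addr: ${node2_ip}
        nodeid: 2
    }
}
quorum {
    provider: corosync_votequorum
    # Assumption of this sketch: votequorum's two-node mode
    two_node: 1
}
EOF
}
```

For example, render_corosync_conf 192.0.2.11 192.0.2.12 | sudo tee /etc/corosync/corosync.conf writes the file on the current node; run the same command on the other node so both files match.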
Step 3: Start Corosync. Start the Corosync service on each node:
sudo systemctl start corosync
Step 4: Enable Corosync at Boot. Ensure Corosync starts automatically at boot:
sudo systemctl enable corosync
Configuring Pacemaker
Step 5: Start Pacemaker. Start the Pacemaker service on each node:
sudo systemctl start pacemaker
Step 6: Enable Pacemaker at Boot. Enable Pacemaker to start automatically at boot:
sudo systemctl enable pacemaker
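Before moving on, it is worth confirming that both services are actually running on each node. Here is a small sketch of such a check; the function name report_cluster_services is hypothetical, and it assumes the systemd unit names corosync and pacemaker used in the steps above.

```shell
# Hypothetical helper: print whether each cluster service is active.
# Returns non-zero if at least one of them is not running.
report_cluster_services() {
  local svc rc=0
  for svc in corosync pacemaker; do
    # `systemctl is-active --quiet` exits 0 only when the unit is active
    if systemctl is-active --quiet "$svc"; then
      echo "$svc: active"
    else
      echo "$svc: NOT active"
      rc=1
    fi
  done
  return "$rc"
}
```

Calling report_cluster_services on each node gives a quick yes/no answer; if a service is reported as not active, sudo systemctl status for that service shows why.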
Creating a virtual IP resource
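Step 7: Create a Resource Agent. Pacemaker manages resources through resource agents, which are scripts that start, stop, and monitor a service. The script below only shows what bringing up a virtual IP (VIP) address involves; it is not a full OCF resource agent, and the resource created in Step 8 uses the packaged ocf:heartbeat:IPaddr2 agent instead. To try it, create a file like vip.sh: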
sudo nano /usr/local/bin/vip.sh
Add the following content:
#!/bin/bash
# Add the address passed as $1 as an alias on eth0 (requires root)
/sbin/ip addr add "$1/24" dev eth0 label eth0:0
Then make the script executable:
sudo chmod +x /usr/local/bin/vip.sh
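Step 8: Create a Resource. Now create a Pacemaker resource for the VIP. Pacemaker does not start resources while fencing (STONITH) is enabled and no fencing device is configured, so on a test cluster without fencing devices, first disable it with sudo crm configure property stonith-enabled=false. Then, on one of the nodes, run: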
sudo crm configure primitive vip ocf:heartbeat:IPaddr2 params ip="VIP_IP" nic="eth0" cidr_netmask="24" op monitor interval="10s"
Replace VIP_IP with the virtual IP address you want to use.
Step 9: Create a Resource Group. Create a resource group that includes the VIP resource:
sudo crm configure group vip_group vip
Testing failover
Step 10: Simulate Node Failure. To test the cluster, simulate a node failure by stopping the Corosync service on one of the nodes:
sudo systemctl stop corosync
Check the status of the cluster on the remaining node:
sudo crm status
You should see that the VIP has moved to the surviving node.
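Besides reading the crm status output, you can check directly on a node whether the address is configured there. The following is a rough sketch: the function name has_vip is hypothetical, and it assumes the eth0 interface used when the resource was created.

```shell
# Hypothetical helper: succeed if the given IPv4 address is currently
# configured on the interface (default eth0, as in the resource above).
has_vip() {
  local vip="$1"
  local nic="${2:-eth0}"
  # `ip -o -4 addr show` prints one line per IPv4 address, containing
  # "inet ADDRESS/PREFIX". The trailing slash in the search string keeps
  # 192.0.2.10 from matching 192.0.2.100.
  ip -o -4 addr show dev "$nic" | grep -Fq "inet ${vip}/"
}
```

On the surviving node, has_vip VIP_IP && echo "VIP is here" should print the message once failover has completed, with VIP_IP replaced by your virtual IP address.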
PostgreSQL configuration
Before adding a Pacemaker pgsql resource to manage the PostgreSQL service, install the PostgreSQL package and initialize the PostgreSQL database on each node.
After installation, configure the address PostgreSQL listens on. Set listen_addresses to * so that PostgreSQL binds to the wildcard address and accepts connections on any address assigned to the node, including addresses added after the server has started. This is what allows PostgreSQL to answer on the VIP when it moves to the node during failover. Changing listen_addresses requires a PostgreSQL restart to take effect. Adjust the path below if your data directory is elsewhere:
echo "listen_addresses = '*'" >> /db/pgsql/data/postgresql.conf
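One thing to watch for: the echo command above appends a new line every time it runs. If you script this step, a variant that is safe to run repeatedly may be preferable. Here is a minimal sketch, assuming GNU sed (the default on Linux); the function name ensure_listen_all is hypothetical.

```shell
# Hypothetical helper: make sure the given postgresql.conf has exactly
# one active listen_addresses = '*' setting, however often it is run.
ensure_listen_all() {
  local conf="$1"
  if grep -Eq "^[[:space:]]*listen_addresses[[:space:]]*=" "$conf"; then
    # An active (uncommented) setting exists: rewrite it in place
    sed -i -E "s|^[[:space:]]*listen_addresses[[:space:]]*=.*|listen_addresses = '*'|" "$conf"
  else
    # No active setting (commented-out defaults are ignored): append one
    echo "listen_addresses = '*'" >> "$conf"
  fi
}
```

With the path used above, this would be called as ensure_listen_all /db/pgsql/data/postgresql.conf, run as a user that can write to the file.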
Additional configuration
To configure more resources, fencing, or complex constraints, refer to the Pacemaker documentation and tutorials. Pacemaker and Corosync offer a wide range of features for building highly available systems.
Congratulations! You have set up a high-availability cluster on Linux using Pacemaker and Corosync. The virtual IP now fails over to the surviving node when a node goes down, so services reached through it stay available with minimal downtime.