Linux Administration: Building an HA cluster with Pacemaker, Corosync and DRBD

If you want to set up a highly available Linux cluster but, for some reason, do not want to use an "enterprise" solution like Red Hat Cluster, you might consider using Pacemaker, Corosync and DRBD [1], [2], [3].

Pacemaker is a cluster resource manager. It achieves maximum availability for your cluster services by detecting and recovering from node-level and resource-level failures. To do so, it relies on the messaging and membership capabilities provided by your preferred cluster infrastructure, either Corosync or Heartbeat.

For the purpose of this blog, we'll use Corosync and set up a two-node, highly available Apache web server as an Active/Passive cluster, using DRBD and Ext4 to store the data.

To install the software we'll be using a Fedora repository:

    [root@node1 ~]# sed -i.bak "s/enabled=0/enabled=1/g" /etc/yum.repos.d/fedora.repo
    [root@node1 ~]# sed -i.bak "s/enabled=0/enabled=1/g" /etc/yum.repos.d/fedora-updates.repo
    [root@node1 ~]# yum install -y pacemaker corosync

To configure Corosync, we need to choose an unused multicast address and port:

    [root@node1 ~]# export ais_port=4000
    [root@node1 ~]# export ais_mcast=226.94.1.1
    [root@node1 ~]# export ais_addr=`ip addr | grep "inet " | tail -n 1 | awk '{print $4}' | sed s/255/0/`
    [root@node1 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
    [root@node1 ~]# sed -i.bak "s/.*mcastaddr:.*/mcastaddr:\ $ais_mcast/g" /etc/corosync/corosync.conf
    [root@node1 ~]# sed -i.bak "s/.*mcastport:.*/mcastport:\ $ais_port/g" /etc/corosync/corosync.conf
    [root@node1 ~]# sed -i.bak "s/.*bindnetaddr:.*/bindnetaddr:\ $ais_addr/g" /etc/corosync/corosync.conf
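
The bindnetaddr pipeline is easy to misread, so here is a small sketch that runs the same awk and sed steps against a sample `ip addr` line and a scratch file instead of the real configuration. The sample address and the temporary file are assumptions made for illustration. Note that the shortcut only gives the correct network address when the only 255 in the broadcast address is its final octet.

```shell
#!/bin/sh
# Sample line as printed by `ip addr` (made-up address, for illustration only).
sample='    inet 192.168.122.101/24 brd 192.168.122.255 scope global eth0'

# Field 4 is the broadcast address; replacing its 255 with 0 gives the network address.
ais_addr=$(printf '%s\n' "$sample" | awk '{print $4}' | sed 's/255/0/')
echo "bindnetaddr: $ais_addr"

# Try the sed edit on a scratch file rather than /etc/corosync/corosync.conf.
conf=$(mktemp)
printf '%s\n' 'interface {' '    bindnetaddr: 192.168.1.0' '}' > "$conf"
sed -i.bak "s/.*bindnetaddr:.*/bindnetaddr: $ais_addr/" "$conf"
cat "$conf"
```

Checking the value of ais_addr this way before touching the real file avoids starting Corosync bound to the wrong network.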

We also need to tell Corosync to load the Pacemaker plugin:

    [root@node1 ~]# cat <<-END >>/etc/corosync/service.d/pcmk
    service {
            # Load the Pacemaker Cluster Resource Manager
            name: pacemaker
            ver: 1
    }
    END

At this point we need to propagate the configuration changes we made to the second node:
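
One way to do this is to copy the two files edited so far with scp. This is only a sketch: the hostname node2 is an assumption, and `run` is a hypothetical dry-run helper that prints each command and executes it only when APPLY=1 is set.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# "node2" is an assumed hostname for the second node.
for f in /etc/corosync/corosync.conf /etc/corosync/service.d/pcmk; do
    run scp "$f" "node2:$f"
done
```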

Now we can start Corosync on the first node and check /var/log/messages:
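
A minimal sketch of this step, assuming the init script is reachable through the service wrapper (on systemd-based releases the equivalent would be systemctl). `run` is a hypothetical dry-run helper; the same commands apply on the second node.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

run service corosync start

# Show the most recent Corosync messages, if the log is readable on this host.
if [ -r /var/log/messages ]; then
    grep -i corosync /var/log/messages | tail -n 20
fi
```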

If all looks good we can start Corosync on the second node as well, and check if the cluster was formed by tailing /var/log/messages.

The next step is to start Pacemaker on both nodes:
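
As a sketch, again assuming the service wrapper and using the hypothetical `run` dry-run helper, run the following on each node once Corosync is up:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Pacemaker is started after Corosync, which provides messaging and membership.
run service pacemaker start
```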

To display the cluster status run:
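
A likely candidate here is crm_mon, which ships with Pacemaker; the -1 option prints the status once and exits instead of refreshing continuously. The `run` wrapper is a hypothetical dry-run helper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# One-shot status report: nodes, quorum state and configured resources.
run crm_mon -1
```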

Now that we have a working cluster, make sure you get familiar with the main cluster administration tool, the crm shell:

Let's examine the current cluster configuration:
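
A sketch of how the configuration can be inspected, using the crm shell that appears throughout this post plus crm_verify, which checks the live configuration for errors. `run` is a hypothetical dry-run helper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Print the current cluster configuration in crm syntax.
run crm configure show

# Validate the live configuration (-L); problems such as missing STONITH
# resources are reported here.
run crm_verify -L
```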

One thing to note is that Pacemaker ships with STONITH enabled. STONITH is a common node fencing mechanism that is used to ensure data integrity by powering off (or Shooting The Other Node In The Head) a problematic node.

For the purpose of this example let's simplify things and disable STONITH, at least for now:
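
With the crm shell this is typically done by setting the stonith-enabled cluster property. The sketch below uses the hypothetical `run` dry-run helper; remember to re-enable and configure fencing before putting a cluster with shared data into production.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Turn off fencing for this walkthrough only.
run crm configure property stonith-enabled=false
```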

We should also tell the cluster to ignore loss of quorum, since a two-node cluster loses quorum as soon as one node fails:
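
A sketch of the corresponding property change, assuming the crm shell and the hypothetical `run` dry-run helper:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Keep resources running even when the surviving node no longer has quorum.
run crm configure property no-quorum-policy=ignore
```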

Now it's time to add the first shared resource - an IP address, because regardless of where the cluster service(s) are running, we need a consistent address to contact them on:

    [root@node1 ~]# crm configure primitive ClusterIP ocf:heartbeat:IPaddr2 params ip=192.168.122.101 cidr_netmask=32 op monitor interval=30s

The other important piece of information here is ocf:heartbeat:IPaddr2. This tells Pacemaker three things about the resource you want to add. The first field, ocf, is the standard to which the resource script conforms and where to find it. The second field is specific to OCF resources and tells the cluster which namespace to find the resource script in, in this case heartbeat. The last field indicates the name of the resource script.

To obtain a list of the available resource classes, run:
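
The crm shell exposes this through its ra (resource agent) level. The sketch below lists the classes and then the agents in the ocf:heartbeat namespace used above; `run` is a hypothetical dry-run helper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# List the resource classes known to the cluster.
run crm ra classes

# List the OCF agents in the heartbeat namespace (IPaddr2 should be among them).
run crm ra list ocf heartbeat
```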

Let's test this by performing a fail-over. The IP should move from the first node it's currently being hosted on to the second - passive - node.

First, let's check on which node the IP resource is currently running:
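
One way to ask the cluster directly, sketched with the crm shell and the hypothetical `run` dry-run helper:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Report where the ClusterIP resource is currently active.
run crm resource status ClusterIP
```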

On that node stop Pacemaker and Corosync, in that order:
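
A sketch of the shutdown sequence, assuming the service wrapper used on Fedora at the time and the hypothetical `run` dry-run helper:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Stop the resource manager first, then the messaging layer underneath it.
run service pacemaker stop
run service corosync stop
```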

Or put the node on stand-by:
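
With the crm shell, standby mode can be sketched as follows; run without a node name, the command applies to the node it is executed on. `run` is a hypothetical dry-run helper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Mark the local node as standby so its resources move elsewhere.
run crm node standby
```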

Check the status of the cluster and observe where the IP resource has moved:

You can also check with:

Now let's simulate node recovery by starting the services back in the following order:
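
The start order is the reverse of the stop order. A sketch, with the same assumptions as before (service wrapper, hypothetical `run` dry-run helper):

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Bring up the messaging layer first, then the resource manager on top of it.
run service corosync start
run service pacemaker start
```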

Or put the node back online:
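
The counterpart of standby in the crm shell, sketched with the hypothetical `run` dry-run helper:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Allow the local node to host resources again.
run crm node online
```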

It's time to add more services to the cluster. Let's install Apache:
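
A sketch of the installation, assuming the Fedora httpd package and yum as used earlier; run it on both nodes. `run` is a hypothetical dry-run helper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Install Apache; the cluster will start it, so it does not need to be enabled at boot.
run yum install -y httpd
```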

Create an index page on both nodes, displaying the name of the node:
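
A minimal sketch of such a page. The page text and the DOCROOT variable are assumptions; DOCROOT defaults to a scratch directory so the snippet can be tried safely, and on a real node it would be set to /var/www/html.

```shell
#!/bin/sh
# DOCROOT is a hypothetical variable; on a cluster node use DOCROOT=/var/www/html.
DOCROOT=${DOCROOT:-$(mktemp -d)}
node=$(uname -n)

# Write a page that shows which node served the request.
cat > "$DOCROOT/index.html" <<END
<html>
 <body>My Test Site - $node</body>
</html>
END

cat "$DOCROOT/index.html"
```

Because each node writes its own hostname, a browser refresh after a fail-over shows which node is serving the site.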

In order to monitor the health of your Apache instance, and recover it if it fails, the resource agent used by Pacemaker assumes the server-status URL is available.

Look for the following in /etc/httpd/conf/httpd.conf and make sure it is not disabled or commented out:

    <Location /server-status>
       SetHandler server-status
       Order deny,allow
       Deny from all
       Allow from 127.0.0.1
    </Location>

At this point Apache is ready to go; all that remains is to add it to the cluster. Let's call the resource WebSite. We need to use an OCF script called apache in the heartbeat namespace. The only required parameter is the path to the main Apache configuration file, and we'll tell the cluster to check once a minute that Apache is still running:

    [root@node1 ~]# crm configure primitive WebSite ocf:heartbeat:apache params configfile=/etc/httpd/conf/httpd.conf op monitor interval=1min
    [root@node1 ~]# crm configure show

Pacemaker will generally try to spread the configured resources across the cluster nodes. In the case with Apache we need to tell the cluster that two resources are related and need to run on the same host (or not at all). Here we instruct the cluster that WebSite can only run on the host that ClusterIP is active on:
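
In crm syntax this is a colocation constraint with an INFINITY score. The constraint name website-with-ip is an assumption, and `run` is a hypothetical dry-run helper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# WebSite must run where ClusterIP runs; if ClusterIP is not active anywhere,
# WebSite is not allowed to run at all.
run crm configure colocation website-with-ip INFINITY: WebSite ClusterIP
```

The order of the two resources matters: the first one listed is placed relative to the second.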

When Apache starts, it binds to the available IP addresses. It doesn’t know about any addresses we add afterwards, so not only do the two resources need to run on the same node, but we also need to make sure ClusterIP is already active before we start WebSite. We do this by adding an ordering constraint. We need to give it a name (choose something descriptive like apache-after-ip), indicate that it's mandatory (so that any recovery for ClusterIP will also trigger recovery of WebSite) and list the two resources in the order we need them to start:
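
A sketch of that ordering constraint in crm syntax, using the name suggested above and the hypothetical `run` dry-run helper:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Start ClusterIP first, then WebSite; "mandatory" ties their recovery together.
run crm configure order apache-after-ip mandatory: ClusterIP WebSite
```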

We can also specify a preferred location, that is, the node on which the Apache server should run (for example, because it has better hardware):
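
A sketch of a location preference. The constraint name prefer-node1, the score of 50 and the hostname node1 are all assumptions chosen for illustration; a finite score expresses a preference rather than a hard requirement. `run` is a hypothetical dry-run helper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Prefer node1 for WebSite with a score of 50 (assumed values).
run crm configure location prefer-node1 WebSite 50: node1
```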

To manually move a resource from one node to the other we need to run:
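
With the crm shell a manual move can be sketched as follows. The target hostname node2 is an assumption, and `run` is a hypothetical dry-run helper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Move WebSite (and, through the colocation, ClusterIP) to node2.
run crm resource move WebSite node2
```

The move works by adding a location constraint behind the scenes, so the resource stays put until that constraint is removed.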

And to move it back:
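
A sketch using the crm shell's unmove command, which removes the constraint created by a manual move and lets the cluster place the resource according to the remaining rules. `run` is a hypothetical dry-run helper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command; set APPLY=1 to actually run it.
run() { echo "+ $*"; [ "${APPLY:-0}" != "1" ] || "$@"; }

# Hand control of WebSite placement back to the cluster.
run crm resource unmove WebSite
```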

Configuring DRBD as a cluster resource.

Think of DRBD as network-based RAID-1. Instead of manually syncing data between nodes, we can use block-level replication to do it for us.

For more information on how to set up DRBD, refer to [3] and [4].

Run the cluster configuration utility:

Next we must create a working copy of the current configuration. This is where all our changes will go. The cluster will not see any of them until we say it's OK.

Now let's create the DRBD clone, display the revised configuration and commit the changes:

    crm(drbd)# configure primitive WebData ocf:linbit:drbd params drbd_resource=wwwdata op monitor interval=60s
    crm(drbd)# configure ms WebDataClone WebData meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
    crm(drbd)# configure show
    crm(drbd)# cib commit drbd

Now that DRBD is functioning we can configure a Filesystem resource to use it. In addition to the filesystem’s definition, we also need to tell the cluster where it can be located (only on the DRBD Primary) and when it is allowed to start (after the Primary was promoted):

    [root@node1 ~]# crm
    crm(live)# cib new fs
    crm(fs)# configure primitive WebFS ocf:heartbeat:Filesystem params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="ext4"
    crm(fs)# configure colocation fs_on_drbd inf: WebFS WebDataClone:Master
    crm(fs)# configure order WebFS-after-WebData inf: WebDataClone:promote WebFS:start

We also need to tell the cluster that Apache must run on the same machine as the filesystem and that the filesystem must be active before Apache can start. Then we commit the changes:

    crm(fs)# configure colocation WebSite-with-WebFS inf: WebSite WebFS
    crm(fs)# configure order WebSite-after-WebFS inf: WebFS WebSite
    crm(fs)# cib commit fs

Now we have a fully functional two-node HA solution for Apache!

You can easily set up HA MySQL or NFS using the same method.

For more detailed information please read the main tutorial at [5].

Resources:

[1] http://www.clusterlabs.org/

[2] http://www.corosync.org/

[3] http://www.drbd.org/

[4] http://kaivanov.blogspot.com/2012/01/deploying-drbd-on-linux.html

[5] http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/index.html
