Deploying Highly Available NFS Server with DRBD and Heartbeat on Debian

Here's a quick and dirty way of making NFS highly available, using DRBD for block-level replication and Heartbeat as the cluster messaging layer. As a side note, for larger setups I would recommend Pacemaker and Corosync instead of Heartbeat, but for a simple two-node NFS cluster this is more than sufficient.

First, let's install NFS, DRBD and Heartbeat. For this example I'll be using Debian Squeeze and DRBD version 8.3.

On both servers run:
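Something like the following should pull in everything needed from the stock Squeeze repositories:

```shell
apt-get update
apt-get install nfs-kernel-server drbd8-utils heartbeat
```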

I'll be using LVM with the following layout:
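A minimal sketch of such a layout; the physical volume, volume group name, LV names and sizes (sdb, vg0, nfs, nfs-meta) are placeholders, so adjust them to your disks:

```shell
pvcreate /dev/sdb
vgcreate vg0 /dev/sdb
# LV that will hold the exported data
lvcreate -n nfs -L 100G vg0
# small LV for the external DRBD metadata
lvcreate -n nfs-meta -L 1G vg0
```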

Make sure that your /etc/hosts contains both servers' hostnames:
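For example (the node IPs here are assumptions; only the VIP 10.13.238.50 is used later):

```
10.13.238.51  server1
10.13.238.52  server2
```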

I'll be exporting /export:
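A minimal /etc/exports entry could look like this; the client subnet is an assumption:

```
/export  10.13.238.0/24(rw,sync,no_subtree_check)
```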

The DRBD config file should look similar to the following (this config works with the drbd8-utils 2:8.3.7-2.1 package that ships with Squeeze):
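A sketch of /etc/drbd.conf in the DRBD 8.3 syntax; the disk and meta-disk paths and the node IPs are placeholders that assume the hypothetical LVM layout above:

```
resource r0 {
  protocol C;

  on server1 {
    device    /dev/drbd0;
    disk      /dev/vg0/nfs;
    address   10.13.238.51:7788;
    meta-disk /dev/vg0/nfs-meta[0];
  }

  on server2 {
    device    /dev/drbd0;
    disk      /dev/vg0/nfs;
    address   10.13.238.52:7788;
    meta-disk /dev/vg0/nfs-meta[0];
  }
}
```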

All the config file does is define a resource named r0 listing the two nodes that will participate in the cluster, server1 and server2, along with their IPs and the underlying block devices for the data and the metadata.

Initially we need to load the drbd kernel module (after that the drbd init script will do this), then initialize the metadata storage, attach a local backing block device to the DRBD resource's device, and set up the network configuration on both servers:
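The steps above map to the following commands, run on both servers:

```shell
# load the kernel module (the drbd init script does this on boot)
modprobe drbd
# initialize the metadata storage
drbdadm create-md r0
# attach the backing device and set up the network configuration
drbdadm attach r0
drbdadm connect r0
# (drbdadm up r0 is shorthand for attach + connect)
```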

One of the two servers is going to be the primary, in the sense that the DRBD block device will be mounted and used by NFS there; if it fails, the second server will take over.

To promote the resource's device to the primary role (you need to do this before any access to the device, such as creating or mounting a file system) and kick off the initial synchronization, run the following only on the primary server:
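On the primary only:

```shell
drbdadm -- --overwrite-data-of-peer primary r0
```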

To check the progress of the block-level sync you can monitor /proc/drbd.
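For example:

```shell
watch -n1 cat /proc/drbd
```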

At this point you can create a file system on /dev/drbd0, mount it, and copy your data over.
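For example, using ext4 (the choice of file system is an assumption; anything Squeeze supports will do):

```shell
mkfs.ext4 /dev/drbd0
mkdir -p /export
mount /dev/drbd0 /export
```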

To make sure that NFS has all of its metadata (file locks, etc.) shared as well, do the following on the first server:
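One common approach is to move the NFS state directory onto the replicated volume and symlink it back; a sketch, assuming /export is the DRBD mount:

```shell
/etc/init.d/nfs-kernel-server stop
mv /var/lib/nfs /export/
ln -s /export/nfs /var/lib/nfs
```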

And on the second server:
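Since the state directory already lives on the replicated volume, the second server only needs the matching symlink; a sketch:

```shell
/etc/init.d/nfs-kernel-server stop
rm -rf /var/lib/nfs
ln -s /export/nfs /var/lib/nfs
```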

Time to install and configure heartbeat on both servers:
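A minimal Heartbeat v1-style configuration sketch; the broadcast interface, node names, and shared secret are assumptions. The drbddisk and Filesystem resource scripts ship with drbd8-utils and heartbeat respectively, and the start order in haresources (DRBD primary, then mount, then VIP, then NFS) matters:

```
# /etc/ha.d/ha.cf (identical on both nodes)
logfacility local0
keepalive 2
deadtime 10
bcast eth0
auto_failback off
node server1
node server2

# /etc/ha.d/haresources (identical on both nodes)
server1 drbddisk::r0 Filesystem::/dev/drbd0::/export::ext4 IPaddr::10.13.238.50/24/eth0 nfs-kernel-server

# /etc/ha.d/authkeys (must be mode 600)
auth 1
1 sha1 SomeSharedSecret
```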

10.13.238.50 is the VIP that will be moved between the two servers in the event of a failure. The rest of the options are pretty much self-explanatory.

To start DRBD and Heartbeat, run:
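On both servers:

```shell
/etc/init.d/drbd start
/etc/init.d/heartbeat start
```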

At this point you should be able to mount the NFS export on your client using the VIP.
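For example, from a client (the mount point is an assumption):

```shell
mount -t nfs 10.13.238.50:/export /mnt
```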

To simulate a failure, just stop Heartbeat on the primary server and watch the IP address transition to the second server, with the DRBD device unmounted from the first and mounted on the second.
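A sketch of the test:

```shell
# on the primary
/etc/init.d/heartbeat stop

# on the secondary, the VIP and the mount should appear
ip addr show eth0
cat /proc/drbd
mount | grep /export
```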

4 comments:

  1. Nice write-up. Did you try to compare the performance of such NFS over DRBD/HA with a simple two-node GlusterFS replica?

    1. I actually ended up deploying GlusterFS instead of NFS with DRBD and that worked even better. No more stale mount issues, or the need to use three different pieces of software.

  2. Did you end up using heartbeat with glusterfs? It looks like you have quite a bit of keepalived experience.

    I started researching using keepalived with drbd in order to provide shared storage for KVM disks, but it looks like I either need to use drbd with heartbeat or use glusterfs somehow.

    Thanks for the write up!

    1. I've been using GlusterFS with keepalived to manage the VIP in production without any issues. It's easy to set up and so far it's been working fine.
