Deploying GlusterFS

GlusterFS is a powerful network/cluster file system that runs in user space and uses FUSE to hook into the VFS layer of the operating system. GlusterFS is itself a file system, but it stores its data on tried and tested on-disk file systems such as ext3, ext4 and XFS. It can easily scale up to petabytes of storage, all of which is available to the user under a single mount point [1].

In this tutorial I'll walk you through installing and configuring a GlusterFS storage pool consisting of four servers. Each server will export one brick, and the volumes we create will consist of those four bricks.

Terms.

brick - A locally attached filesystem (e.g. xfs on top of LVM) that is part of a volume.
client - The machine which mounts the volume.
server - The machine which hosts the filesystem in which data will be stored.
volume - A network-accessible file system, made up of one or more bricks, which can be mounted using the native GlusterFS client, NFS, CIFS, etc.

Installing and starting GlusterFS.
 
1. Download the package for your distribution from http://download.gluster.org/pub/gluster/glusterfs/LATEST/
2. Install GlusterFS on all servers using the following commands:

On RHEL/CentOS:
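Assuming the downloaded RPMs are in the current directory (exact package names vary slightly between releases), something along these lines should work:

  rpm -Uvh glusterfs*.rpm

  # or, if a Gluster yum repository is configured:
  yum install glusterfs glusterfs-fuse glusterfs-server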


On Debian/Ubuntu:
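Likewise, assuming the downloaded .deb packages are in the current directory:

  dpkg -i glusterfs*.deb

  # or, from the distribution repositories:
  apt-get install glusterfs-server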


3. Start the GlusterFS daemon on all servers:
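The service name differs between distributions; on RHEL/CentOS the daemon is glusterd, while the Debian/Ubuntu package typically installs it as glusterfs-server:

  # RHEL/CentOS
  service glusterd start
  chkconfig glusterd on

  # Debian/Ubuntu
  service glusterfs-server start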


Preparing Bricks.
 
1. On each server create an LVM logical volume, format it with XFS and mount it. It's important to mention that every brick's mount point must be unique throughout the entire storage pool.
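Here is a minimal sketch for gnode1, assuming a spare disk at /dev/sdb and a 100 GB brick; adjust the device, size and names on each server (gnode2a on gnode2, and so on):

  # /dev/sdb and the 100G size are illustrative
  pvcreate /dev/sdb
  vgcreate vg_bricks /dev/sdb
  lvcreate -L 100G -n gnode1a vg_bricks
  mkfs.xfs /dev/vg_bricks/gnode1a
  mkdir -p /mnt/bricks/gnode1a
  mount /dev/vg_bricks/gnode1a /mnt/bricks/gnode1a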


Creating a trusted storage pool.
 
A trusted storage pool consists of the storage servers that will comprise the volume, in other words it is a trusted network of storage servers. When you start the first server, the storage pool consists of that server alone. To add additional storage servers to the storage pool, you can use the probe command from a storage server that is already trusted. The GlusterFS service must be running on all storage servers that you want to add to the storage pool.

In our case, to create a trusted storage pool of four servers, we add the other three servers to the pool from server gnode1:

1. Probe the servers you want to add to the storage pool.
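From gnode1, probe each of the remaining servers:

  gluster peer probe gnode2
  gluster peer probe gnode3
  gluster peer probe gnode4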


Note: Do not probe the local host itself (gnode1).

Verify the peer status from the first server using the following command:
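  gluster peer status

The output should list gnode2, gnode3 and gnode4 as connected peers.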


Now that we have a trusted storage pool consisting of four servers, let's create the volumes that we can actually use to store files later on.

Configuring GlusterFS Volumes.
 
There are three types of volumes:
Distributed - Distributes files throughout the cluster.
Replicated - Replicates data across two or more nodes in the cluster.
Striped - Stripes files across multiple nodes in the cluster.

I'll demonstrate how to setup and use all three of them in the following sections.

Configuring GlusterFS Distributed Volumes.
 
Distributed volumes distribute files throughout the cluster. You can use distributed volumes to scale storage in archival environments where short periods of downtime during disk swaps are acceptable.
Keep in mind that a disk failure in a distributed volume can result in a serious loss of data, since directory contents are spread randomly across the bricks in the cluster.

To configure a distributed volume perform the following on only one server, in this case gnode1:

1. Create the volume using the following command:
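Assuming the brick mount points created earlier, and using gvola as the volume name (the same name appears in the migration example later in this tutorial):

  # bricks follow the /mnt/bricks/<host>a naming scheme used above
  gluster volume create gvola transport tcp \
      gnode1:/mnt/bricks/gnode1a \
      gnode2:/mnt/bricks/gnode2a \
      gnode3:/mnt/bricks/gnode3a \
      gnode4:/mnt/bricks/gnode4a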


You can optionally display the volume information using the following command:
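  gluster volume info gvola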


2. Start the volume using the following command:
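  gluster volume start gvola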


Configuring GlusterFS Replicated Volumes.
 
Distributed replicated volumes replicate (mirror) data across two or more nodes in the cluster. You can use distributed replicated volumes in environments where high availability and high reliability are critical. Distributed replicated volumes also offer improved read performance in most environments.

To configure a four node replicated volume with a two-way mirror perform the following on only one server, in this case gnode1:

1. Create the volume using the following command:
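A brick can belong to only one volume, so if the "a" bricks are already in use by gvola you would create a second set of bricks for this volume; the volume name gvolr and the "r" brick paths below are purely illustrative. With four bricks and replica 2, GlusterFS creates two mirrored pairs:

  # gvolr and the "r" brick paths are illustrative
  gluster volume create gvolr replica 2 transport tcp \
      gnode1:/mnt/bricks/gnode1r \
      gnode2:/mnt/bricks/gnode2r \
      gnode3:/mnt/bricks/gnode3r \
      gnode4:/mnt/bricks/gnode4r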


2. Start the volume using the following command:
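  gluster volume start gvolr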


Configuring GlusterFS Striped Volumes.
 
Distributed striped volumes stripe data across two or more nodes in the cluster. For best results, you should use distributed striped volumes only in high concurrency environments accessing very large files.

To configure a four node striped volume perform the following on only one server, in this case gnode1:

1. Create the volume using the following command:
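Again with an illustrative volume name (gvols) and a separate set of bricks; stripe 4 spreads each file across all four bricks:

  # gvols and the "s" brick paths are illustrative
  gluster volume create gvols stripe 4 transport tcp \
      gnode1:/mnt/bricks/gnode1s \
      gnode2:/mnt/bricks/gnode2s \
      gnode3:/mnt/bricks/gnode3s \
      gnode4:/mnt/bricks/gnode4s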


2. Start the volume using the following command:
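  gluster volume start gvols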


Using the Volumes.
 
Now that we have an available volume, comprising four bricks on four servers, let's mount it on a client machine using native GlusterFS.
The Gluster Native Client is a FUSE-based client running in user space, and it is the recommended method for accessing volumes when all of the clustering features of GlusterFS have to be utilized.

On RedHat-based distributions run:
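  # package names may vary slightly between releases
  yum install glusterfs glusterfs-fuse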


On Debian-based distributions run:
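  apt-get install glusterfs-client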


Now we're ready to mount the volume:
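Using gnode1 as the mount source and /mnt/gvola as an illustrative mount point on the client:

  # /mnt/gvola is an arbitrary mount point on the client
  mkdir -p /mnt/gvola
  mount -t glusterfs gnode1:/gvola /mnt/gvola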


It's worth mentioning that any peer can be referenced as the mount source; in this case I chose gnode1.

We can also use NFS to export the volume from any server in the pool and then mount that export:
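Gluster's built-in NFS server speaks NFSv3 over TCP, so the mount options below force that; the mount point is again illustrative:

  # Gluster's NFS export is NFSv3 over TCP
  mkdir -p /mnt/gvola-nfs
  mount -t nfs -o vers=3,mountproto=tcp gnode1:/gvola /mnt/gvola-nfs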


Expanding Volumes
 
You can expand volumes, as needed, while the cluster is online and available. For example, you might want to add a brick to a distributed volume, thereby increasing the distribution and adding to the capacity of the GlusterFS volume.
Similarly, you might want to add a group of bricks to a distributed replicated volume, increasing the capacity of the GlusterFS volume.
When expanding distributed replicated and distributed striped volumes, you need to add a number of bricks that is a multiple of the replica or stripe count. For example, to expand a distributed replicated volume with a replica count of 2, you need to add bricks in multiples of 2 (such as 4, 6, 8, etc.).

To expand a volume perform the following:

1. On the first server in the cluster, probe the server to which you want to add the new brick using the following command:
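Assuming a hypothetical fifth server, gnode5, prepared with a brick in the same way as the others:

  # gnode5 is a hypothetical new server
  gluster peer probe gnode5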


2. Add the brick using the following command:
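Adding the brick on gnode5 to the distributed volume gvola (the brick path simply follows the naming scheme used earlier):

  gluster volume add-brick gvola gnode5:/mnt/bricks/gnode5a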


3. Check the volume information using the following command:
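  gluster volume info gvola

The new brick should now appear in the brick list.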


4. Re-balance the volume to ensure that all files are visible at the mount point:
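  gluster volume rebalance gvola start

  # check the progress with:
  gluster volume rebalance gvola status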


Shrinking Volumes
 
You can shrink volumes, as needed, while the cluster is online and available. For example, you might need to remove a brick that has become inaccessible in a distributed volume due to hardware or network failure.
Data residing on the brick that you are removing will no longer be accessible at the Gluster mount point. Note however that only the configuration information is removed - you can continue to access the data directly from the brick, as necessary.

To shrink a volume perform the following:

1. Remove the brick using the following command:
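Continuing the example, this removes the brick that was added on gnode5 (on newer GlusterFS releases remove-brick is a start/commit operation, or requires the force keyword; the one-shot form below matches older 3.x releases):

  gluster volume remove-brick gvola gnode5:/mnt/bricks/gnode5a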


2. Check the volume information using the following command:
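  gluster volume info gvola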


3. Re-balance the volume to ensure that all files are visible at the mount point:
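  gluster volume rebalance gvola start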


Migrating Volumes
 
You can migrate the data from one brick to another, as needed, while the cluster is online and available.

To migrate the data in gnode3:/mnt/bricks/gnode3a to gnode4:/mnt/bricks/gnode4a in the volume gvola:
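The volume and brick names come straight from the example above; the same volume/brick pair is repeated in the pause, abort, status and commit commands that follow:

  gluster volume replace-brick gvola gnode3:/mnt/bricks/gnode3a gnode4:/mnt/bricks/gnode4a start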


Note: You need to have the FUSE package installed on the server on which you are running the replace-brick command for the command to work.

To pause the data migration run:
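  gluster volume replace-brick gvola gnode3:/mnt/bricks/gnode3a gnode4:/mnt/bricks/gnode4a pause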


To abort the data migration run:
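  gluster volume replace-brick gvola gnode3:/mnt/bricks/gnode3a gnode4:/mnt/bricks/gnode4a abort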


To check the data migration status execute:
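  gluster volume replace-brick gvola gnode3:/mnt/bricks/gnode3a gnode4:/mnt/bricks/gnode4a status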


The status command shows the file currently being migrated along with the running total of files migrated. Once the migration is finished, it displays "Migration complete".

To commit the data migration:
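  gluster volume replace-brick gvola gnode3:/mnt/bricks/gnode3a gnode4:/mnt/bricks/gnode4a commit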


The commit command completes the migration of data to the new brick.

Resources:

[1] http://www.gluster.org/community/documentation/