Building a Load Balancer with LVS - Linux Virtual Server

In previous blogs I spent some time setting up load balancers using HAProxy, Pound and Nginx. What they all have in common is that they act as a Layer 7 reverse proxy. Their common disadvantage is that they are not very efficient at distributing Layer 4 traffic: they suffer from lots of context switching between user space and kernel space, which introduces delays, especially under heavy traffic with many short-lived connections.
A better solution that runs entirely in kernel space is LVS [1]. Linux Virtual Server has been around since 1998; the code is very mature and stable and has been compiled into the kernel since the 2.4.23 branch.
Layer 4 switching determines the path of packets based on information available at layer 4 of the OSI seven-layer protocol stack. This means that the IP addresses and ports are available, as is the underlying protocol, TCP or UDP.
There are five forwarding types in LVS - LVS-NAT, LVS-DR, LVS-Tun, LVS-FullNAT and LVS-SYNPROXY:
  • LVS-NAT, as the name implies, uses NAT from the Load Balancer (or the Director in LVS speak) to the back-end servers (or the Real Servers). The Director uses the ability of the Linux kernel to change network IP addresses and ports as packets pass through the kernel. There used to be a significant overhead when using this method, but not anymore. I'll demonstrate how to set this up later in this article.
  • LVS-DR stands for direct routing. The Director forwards all incoming requests to the nodes inside the cluster, but the nodes send their replies directly back to the client computers.
  • LVS-Tun uses IPIP tunneling. IP tunneling can be used to forward packets from one subnet or virtual LAN to another subnet or VLAN, even when the packets must pass through another network or the Internet. Building on the IP tunneling capability that is part of the Linux kernel, the LVS-Tun forwarding method allows you to place cluster nodes on a cluster network that is not on the same network segment as the Director.
  • LVS-FullNAT is a relatively new module that introduces a local IP address (an IDC-internal address, lip). IPVS translates cip-vip to/from lip-rip, where lip and rip are both IDC-internal addresses, so that the LVS load balancer and the real servers can be in different VLANs and the real servers only need access to the internal network.
  • LVS-SYNPROXY is based on TCP SYN cookies.
Please note that FullNAT and SYNPROXY have limited testing at the time of writing this article.

Now that we have the basics covered, let's create a load balancer that listens on port 80 and distributes TCP connections in a round-robin fashion to two back-end nodes using NAT.

First let's install the user-space tools used to manage LVS:
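The user-space tool is ipvsadm. On the Red Hat style system used throughout this post, installing it comes down to:

[root@host1 ~]# yum install -y ipvsadm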
Then let's describe the topology:
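With ipvsadm this takes three commands. The virtual IP 192.168.122.53 is the one used throughout this article; the two real server addresses, 192.168.122.54 and 192.168.122.55, are placeholders for this example, so substitute your own:

[root@host1 ~]# ipvsadm -A -t 192.168.122.53:80 -s rr
[root@host1 ~]# ipvsadm -a -t 192.168.122.53:80 -r 192.168.122.54:80 -m
[root@host1 ~]# ipvsadm -a -t 192.168.122.53:80 -r 192.168.122.55:80 -m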
Line 1 adds a TCP virtual service on 192.168.122.53 port 80, using the round-robin scheduling algorithm. This is your Director, or load balancer.
Lines 2 and 3 add two real servers (back-end nodes, running Apache) to the virtual service specified on line 1; the -m flag selects masquerading, which is what LVS calls NAT.

To list the current configuration and the various stats, run:
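The -L switch lists the virtual services and their real servers, -n skips DNS resolution, and --stats adds connection, packet and byte counters:

[root@host1 ~]# ipvsadm -L -n
[root@host1 ~]# ipvsadm -L -n --stats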
To save the current configuration use:
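The -S switch dumps the rules in a format ipvsadm can read back in later; the file name here is just an example:

[root@host1 ~]# ipvsadm -S -n > /etc/sysconfig/ipvsadm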
To restore a previously saved configuration, execute:
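For example, feeding back the file saved above:

[root@host1 ~]# ipvsadm -R < /etc/sysconfig/ipvsadm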
To clear the current setup, run:
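The -C switch flushes the whole virtual server table:

[root@host1 ~]# ipvsadm -C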
To test the configuration just connect to the load balancer using curl or nc:
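Repeated requests against the virtual IP should be answered by the two back ends in turn, for example:

[root@host1 ~]# curl http://192.168.122.53/
[root@host1 ~]# nc -v 192.168.122.53 80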
And that's all it takes to configure a TCP load balancer that distributes connections to two real servers listening on port 80.

One thing to keep in mind is that LVS does not know when a real server (back-end node) is down and will keep sending traffic to it. LVS blindly forwards packets based on the configured rules, and that is all it does. This, of course, is not very useful in production environments.

To solve this problem we need some monitoring in place that will remove real servers from the LVS configuration if they are no longer able to accept connections.

There are many tools out there that do just that, but in this example I am going to use mon. I am not going to go into great detail about how mon works, but in a nutshell it's a daemon that runs custom tests (in this case an HTTP test) and, depending on whether the test passes or fails, executes a script that takes some action. It's extremely extensible and you can write your own monitoring or alert scripts.

Let's first install it:


[root@host1 ~]# yum install -y mon
The configuration file is in /etc/mon. Here's an example using the two real servers configured earlier:
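A minimal sketch of /etc/mon/mon.cf along those lines might look like the following; the hostgroup names and the 192.168.122.54/55 addresses are assumptions carried over from the earlier ipvsadm example, and the 10s interval is just a sample value:

alertdir = /usr/lib64/mon/alert.d
mondir   = /usr/lib64/mon/mon.d

hostgroup rs1 192.168.122.54
hostgroup rs2 192.168.122.55

watch rs1
    service http
        interval 10s
        monitor http.monitor
        period wd {Sun-Sat}
            alert test.alert
            upalert test.alert

watch rs2
    service http
        interval 10s
        monitor http.monitor
        period wd {Sun-Sat}
            alert test.alert
            upalert test.alert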
The hostgroup entries define the groups of real servers to be monitored.
Inside each watch section, the interval directive defines how often the monitor should run.
The monitor directive sets the monitor type. You can see all monitors that come with the mon package in /usr/lib64/mon/mon.d/
The alert directive specifies what script to execute when the test fails.
The upalert directive defines what script to run when the test succeeds after a failure. You can see all alert scripts that come with the mon package in /usr/lib64/mon/alert.d/

Let's create our own test.alert script that will add and remove real servers from LVS:


[root@host1 ~]# vi /usr/lib64/mon/alert.d/test.alert 

#!/bin/sh
#
# $Id: test.alert,v 1.1.1.1 2004/06/09 05:18:07 trockij Exp $
#echo "`date` $*" >> /tmp/test.alert.log
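# mon calls alert scripts with a set of option-style arguments; in this setup
# $6 carries the address of the affected real server and $9 is "-u" when mon
# is reporting that the service has recovered (an upalert)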

if [ "$9" = "-u" ]
then
   echo "`date` Real Server $6 is UP" >> /tmp/test.alert.log
   ipvsadm -a -t 192.168.122.53:80 -r $6:80 -m
else
   echo "`date` Real Server $6 is DOWN" >> /tmp/test.alert.log
   ipvsadm -d -t 192.168.122.53:80 -r $6:80 
fi
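One detail that is easy to miss: mon executes the alert scripts directly, so the file needs to be executable:

[root@host1 ~]# chmod 755 /usr/lib64/mon/alert.d/test.alert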
With everything in place, let's start the service and watch the alert log. When Apache is no longer accessible on port 80 on the first real server, mon will write a "Real Server ... is DOWN" message to /tmp/test.alert.log and remove the node from LVS. When Apache is accessible again (-u will be passed from mon to the test.alert script as the argument in $9), the test.alert script will log the corresponding "is UP" message and add the node back into LVS.
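Assuming the init script installed by the mon package (on a systemd host this would be systemctl start mon instead):

[root@host1 ~]# service mon start
[root@host1 ~]# tail -f /tmp/test.alert.log

Stopping Apache on one of the real servers and running ipvsadm -L -n before and after is an easy way to watch the node being removed from, and later added back to, the virtual service.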

Resources:
[1] http://www.linuxvirtualserver.org/