3. LVS: Install, Configure, Setup

3.1. Installing from Source Code

Installation from source code is now described in the LVS-mini-HOWTO. Two methods of setup are described:

  • Setup from the command line. This is fine for understanding what's going on, and if you only need a single, fixed setup. For LVSs that you reconfigure a lot, it's tedious and mistake-prone, and if it doesn't work you will spend some time figuring out why.
  • From a configure script which sets up an LVS with a single director. The script is fine for initial setups: it's mistake-proof (it gives you enough information about failures to figure out what might be wrong) and I used it for all my testing of LVS. Since it's not easily extended to handle director failover, and other configuration tools handle this now, the configure script is no longer being developed. For production, where you need failover directors, you should use the other setup tools, or save your hand-built setup as a script (e.g. with ipvsadm-save); a minimal sketch follows this list.
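
As a minimal sketch of the hand-built, command-line approach (LVS-DR, with an illustrative VIP of 10.0.0.100 and realservers 10.0.0.10 and 10.0.0.20 -- substitute your own addresses), and of saving the result for later reuse:

    # on the director: add a virtual service and two realservers (direct routing)
    ipvsadm -A -t 10.0.0.100:80 -s wrr
    ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.10:80 -g
    ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.20:80 -g

    # the VIP still has to be brought up on the director and hidden on the
    # realservers -- see the rest of this HOWTO

    # save the running table so it can be restored later
    ipvsadm-save -n > /etc/sysconfig/ipvsadm    # or: ipvsadm -S -n
    ipvsadm-restore < /etc/sysconfig/ipvsadm    # or: ipvsadm -R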

3.2. Ultra Monkey

Ultra Monkey is a packaged set of binaries for LVS, including Linux-HA for director failover and ldirectord for realserver failover. It's written by Horms, one of the LVS developers. Ultra Monkey was used on many of the server setups sold by VA Linux and presumably made lots of money for them. Ultra Monkey has been around since 2000 and is mature and stable. Questions about Ultra Monkey are answered on the LVS mailing list. Ultra Monkey is mentioned in many places in the LVS-HOWTO.

Ben Hollingsworth ben (dot) hollingsworth (at) bryanlgh (dot) org 29 Jun 2007

There are step-by-step instructions on How to install Ultra Monkey LVS in a 2-Node HA/LB Setup on CentOS/RHEL4 (http://www.jedi.com/obiwan/technology/ultramonkey-rhel4.html).

Dan Thagard daniel (at) gehringgroup (dot) com 3 Jul 2007

I recently set up LVS using the Ultramonkey RPMs. The following is (to the best of my understanding) a complete howto for setting up CentOS 5 with LVS: a generic CentOS 5 x64 install on 2 PCs using Ultramonkey and a streamlined/HA topology with Apache. The following assumptions were made:

  • Real Server names are ws01.testlab.local and ws02.testlab.local (replace these with the result from uname -n from each RS)
  • Real Server IPs are 10.0.0.10/24 and 10.0.0.20/24,
  • Gateway: 10.0.0.1
  • Virtual IP: 10.0.0.100
  • Username: tester
  1. Power on the PC and insert the CD during the BIOS screen.
  2. Boot to CD.
  3. Hit 'Enter' for Graphical Installer.
  4. You will be prompted to test the installation media. You may choose to test the media or skip the test (usually you can skip this step).
  5. Click 'Next' to begin installation.
  6. Select 'English' as installation language and click 'Next'.
  7. Select 'U.S. English' as the keyboard configuration and click 'Next'.
  8. Select 'Remove all partitions on selected drives and create default layout' and click 'Next'.
  9. Configure the network settings for each adapter.

    • a. Click 'Edit'.
        • i. Uncheck 'Configure using DHCP'.
        • ii. Input the IP Address and Netmask.
        • iii. Click 'OK'.
    • b. Input the Gateway and DNS and click 'Next'.
  10. Select 'America/ New York' and click 'Next'.
  11. Enter the root password twice and click 'Next'.
  12. Select the system packages.

    • a. Check 'Desktop-Gnome', 'Server', 'Server-GUI', 'Clustering', 'Storage Clustering'.
    • b. Select 'Customize Now'.
    • c. Click 'Next'.
  13. Configure the system packages.

    • a. Expand and click 'Details' on Desktop Environments->GNOME Desktop Environment.
        • i. Uncheck 'desktop-printing', 'dvd+rw tools', 'esc', 'gimp-print-utils', 'gnome-audio', 'gnome-backgrounds', 'gnome-mag', 'gnome-pilot', 'gnome-themes', 'gok', and 'nautilus-cd'.
    • b. Expand Servers.
        • i. Uncheck 'DNS', 'Legacy Network Server', 'Mail Server', 'Network Servers', 'News', and 'Printing Support'.
    • c. Expand Base System.
        • i. Uncheck 'Dialup Networking Support'.
    • d. Expand and click 'Details' on Base System->Base.
        • i. Uncheck 'bluez-utils' and 'ccid'.
    • e. Click 'Next'.
  14. Click 'Next' to begin copying over the files.
  15. Remove DVD and click 'Reboot' to reboot the machine after installation.
  16. Set firewall to 'Disabled' and click 'Forward'.

    • Click 'Yes' on pop-up.
  17. Set SELinux to 'Disabled' and click 'Forward'.
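
    If you skipped these two firstboot screens, the same settings can be applied later from the command line (a sketch; adjust to taste):

    # turn SELinux off now and on future boots
    setenforce 0
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
    # stop the firewall and keep it off
    service iptables stop && chkconfig iptables off
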
  18. Select the 'Network Time Protocol' tab, check 'Enable Network Time Protocol', and click 'Forward'.
  19. Enter tester in the username field, 'Test User' in the Full name field, type in the password twice, and click 'Forward'.
  20. Click 'Forward' to skip the audio test.
  21. Click 'Finish' to complete the installation routine.
  22. Log in to the local system using the root username and password.
  23. Edit the '/etc/group' file

    vi /etc/group
    
    • a. Locate the 'wheel' group and append 'tester' to the end of its line ('i' to insert, [ESC] to stop editing).
    • b. Save the file and exit by typing ':wq'.
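
    The wheel line should then look something like this (GID 10 is the usual default; yours may differ):

    wheel:x:10:tester
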
  24. Leave the server console, go to your PC and SSH into the server (e.g. with PuTTY).
  25. Log in as user 'tester'.
  26. Su to root

    su -
    
  27. Install the dries yum repository by creating dries.repo in the /etc/yum.repos.d/ directory with the following contents

    [/etc/yum.repos.d/dries.repo]
    [dries]
    name=Extra Fedora rpms dries - $releasever - $basearch
    baseurl=http://ftp.belnet.be/packages/dries.ulyssis.org/redhat/el5/en/x86_64/dries/RPMS
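
    You can confirm that yum sees the new repository (the repo id 'dries' comes from the file above):

    yum repolist | grep dries
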
    
  28. Install the dries GPG key

    rpm --import http://dries.ulyssis.org/rpm/RPM-GPG-KEY.dries.txt
    
  29. Update your local packages and install some additional ones

    yum update -y && yum -y install lynx libawt xorg-x11-deprecated-libs nx freenx arptables_jf httpd-devel
    
  30. Correct release version

    mv /etc/redhat-release /etc/redhat-release.orig && \
    echo "Red Hat Enterprise Linux Server release 5 (Tikanga)" > /etc/redhat-release
    
  31. Download the Ultramonkey RPMs from http://www.ultramonkey.org (also grab perl-Mail-POP3Client, available from http://rpm.pbone.net/index.php3/stat/4/idpl/4508518/com/perl-Mail-POP3Client-2.17-1.el5.centos.noarch.rpm.html as of the time of this writing)
  32. Install the arptables-noarp-addr and perl-Mail-POP3Client RPMs (change the cd path to wherever you downloaded Ultramonkey to)

    cd /usr/local/src/Ultramonkey && rpm -Uvh arptables-noarp-addr-0.99.2-1.rh.el.um.1.noarch.rpm && \
    rpm -Uvh perl-Mail-POP3Client-2.17-1.el5.centos.noarch.rpm
    
  33. Install Ultramonkey

    yum install -y heartbeat*
    
  34. Download the Ultramonkey config files that relate to your desired topology from http://www.ultramonkey.org into the /etc/ha.d/ directory and edit them to match your configuration. Examples follow:

    [/etc/ha.d/authkeys]
    auth 2
    2 sha1 Ultramonkey!
    
    [/etc/ha.d/ha.cf]
    logfacility     local0
    mcast eth0 225.0.0.1 694 1 0
    auto_failback off
    node    ws01.testlab.local
    node    ws02.testlab.local
    ping 10.0.0.1
    respawn hacluster /usr/lib64/heartbeat/ipfail
    apiauth ipfail gid=haclient uid=hacluster
    
    [/etc/ha.d/haresources]
    ws01.testlab.local      \
            ldirectord::ldirectord.cf \
            LVSSyncDaemonSwap::master \
            IPaddr2::10.0.0.100/24/eth0/10.0.0.255
    
    [/etc/ha.d/ldirectord.cf]
    checktimeout=10
    checkinterval=2
    autoreload=yes
    logfile="/var/log/ldirectord.log"
    quiescent=no
    # Virtual Service for HTTP
    virtual=10.0.0.100:80
            fallback=127.0.0.1:80
            real=10.0.0.10:80 gate
            real=10.0.0.20:80 gate
            service=http
            request="alive.html"
            receive="I'm alive!"
            scheduler=wrr
            persistent=1800
            protocol=tcp
            checktype=negotiate
    # Virtual Service for HTTPS
    virtual=10.0.0.100:443
            fallback=127.0.0.1:443
            real=10.0.0.10:443 gate
            real=10.0.0.20:443 gate
            service=https
            request="alive.html"
            receive="I'm alive!"
            scheduler=wrr
            persistent=1800
            protocol=tcp
            checktype=negotiate
    
  35. Set the permission on authkeys

    chmod 600 /etc/ha.d/authkeys
    
  36. Start the httpd server

    httpd -k start
    
  37. Create alive.html in the /var/www/html folder with the following text (use whatever file name and text you configured for the request/receive checks in ldirectord.cf)

    I'm alive!
    

    Edit the /etc/hosts file to include the FQDN of all of the machines in your LVS (not strictly necessary, but it helps avoid problems)

    # Do not remove the following line, or various programs
    # that require network functionality will fail.
    127.0.0.1               localhost.localdomain localhost
    10.0.0.10               ws01.testlab.local      ws01
    10.0.0.20               ws02.testlab.local      ws02
    ::1             localhost6.localdomain6 localhost6
    
  38. Edit the /etc/sysconfig/network-scripts/ifcfg-lo file and create an ifcfg-lo:0 file so that the loopback alias lo:0 carries your virtual IP

    [/etc/sysconfig/network-scripts/ifcfg-lo]
    DEVICE=lo
    IPADDR=127.0.0.1
    NETMASK=255.0.0.0
    NETWORK=127.0.0.0
    BROADCAST=127.255.255.255
    ONBOOT=yes
    NAME=loopback

    [/etc/sysconfig/network-scripts/ifcfg-lo:0]
    DEVICE=lo:0
    IPADDR=10.0.0.100
    NETMASK=255.255.255.255
    NETWORK=10.0.0.0
    BROADCAST=10.0.0.255
    ONBOOT=yes
    NAME=loopback
    
  39. Edit the /etc/sysconfig/network-scripts/ifcfg-eth0 file to match this (edit the IP address for each director/realserver, and change eth0 to whatever active interface you are using):

    [/etc/sysconfig/network-scripts/ifcfg-eth0 on ws01]
    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=10.0.0.10
    NETMASK=255.255.255.0
    GATEWAY=10.0.0.1

    [/etc/sysconfig/network-scripts/ifcfg-eth0 on ws02]
    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=10.0.0.20
    NETMASK=255.255.255.0
    GATEWAY=10.0.0.1
    
  40. Restart the network

    service network restart
    
  41. Enable packet forwarding and arp ignore in the /etc/sysctl.conf file

    net.ipv4.ip_forward = 1
    net.ipv4.conf.eth0.arp_ignore = 1
    net.ipv4.conf.eth0.arp_announce = 2
    net.ipv4.conf.all.arp_ignore = 1
    net.ipv4.conf.all.arp_announce = 2
    
  42. Reparse the sysctl.conf file

    /sbin/sysctl -p
    
  43. Make sure all services are set to start at system boot.

    chkconfig httpd on && chkconfig --level 2345 heartbeat on && chkconfig --del ldirectord
    
  44. Stop ldirectord (heartbeat will manage it from now on) and start the heartbeat service

    /etc/init.d/ldirectord stop && /etc/init.d/heartbeat start
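
    Once heartbeat is running on both nodes, a quick sanity check (a sketch using the example addresses above; curl or lynx -dump will do for the HTTP test):

    ip addr show dev eth0              # the active director should also carry 10.0.0.100
    ip addr show dev lo                # both nodes should carry 10.0.0.100/32 on lo:0
    ipvsadm -L -n                      # the active director should list 10.0.0.100:80 and :443
                                       # with 10.0.0.10 and 10.0.0.20 as realservers
    curl http://10.0.0.10/alive.html   # each realserver should answer "I'm alive!"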
    

3.3. Keepalived

Keepalived is written by Alexandre Cassen Alexandre (dot) Cassen (at) free (dot) fr, and is based on vrrpd for director failover. Health checking for realservers is included. It has a lengthy but logical conf file and sets up an LVS for you. Alexandre released code for this in late 2001. There is a keepalived mailing list and Alexandre also monitors the LVS mailing list (May 2004, most of the postings have moved to the keepalived mailing list). The LVS-HOWTO has some information about Keepalived.
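
For a flavour of the conf file, here is a minimal sketch of a keepalived.conf for an LVS-DR setup reusing the example addresses from the Ultra Monkey walkthrough above (the instance name, virtual_router_id, password and interface are illustrative; see the keepalived documentation for the full list of directives):

    [/etc/keepalived/keepalived.conf]
    vrrp_instance VI_1 {
        state MASTER                 # BACKUP on the second director
        interface eth0
        virtual_router_id 51
        priority 100                 # lower on the backup
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass lvspass
        }
        virtual_ipaddress {
            10.0.0.100/24 dev eth0
        }
    }

    virtual_server 10.0.0.100 80 {
        delay_loop 6
        lb_algo wrr
        lb_kind DR
        persistence_timeout 1800
        protocol TCP

        real_server 10.0.0.10 80 {
            weight 1
            HTTP_GET {
                url {
                    path /alive.html
                    status_code 200
                }
                connect_timeout 3
            }
        }
        real_server 10.0.0.20 80 {
            weight 1
            HTTP_GET {
                url {
                    path /alive.html
                    status_code 200
                }
                connect_timeout 3
            }
        }
    }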

3.4. ipvsman(d)

Volker Jaenisch volker (dot) jaenisch (at) inqbus (dot) de 2007-07-04

http://sourceforge.net/projects/ipvsman/

ipvsman is a curses-based GUI to the IPVS loadbalancer, written in Python. ipvsmand is a monitoring daemon that maintains the desired load-balancing state, in the way ldirectord or keepalived do.

  • ipvsman/d now comes with a TCP regular-expression chat check, so you can health-check any TCP service you can imagine.
  • Sorry-servers can be checked for their availability.
  • Fedora 7 packages are contributed by Gerry Reno.

3.5. Alternate hardware: Soekris (and embedded hardware)

Clint Byrum cbyrum (at) spamaps (dot) org 27 Sep 2004

I'd like to setup a two node Heartbeat/LVS load balancer using Soekris Net4801 machines. These have a 266Mhz Geode CPU, 3 Ethernet, and 128MB of RAM. The OS (probably LEAF) would live on a CF disk. If these are overkill, I'd also consider a Net4501, which has a 133Mhz CPU, 64MB RAM, and 3 ethernet.

I'd need to balance about 300 HTTP requests per second, totaling about 150kB/sec, between two servers. I'm doing this now with the servers themselves (big dual P4 3.02 Ghz servers with lots and lots of RAM). This is proving problematic as failover and ARP hiding are just a major pain. I'd rather have a dedicated LVS setup.

1) anybody else doing this?

2) IIRC, using the DR method, CPU usage is not a real problem because reply traffic doesn't go through the LVS boxes, but there is some RAM overhead per connection. How much traffic do you guys think these should be able to handle?

Ratz 28 Sep 2004

The Net4801 machines are horribly slow, but enough for your purpose. The limiting factor on those boxes is almost always the cache size. I've waded through too many processor sheets of those Geode derivatives to give you specific details on your processor, but I would be surprised if it had more than 16kB of i/d-cache each.

16k unified cache. :-/

Make sure that your I/O rate is as low as possible or the first thing to blow is your CF disk. I've worked with hundreds of those little boxes in all shapes, sizes and configurations. The most common failure mode was CF disks dying from temperature problems and I/O pressure (MTTF was 23 days); the only other problems that showed up were really bad NICs locking up half of the time.

I haven't ever had an actual CF card blow on me. LEAF is made to live on read-only media, so it's not like it will be written to a lot.

Sorry, 'blow' is an exaggeration; I mean they simply fail because the cells only have a limited write capacity.

RO doesn't mean that there's no I/O going to your disk, as you correctly noted. If you plan on using them 24/7, I suggest you monitor the block I/O on your RO partitions using the values from /proc/partitions or the wonderful iostat tool, then extrapolate about 4 hours' worth of samples, check your CF vendor's specification for how many writes it can endure, and see how long you can expect the thing to run.
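
A rough sketch of that extrapolation, assuming a 2.6 kernel where the per-partition counters live in /proc/diskstats (on 2.4 kernels the same numbers appear in /proc/partitions) and that the CF card shows up as hda:

    # sectors written so far (field 10 for the whole disk); note the value,
    # wait ~4 hours, then read it again and take the difference
    awk '$3 == "hda" { print $10 }' /proc/diskstats
    sleep 14400
    awk '$3 == "hda" { print $10 }' /proc/diskstats
    # (delta sectors x 512 bytes), scaled against the card's rated write
    # endurance, gives a crude lifetime estimate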

I have to add that thermal issues were adding to our high failure rates. We wanted to ship those nifty little boxes to every branch of a big customer to build a big VPN network. Unfortunately the customer is in the automobile industry, which meant that the boxes were put in the strangest places imaginable in garages, sometimes causing major heat congestion. Also, as is usual in this sector of industry, people are used to reliable hardware, so they don't care if at the end of a working day they simply shut down the power to the whole garage. Needless to say, this adds to the reduced lifetime of a CF.

I then did a reliability analysis using the MGL (multiple greek letter, derived from the beta-factor model) model to calculate the average risk in terms of failure*consequence, and we had to refrain from using those nifty little things. The cost of repair (detection of failure -> replacement of product) at a customer site would exceed the income our service provided through a mesh of those boxes.

If these are overkill, I'd also consider a Net4501, which has a 133Mhz CPU, 64MB RAM, and 3 ethernet.

I'd go with the former ones, just to be sure ;).

Forgive me for being frank, but it sounds like you wouldn't go with either of them.

I don't know your business case so it's very difficult to give you a definite answer. I'm only giving you a (somewhat intimidating) experience report; someone else might just as well give you a much better one.

I'd need to balance about 300 HTTP requests per second, totaling about 150kB/sec, between two servers.

So one can assume a typical request to your website is 512 bytes (150 kB/s divided by 300 requests/s), which is rather high. But that's not really an issue for LVS-DR.

I didn't clarify that. The 150kB/sec is outgoing. This isn't for all of the website, just the static images/html/css.

I'm doing this now with the servers themselves (big dual P4 3.02 Ghz servers with lots and lots of RAM). This is proving problematic as failover and ARP hiding are just a major pain. I'd rather have a dedicated LVS setup.

I'd have to agree to this.

1) anybody else doing this?

Maybe. Stupid questions: how often did you have to fail over, and how often did it work out of the box?

Maybe once every 2 or 3 months I'd need to do some maintenance and switch to the backup. Every time there was some problem with noarp not coming up or some weird routing issue with the IPs. Complexity bad. :)

So frankly speaking: your HA solution didn't work as expected ;).

2) IIRC, using the DR method, CPU usage is not a real problem because reply traffic doesn't go through the LVS boxes, but there is some RAM overhead per connection. How much traffic do you guys think these should be able to handle?

This is very difficult to say, since these boxes also impose limits through their inefficient PCI buses, their rather broken NICs and their dramatically reduced cache. Also, it would be interesting to know if you're planning on using persistency in your setup.

Persistency is not a requirement. Note that most of the time a client opens a connection once, and keeps it up as long as they're browsing with keepalives.

Yes, provided most clients use HTTP/1.1. But it doesn't matter much, since at the application level you don't need persistency.

But to give you a number to start with, I would say those boxes should be able (given your constraints) to sustain 5 Mbit/s of traffic at about 2000 pps (~350 bytes/packet) and only consume 30 MByte of your precious RAM when running without persistency. This assumes every packet of your 2000 pps is a new client requesting a new connection to the LVS, with each connection entry kept for an average of 1 minute.

As mentioned previously, your HW configuration is very hard to compare to actual benchmarks, so please take those numbers with a grain of salt.

That's not encouraging. I need something fairly cheap, otherwise I might as well go down the commercial load balancer route.

Well, I have given you numbers which are (on second look) rather low estimates ;). Technically, your system should be able to deliver 25000 pps (yes, 25k) at a 50 Mbit/s rate. You would then, if every packet were a new client, consume about all the memory of your system :). So I would place the performance of your machine somewhere in between those two numbers.

Bubba Parker sysadmin (at) citynetwireless (dot) net 27 Sep 2004

In my tests, the Soekris net4501, 4511, and 4521 were all able to route almost 20 Mbps at wire speed. I would expect the 4801 to be in excess of 50 Mbps. But remember, your Soekris board has 3 NICs, and what they don't tell you is that they all share the same interrupt, so performance degradation is exponential with many packets per second.

Ratz 28 Sep 2004

For all the Geode-based boards I've received more technical documentation than I was ever prepared to dive into. Most of the time you get a very accurate depiction of your hardware, including south and north bridges, and there you can see that the interrupt lines are hardwired and require interrupt sharing.

However, this is not a problem since there aren't a lot of devices on the bus anyway that would occupy it, and if you're really unhappy about the bus speed, use setpci to tune the PCI latency timer for the NICs.
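
For illustration only (the PCI address is a placeholder; find the NIC with lspci first), the latency timer can be read and changed with setpci:

    lspci                                 # locate the NIC, e.g. 00:0e.0
    setpci -s 00:0e.0 latency_timer       # read the current value
    setpci -s 00:0e.0 latency_timer=40    # write a new one (hex)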

Newer kernels have excellent handling for shared IRQs btw.

Did you measure exponential degradation? I know you get a pretty steep performance reduction once you push the pps too high, but I never saw exponential behaviour.

Peter Mueller 2004-09-27

What about not using these Soekris boxes and just using those two beefy servers? E.g., http://www.ultramonkey.org/2.0.1/topologies/ha-overview.html or http://www.ultramonkey.org/2.0.1/topologies/sl-ha-lb-overview.html

Clint Byrum 27 Sep 2004

That's what I'm doing now. The setup works, but its complexity causes issues: bringing up IPs over here, moving them from eth0 to lo over there, running noarpctl on that box. It's all very hard to keep track of. It's much simpler to just have two boxes running LVS, and not worry about what's on the servers.

Simple things are generally easier to fix if they break. It took me quite a while to find a simple typo in a script on my current setup, because it was very non-obvious at what layer things were failing.

3.6. LVS on a CD: Malcolm Turnbull's ISO files

Malcolm Turnbull Malcolm (dot) Turnbull (at) crocus (dot) co (dot) uk 03 Jun 2003, has released a bootable ISO image of his Loadbalancer.org appliance software. The link was at http://www.loadbalancer.org/modules.php?name=Downloads&d_op=viewdownload&cid=2 but is now dead (Dec 2003). Checking the website (Apr 2004) I find that the code is available as a 30-day demo (http://www.loadbalancer.org/download.html, link dead Feb 2005).

Here's the original blurb from Malcolm

The basic idea is to create an easy-to-use layer 4 switch appliance to compete with Coyote Point Equalizer / Cisco LocalDirector... All my source code is GPL, but the ISO distribution contains files that are non-GPL to protect the work and allow vendors to licence the software. The ISO requires a license before you can legally use it in production.

Burn it to CD and then use it to boot a spare server with a Pentium/Celeron + ATAPI CD + 64MB RAM + 1 or 2 NICs + 20GB HD.

root password is : loadbalancer
ip address is : 10.0.0.21/255.255.0.0
web based login is : loadbalancer
web based password is : loadbalancer

The default setup is DR, so just plug it straight into the same hub as your web servers and have a play. Download the manuals for configuration info.