49. LVS: Virtualised Hosts in a Linux Virtual Server

49.1. Introduction

(Jul 2007) This is an amalgamation of posts by Gerry Reno greno (at) verizon (dot) net, Rio rio (at) forestoflives (dot) com and Volker Jaenisch volker (dot) jaenisch (at) inqbus (dot) de starting with this posting LVS and OpenVZ (http://lists.graemef.net/pipermail/lvs-users/2007-June/019329.html).

"realserver" is an LVS term referring to the nodes being loadbalanced by the director. Until now, each realserver has been a separate piece of hardware. But now with virualisation, it's possible for a realserver to be a virtual server (or one of many realservers) running inside a bigger piece of hardware (e.g. a 4 core machine with 16GByte RAM and 4 NICs).

I've never liked the LVS nomenclature; e.g. "virtual", "realserver", but since I couldn't come up with an alternative and no-one else seemed to mind, I've just accepted it. We haven't had too many problems with the word "virtual", since LVS hasn't been loadbalancing anything virtual. However, if realservers are going to be virtualised, there are going to be lots of namespace collisions.

The first virtualisation environment was VMWare, which ran guest OS's (Windows, Linux, MacOS), with each guest appearing to be running on a separate machine. Now there are also Xen, QEMU, Linux-VServer and OpenVZ.

There are two types of virtualised machines, each of which appears to be a separate machine to the user and to the outside network. (I hope I have the nomenclature right here). For a more thorough review of the properties of the various virtualisers see Comparison of virtual machines.

  • VM (virtual machine): each VM has its own kernel. This is a relatively heavyweight setup and uses more resources on the host. Processes can run amok in a VM without much likelihood of causing problems to other VMs.

    examples: VMWare, XEN, QEMU

  • VE (virtual environment): each VE is a supercharged Linux chroot environment. There is only one kernel running on the node, shared by all the VEs. An errant process is more likely to cause problems to another VE than in the VM situation, but it's still an unlikely possibility. VE hosts are just huge schedulers for the apps that run inside the VEs. Very few resources are wasted. You can get many more VEs on a host than you can VMs. The VE solution is trailing behind the VM solution for now, but as soon as the VE support framework lands in the kernels, things will change.

    Here's a description from the Linux-VServer webpage

    "The Linux-VServer technology is a soft partitioning concept based on Security Contexts which permits the creation of many independent Virtual Private Servers (VPS) that run simultaneously on a single physical server at full speed, efficiently sharing hardware resources.

    A VPS provides an almost identical operating environment as a conventional Linux server. All services, such as ssh, mail, web and database servers can be started on such a VPS, without (or in special cases with only minimal) modification, just like on any real server.

    Each VPS has its own user account database and root password and is isolated from other virtual servers, except for the fact that they share the same hardware resources."

    examples: OpenVZ, Linux-VServer.

Each virtual machine has access to all the hardware resources on a machine (unless restricted by ulimit), so if the other virtual machines are idle, a busy virtual machine will be given all the CPUs. Each virtual machine will have its own virtual NIC(s).

The reason for moving to virtual machines is cost. Gerry replaced 3 racks of nodes with 1/3 rack of a larger server. Virtual server-clusters speed up time-to-production for the real server-clusters by an order of magnitude, early enough to steer the project in the right direction, before the customer has invested in expensive hardware. You can predict the number of realservers needed to reach a performance goal. It is more expensive to run 84 pieces of hardware in multiple racks: electricity costs, physical storage space costs, hardware costs, environmental conditioning costs, etc. It is cheaper to run a few hosts running virtuals than to run real hardware.

Here's a description of one VE setup:

Because one web server handling all these websites is a literal mess in organization. One company has their own virtual server because we host 36 websites for them and we wanted to stick them all together. We place 'dangerous' type web admins in separate servers, so if they mess something up it doesn't affect the rest of them; a single server would affect them all. Other customers, for one reason or another, have their own virtual server and host a number of their own sites. We have one web server for our 'generically safe' customers which hosts 400 domains and is namespace only, so we do glob domains together when it is safe; otherwise we protect all our customers from the unsafe ones, or if they have a number of sites, we place them in one server. Another reason for splitting by customer when they host a number of their own sites is ease of shutting them off if they don't pay: I simply do vserver $servername stop and that's all there is to it. I don't have to hunt down individual sites and disable each one.

Another reason for using a number of VEs, as everyone is calling them, is that if I decide I want to move a service to a different machine, I simply move the entire virtual server over and start it on the new machine, and it is done. No playing with o/s configs or anything. The only thing that gets involved is a machine architecture change, such as moving a virtual from an amd64 to an intel i686 machine, but since we no longer have that issue to worry about, it is simple.

Each VE has its own primary ip like any hardware box would. I can also assign multiple ips to virtuals as needed.

Unless I specifically tell a VE to use only 1 or 2 processors, they all have the use of all 4, as the host machine's kernel decides is best. Since there is only the one kernel on the host, it decides how the cpus are used by the virtuals, unless (like I said) I specifically configure cpu usage/limits etc. per virtual, which I have not had a need to do yet. If I decide to adjust cpu usage on a particular virtual, I only have to do it on that one; I don't need to compensate with cpu configuration on all the others.

On ours, they simply multihome the nic, e.g. eth0's primary addr is the host, eth0:0 is virtual_1, eth0:2 is virtual_2, etc. To the virtual machine it appears to be its own nic, even though the virtual server has no networking code to mess with a nic.
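A minimal sketch of the aliasing just described (addresses and netmask are hypothetical; the alias numbering follows the quote above):

  # primary address belongs to the host; aliases are handed out to the virtual servers
  ifconfig eth0   192.168.1.10 netmask 255.255.255.0   # host
  ifconfig eth0:0 192.168.1.11 netmask 255.255.255.0   # virtual_1
  ifconfig eth0:2 192.168.1.12 netmask 255.255.255.0   # virtual_2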

One problem is the setup of the virtual network. You should be familiar with Linux bridges (which act like switches) and ip routing. If firewalling is included, things become complex.

Some virtualisation packages let you live migrate a running virtual machine (for an OpenVZ example, see below).

  • live migration: OpenVZ
  • no live migration: Linux-VServer
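For OpenVZ, live migration is a one-liner with vzmigrate; a minimal sketch (destination host and container ID are placeholders):

  # move running container 101 to node2 with minimal downtime
  vzmigrate --online node2.example.com 101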

Ben Hollingsworth ben (dot) hollingsworth (at) bryanlgh (dot) org 29 Jun 2007

I had trouble finding anything for RHEL4, so a few months ago, I started writing my own instructions, which you can view at: http://www.jedi.com/obiwan/technology/ultramonkey-rhel4.html Alas, I got sidetracked before I finished the setup, so this document is incomplete. I'm back at it now (hence my presence on the list again), so if anybody more familiar with the project wants to tell me what I've overlooked, I'm all ears.

49.2. Virtualised Realservers: VMWare/Xen

People are using VMWare to run multiple realservers inside a large machine. This can be cheaper than running the same number of physical realservers. However be aware that you need a machine with enough resources to handle all the VMWare realservers.

Paolo Penzo paolo (dot) penzo (at) bancatoscana (dot) it 01 Jun 2007

Do not use vmware guests as LVS directors: as soon as free bandwidth goes down, the directors start to failover due to missing sync packets. I was running this setup at the very beginning, but then I moved the directors to physical servers. Vmware can be fine for realservers as long as vmware itself is not overloaded; if it is overloaded, realservers will go in and out of the LVS cluster due to the check mechanism.

Stuart walmsley stuart (at) vio (dot) com 01 Jun 2007

I am running an LVS pair using keepalived with just over 20 mixed Windows/Linux hosts in 10 clusters, all running over 3 VMware ESX servers. Each server has 16 cores, 64 Gig of memory and multiple bonded Gig NICs, with all the OS's living on a shared 4Gig SAN fabric. It performs very well, has met all requirements in terms of both performance and availability, and offers great flexibility. As is always the case with VMware, you must ensure you have enough I/O and memory to meet the needs of all the virtual hosts or overall performance will be miserable.

Sebastian Vieira sebvieira (at) gmail (dot) com, 1 Jun 2007

Yes, I can imagine that with such hardware, hosting 20 machines is a piece of cake. Still, the possibility exists of one guest pulling all resources to itself. And ESX doesn't have a quota/limit feature to deal with these problems, which is why I can't recommend putting LVS on VMware. The bonded NICs help :)

Stuart walmsley

VMware actually enables very granular resource allocation, both at the server instance level and at the resource group level, allowing the grouping of sets of application servers. The cost of the hardware is lower than 10 individual servers and still has expansion capability, so it is a very cost effective route. LVS was a critical piece of the puzzle when looking at how to halt my server sprawl and enable a virtualization strategy.

Gary W. Smith gary (at) primeexalia (dot) com 1 Jun 2007

We also run several instances in VMWare. We have separate disks for the OS and the guests, which helps with the disk bottleneck. As for performance, our VMs are limited to single CPUs, so the host generally has plenty of CPU time to do its work. We've been running this configuration for some time now (two years? I don't remember when the first one was put up, but I do remember we were using Linux-HA 1.0). Anyway, we really haven't had any problems at all. OTOH, we have had problems with Xen clustering, which other people have claimed works better. But in our case I think it's the underlying distro.

Alexander Osorio maosorionet (at) gmail (dot) com 2 Jun 2007

I have a machine with 2 dual core processors, so to Linux I have 4 processors, and running only one realserver on the machine wastes the other processors. So I'm looking for a way to run, for example, 4 processes (one per processor) on the same PC, while having only one ip:port for the clients. The clients are POS wireless terminals that connect to one ip:port, and I need to do load balancing between the connections.

Graeme Fowler graeme (at) graemef (dot) net 03 Jun 2007

Run a Xen host. Have several Xen guests - one for the director, and two for the realservers. That way, although it's all virtual, it'll look real to the guests.

Horms 5 Jun 2007

You can tag a process to a processor with the (newish) cpusets feature. Though I think in this case that would be somewhat silly.

If you want to use LVS for this, an easy way would be to bind the processes to 127.0.0.1 and 127.0.0.2 respectively and set them up as the realservers in LVS.

That said, using fork-on-connect or preforking in the user-space application, and making sure you have at least as many processes as processors, is likely to be an easier and better way to go.
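Here's a minimal sketch of the loopback-realserver idea, assuming the box's public address is 192.168.1.80 and the application listens on port 9999 on the two loopback addresses (addresses and port are placeholders; whether your kernel will forward to loopback realservers needs to be tested):

  # balance a single ip:port across two local processes bound to different loopback addresses
  ipvsadm -A -t 192.168.1.80:9999 -s rr
  ipvsadm -a -t 192.168.1.80:9999 -r 127.0.0.1:9999 -m
  ipvsadm -a -t 192.168.1.80:9999 -r 127.0.0.2:9999 -m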

49.3. Running a test LVS (director, backup director and realservers) on one box (UML, VMWare)

Can I load both the ipvs code and the failover code on a single standalone machine?

Joe 09 Jul 2001

VMWare?

Henrik Nordstrom hno (at) marasystems (dot) com

user-mode-linux works beautifully for simulating a network of Linux boxes on a single CPU. I use it extensively when hacking on netfilter/iptables, or when testing our patches on new kernels and/or ipvs versions. It also has the added benefit that you can run the kernel under full control of gdb, which greatly simplifies tracking down kernel bugs if you get into kernel hacking.

Joe

I attended a talk by the UML author at OLS 2001. It's pretty smart software. You can have virtual CPUs, NICs... - you can have a virtual 64-way SMP machine running on your 75MHz pentium I. The performance will be terrible, but you can at least test your application on it.

49.4. VMWare problems with ntp

Apparently linux running under VMWare doesn't keep time.

Todd Lyons tlyons (at) ivenue (dot) com 16 Nov 2005

ntp under vmware causes major problems. The longer it runs, the more it "compensates" for things. It jumps back and forth, further and further each time, until after a few days it is jumping back and forth *hours*, wreaking all kinds of havoc on a linux system that's using nfs or samba. I've seen a vmware system exhibit this with both RedHat and Gentoo. The original poster is correct to be using ntpdate instead of the ntp daemon. It's the only way to keep the time reasonably close. Personally, I'd tell him to do it more often, such as:

  * * * * * /usr/sbin/ntpdate time.server.com >/dev/null 2>&1

substitute your own internal time server for "time.server.com".

Sebastiaan Veldhuisen seppo (at) omaclan (dot) nl 16 Nov 2005

This has nothing to do with LVS and/or heartbeat. I guess you are running a Linux guest within a Linux host vmware server (or ESX)? If so, there are known problems with clock fluctuations in guest VMs. We run our development servers on VMWare ESX and GSX and had large clock fluctuations. The VMWare TIDs weren't directly much help in solving the problem.

How we fixed it:

  • vmware-linux-tools are not helpful in solving this problem. You don't need them to fix the time issue.
  • On the VMWare Server management console webinterface, go to Options, Advanced Settings, and search for the option Misc.TimerHardPeriod. The default value is 1000; adjust it to 333.

On the linux guest machine:

  • For Grub: edit /boot/grub/menu.lst, add "clock=pmtmr" to the end of your current kernel line and reboot (see the example below).
  • For Lilo: edit /etc/lilo.conf and add "clock=pmtmr" to the append rule of your current kernel. Run lilo and reboot.

This should fix your problem (run ntpd on both host and guest OS, no vmware-tools).
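For example, the resulting Grub entry might look like this (kernel version, root device and paths are only illustrative; the clock=pmtmr parameter is the point):

  title Linux guest
      root (hd0,0)
      kernel /vmlinuz-2.6.9-22.EL ro root=LABEL=/ clock=pmtmr
      initrd /initrd-2.6.9-22.EL.img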

More info on this issue (not appropriate fix though):

http://www.vmware.com/support/kb/enduser/stdadp.php?pfaqid=1339
http://www.vmware.com/support/kb/enduser/stdadp.php?pfaqid=1420  
https://www.vmware.com/community/thread.jspa?forumID=21&threadID=13498&messageID=138110#138110
https://www.vmware.com/community/thread.jspa?forumID=21&threadID=16921&messageID=185408#185408
http://www.vmware.com/support/kb/enduser/stdadp.php?pfaqid=1518

Bunpot Thanaboonsombut bunpotth (at) gmail (dot) com 18 Nov 2005

The VMware KB is erroneous. Add "clock=pit" on the same line as "kernel" in grub.conf, like this:

kernel /vmlinuz-2.6.9-22.EL ro root=LABEL=/ rhgb quiet clock=pit

Kit Gerrits kitgerrits (at) gmail (dot) com 22 Dec 2008

It -IS- possible to keep several hosts in sync time-wise in VMware, even on a simple laptop. I am currently running an LVS cluster and 2 webservers on my laptop with VMWare Server 1.0.7 and 1GB RAM. There have been no spontaneous cluster failovers (yet) today.

  • For GSX / VMware Server: Pass the following options to the kernel via grub.conf:
    clock=pit nosmp noapic nolapic
    
  • For RHEL5/CentOS5, you should use the following:
    clocksource=pit nosmp noapic nolapic
    

Because this will disable all of Linux's intelligent clock tricks, you'll need to run NTPd to keep your clock in sync. More info at: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1420

For ESX servers, the solution is 'simpler': you need to lower the minimum time between clock requests:

Configuration --> Software --> Advanced Settings --> Misc --> Misc.TimerMinHardPeriod

Lower this value (it is in microseconds). More info at: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2219

49.5. Xen tcpip checksum bug

Note
the fix is at the end of the posting.

Matthias Saou thias (at) spam.spam.spam.spam.spam.spam.spam.egg.and.spam.freshrpms.net 7 Aug 2007

I'm setting up various Xen guests, and want to use LVS to load-balance web traffic across them. I've tried two similar simple setups, and with both I see the same issue where LVS doesn't work properly when the director sends the request to a realserver on the same physical Xen host.

Scenario 1 :

  • 3 physical servers (Xen Hosts) with eth0 and eth1
  • 3 web servers (Xen guests), one per host, listening only on eth1
  • LVS-NAT is configured using keepalived on the first Xen Host

When I make a web request to the LVS director, it works fine when the request is sent to the 2nd or 3rd web server, but I only get about the first 12kb of the page when it is sent to the 1st web server (the only one on the same Xen Host as LVS). For pages smaller than 12kb, there is no problem.

Scenario 2 :

  • 3 physical servers (Xen Hosts) with eth0 and eth1
  • 3 web servers (Xen guests), one per host, listening only on eth1
  • 1 LVS director (Xen guest), on the first Xen Host, with eth0 and eth1

The exact same problem happens. Here are a few more details :

  • RHEL5 x86_64 with the latest 2.6.18-8.1.8.el5xen kernel
  • keepalived package from Fedora recompiled for RHEL5
  • net.ipv4.ip_forward = 1 on the LVS director
  • -A POSTROUTING -s 192.168.0.0/255.255.0.0 -o eth0 -j MASQUERADE

If I remove the "local" web server from the keepalived/LVS configuration, it works fine, since the director then only sends to the realservers on the other physical servers, but that would mean not using the first physical server's CPU power and memory, which I don't want to waste.

I'm pretty sure this has something to do with connection tracking and/or the bridges Xen configures, but I don't know what to try to fix the issue.

what happens if you have the director(s) on a separate host, i.e. not the Xen host?

Then it works. I tried to keep it short in my initial email, but maybe I've kept it too short. There is no problem when the director has no real server on the same Xen host (physical machine). And it's not a web server issue, as all Xen guests are perfect clones of each other. For the record, the final setup I wanted was like this :

                   Internet
                      |
    +---------+---------+---------+---------+  Public LAN
    |         |         x         x         x
    |         |
  Xen1      Xen2      Xen3      Xen4      Xen5
  Web1+LVS1 Web2+LVS2 Web3      Web4      Web5
    |         |         |         |         |
    +---------+---------+---------+---------+  Private LAN

With Xen1 and Xen2 running two keepalived instances with VRRP for redundancy and LVS+NAT for web load-balancing and failover, through the Private LAN.

This setup works, except for the issue I've outlined :

- When LVS1 is active, requests to Web1 have issues
- When LVS2 is active, requests to Web2 have issues

I'm pretty sure it's some obscure(-ish) Linux bridge or connection tracking problem. In this particular setup, it's not that much of an issue (5 real servers, only 4 used at a given time), so what I did was exclude Web1 from LVS1's configuration and Web2 from LVS2's. But I'm now setting up a similar configuration with only 3 physical servers, so excluding one means 1/3rd less "horse power" for the cluster, which is quite a waste.

I'm quite surprised no one has run into this before. Has anyone here already set up LVS with Xen in some similar way and have it work?

I've continued searching, and I've found this post on the xen-users list reporting a similar problem : http://lists.xensource.com/archives/html/xen-users/2006-11/msg00480.html As Xen gains popularity, I guess we'll be facing this issue more and more. I'll continue digging to try and find a solution.

I'm still convinced it has something to do with connection tracking and bridges, but I still haven't been able to debug it.

Basically packets go like this when the issue is seen :

- dom0 peth0 ->
- dom0 xenbr0 ->
- dom0 vif7.0 ->
- domUa eth0 -> This is where LVS is running
- domUa eth1 ->
- dom0 vif7.1 ->
- dom0 xenbr1 ->
- dom0 vif10.1 ->
- domUb eth1 -> This is where the web server answers
- dom0 vif10.1 ->
- dom0 xenbr1 ->
- dom0 vif7.1 ->
- domUa eth1 -> This is where SNAT/MASQUERADE occurs
- domUa eth0 ->
- dom0 vif7.0 ->
- dom0 xenbr0 ->
- dom0 peth0 -> Back to the Internet

dom0 : Xen Host
domUa : Xen guest running LVS+NAT using dom0's vif7.0 and vif7.1
domUb : Xen guest running a web server using dom0's vif10.1 only

There is nothing "fancy" in my setup, meaning that I've only configured the minimum possible iptables rules to get things working, and it actually works but only sends back partial files to the client. With a test php script doing a phpinfo() I always got around 12kB, but I since tried with a simple static file from which I always get exactly 16384 Bytes, while the file itself is a few hundred Bytes long. I'm pretty sure that value of 16384 Bytes isn't a coincidence...

When domUa queries a real server on a different physical machine, the main difference is that instead of going through xenbr1, from vif7.1 to vif10.1, it goes to peth1 and off to the other Xen Host's NIC. But it actually "stays inside xenbr1" too, which is why I'm confused.

Graeme Fowler graeme (at) graemef (dot) net

Humo(u)r me. If the following isn't set to 0 already, try it:

echo 0 > /proc/sys/net/ipv4/tcp_sack

It's possible that you're hitting a bug which was fixed in 2.6.12... you shouldn't have it in 2.6.18, but anything is possible. Especially regressions.

For more details around this, see https://lists.netfilter.org/pipermail/netfilter/2005-June/061101.html and http://linuxgazette.net/116/tag/6.html

The symptoms are very similar indeed. I just tried, but it didn't help. Thanks a lot for the suggestion, though. Does anyone know how I could try and track down the TCP connection problem? I.e. know whether it's the Xen host, the LVS director Xen guest or the web server Xen guest which is "getting something wrong"? I've been doing a lot of basic tcpdumps, and only see that at some point, clients are still receiving data from the LVS address, to the same port even, but no longer consider it the followup to the previously received data.

later...

Daniel P. Berrange (who hacks extensively on Xen over at Red Hat) suggested it might be a TCP checksum offload issue... and it was! The solution is simply to use ethtool -K ethX tx off on all relevant interfaces, and it all starts working as expected.
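A sketch of applying the fix (interface names are examples; which interfaces need it depends on where the checksums are being mangled in your setup):

  # on the Xen guest running the director: turn off TX checksum offload
  ethtool -K eth0 tx off
  ethtool -K eth1 tx off
  # the corresponding dom0 interfaces (peth0, vif7.0, ...) may need the same treatment
  ethtool -K peth0 tx off
  # verify the current offload settings
  ethtool -k eth0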

Joe: turning off checksums could cause other problems. I hope there's a better solution somewhere.

49.6. Random observations thrashing around trying to get Xen/LVS-NAT working

Josh Mullis josh (dot) mullis (at) cox (dot) com 17 Sep 2008

The traffic still gets to the vm, but just can't seem to make it back out through the NAT. Here's my setup

 - 1 physical server running as Xen Dom0 (Director)
         -LAN ip: 10.0.0.80
         -NAT ip: 192.168.122.1
                 -Natting is setup through default xen network scripts
 
         -ipvsadm -A -t 10.0.0.80:53 -s rr
         -ipvsadm -a -t 10.0.0.80:53 -r 192.168.122.10:53 -m
         -ipvsadm -A -u 10.0.0.80:53 -s rr
         -ipvsadm -a -u 10.0.0.80:53 -r 192.168.122.10:53 -m
 
 
 - 1 domU (realserver) on this box (Will add others in the future)
         -ip: 192.168.122.10
         -gw: 192.168.122.1
         -running BIND

From a host on the 10.0.0.0 network, I can do a dig @10.0.0.80 and I do not get a response. I do, however, see the traffic arrive on the 192.168.122.10 virtual machine from this host on the 10.0.0.0 network.
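A sketch of the test being described (the queried name is a placeholder for any zone the realserver's BIND serves):

  # from a client on the 10.0.0.0 network
  dig @10.0.0.80 example.com A          # exercises the UDP virtual service
  dig @10.0.0.80 example.com A +tcp     # exercises the TCP virtual service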

ipvsuser ipvsuser (at) itsbeen (dot) sent (dot) com

This may not be helpful, but I run a bunch of stuff on vanilla F8 domUs and have never had any trouble:

  • I don't mess with the default bridge or networking set up by Xen/libvirt
  • I use DR, not NAT
  • I use a domU for the director as well as the real servers
  • I use keepalived because the only thing the director is "HA"ing is the VIPs, so HA/ldirector seems like overkill in this case. Put the backup director on a separate dom0.
  • I have domU real servers on the same dom0 as the director and on other dom0s and I haven't seen any problems.
  • I would recommend not putting any part of the LVS stuff directly on the dom0 and seeing if that works. Also, I am a DR die hard, but I think esp. in the case of Xen, it seems to work great, fast set up, painless.

Graeme

Simple question: does the realserver (the VM, 192.168.122.10) have a route direct back to the 10.0.0.0/whatever network? More specific routes will override the default, so having a direct route means the traffic will not necessarily traverse the director and will therefore not be un-NATted on the way back. Is there some sort of virtual ethernet bridge affecting it with both network segments on the same "virtual cable"?

On the realservers, the default route *must* be via the notional "inside" interface of the director for LVS-NAT to work. If the default route goes a different way, then the traffic returning to the client is not un-NATted correctly and may result in a hung connection.

There is an exception, however: if the clients come from a small, known pool of addresses (which may apply in your case), then on the realservers there must be a route back to that network range (or those ranges) via the director so that un-NATting can happen. Other traffic - such as that sourced from the realserver, for example for OS updates - can go whichever way you want it to, and in fact I normally make it my practice to ensure that the traffic emanating from the realservers for this type of operation doesn't appear to come from the VIP anyway.

In summary: for NAT to work, traffic back to clients must go via the director.
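A minimal sketch of the routing this implies on a realserver, using the addresses from Josh's setup (the /24 netmask for the client network is an assumption):

  # either send everything back via the director's inside address ...
  route add default gw 192.168.122.1
  # ... or, if clients come from a known range, route just that range via the director
  route add -net 10.0.0.0 netmask 255.255.255.0 gw 192.168.122.1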

Josh

It only has a default gateway of 192.168.122.1, which knows how to get to 10.0.0.0. I tried the direct route anyway ("route add 10.0.0.0 gw 192.168.122.1"), but it did not help. I can do a dig from the physical server OS to the 192.168.122.10 vm, which goes through the bridge; this works perfectly.

Laurentiu C. Badea (L.C.) lc (at) waat (dot) com

Xen creates a virtual bridge and adds a few iptables rules to control access and do NAT for its clients, while the host domain becomes their gateway. So you have the LVS setup sitting on top of a NAT router.

I would take a look at the iptables setup and check the packet counters during a query, especially on reject rules. Then try to insert rules to make it work and make sure the ruleset is maintained across reboots (Xen dynamically inserts rules when the bridges are brought up).
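One way to watch those counters while running a test query (standard iptables options; zeroing the counters first is optional):

  # zero the counters, run a dig against the VIP, then look for rules whose counters moved
  iptables -Z; iptables -t nat -Z
  iptables -L FORWARD -v -n
  iptables -t nat -L POSTROUTING -v -n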

David Dyer-Bennet dd-b (at) dd-b (dot) net

Try iptables-save (not iptables -L) to see *all* the tables (albeit in an incompatible format). The default route on each of the realserver "systems" (quotes to remind us that they may be Xen guests, not physical systems) needs to be set to the private-net virtual IP of the LVS system. And LVS-NAT works *only* for packets routed in by the LVS; the realservers can't initiate outgoing connections beyond the private LAN (unless you turn on ordinary NAT on the LVS box, which is not the same thing as LVS-NAT).

You don't need the "secondary" addresses for the LVS nodes (that's the terminology from piranha-gui; one of the problems here is that there's lots of conflicting terminology being used in different tools around all this stuff).

The LVS node needs two real (corporate lan, or internet) IPs and two virtual (private network for the cluster) IPs; one each outside and inside. The virtual IPs will be moved between the two LVS nodes on failover, whereas the real IPs will stay with the hardware they're assigned to.

The private virtual IP must be configured in each realserver as the gateway node.

My impression is that getting that wrong is both the easiest and most common way of getting packets to go in but not come out; but that impression may be specific to redhat/centos setups using piranha-gui, which you haven't mentioned using (but it's what I'm using, hence it influences what I know).

Here's my ipvsadm output:

sh-3.2# /sbin/ipvsadm
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  prcvmod01.employer.local     wlc
  -> 172.17.0.4:http              Masq    8      0          0
  -> 172.17.0.3:http              Masq    8      0          0
sh-3.2#

172.17. is the private internal lan for the cluster. .4 and .3 there are realservers, separate physical systems in my case (they'll have Xen and multiple virtual servers on them soon).

prcvmod01.employer.local is the public (corporate lan) virtual IP for the service (it's 192.168.1.16). The LVS system itself has its own corporate lan IP of 192.168.1.14.

The iptables look like:

sh-3.2# /sbin/iptables-save
# Generated by iptables-save v1.3.5 on Wed Sep 17 14:32:03 2008
*filter
:INPUT ACCEPT [7375193:1143767059]
:FORWARD ACCEPT [396083:75791540]
:OUTPUT ACCEPT [6115668:423106080]
-A INPUT -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT
-A FORWARD -d 192.168.122.0/255.255.255.0 -o virbr0 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.122.0/255.255.255.0 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -m physdev  --physdev-in vif5.0 -j ACCEPT 
COMMIT
# Completed on Wed Sep 17 14:32:03 2008
# Generated by iptables-save v1.3.5 on Wed Sep 17 14:32:03 2008
*nat
:PREROUTING ACCEPT [1498599:131914689]
:POSTROUTING ACCEPT [398187:28611428]
:OUTPUT ACCEPT [409849:29413401]
-A POSTROUTING -s 192.168.122.0/255.255.255.0 -j MASQUERADE
-A POSTROUTING -o eth0 -j MASQUERADE
COMMIT
# Completed on Wed Sep 17 14:32:03 2008

The most obvious difference is just that you're running your virtual server on the same box, and I'm not. I started out trying to run it all on one box, and gave up. I think I now understand the things I gave up on, and I think it should work in just one box with my current config -- but I haven't tried it yet. I'm going to get it working in the simple case of multiple boxes first, and *then* add in the complexity of some realservers on the same hardware as the LVS director.