11. LVS: Non-LVS clients on Realservers

This HOWTO is a little disorganised here. Read the section on lvs clients on realservers too.

11.1. always NAT out clients through VIP

Note
This section (Jan 2007) is a collection of material that was previously scattered throughout the HOWTO, including in the old sections on 3-Tier LVS's and authd.

In its simplest form, an LVS is a highly available server. Realservers are servers only: they reply directly to the client and don't need to connect to other machines to do so. This model serves well for telnet (used for testing) and the widely deployed http. With http as the most often deployed service, this model lasted a surprisingly long time.

It wasn't long before we found that realservers were required to do more than just serve: client processes on the realserver made calls, often back to the LVS client (the CIP). The first client we found was authd/identd, which connects to the CIP. We didn't know what to do with this client and since it wasn't needed and could be turned off, we did just that, solving the immediate problem. We assumed we had a one-off problem that we wouldn't see again and we didn't see any bigger picture. The write-up on authd/identd is long, not because anyone needs to understand it in any depth, but because it was a big problem with LVS in the early days and we put some effort into figuring it out.

Next, for LVS's running a web based database, a client process on the realserver connects to the database machine (a 3-Tier setup). The database was running on a machine under our control and the connection was local and was easily handled. We thought we'd handled another special case.

Now and again an administrator would want access to the outside world from a realserver, or a script would need to pull from the internet (sometimes requiring access by a DNS client running on the realserver). These cases were handled by NAT'ing out the connection through some convenient machine (often the director). Again these were treated as another special case and implemented slightly differently for LVS-NAT/DR/Tun. These connections came from the primary IP on the outside of the director and not the VIP. By the time we figured this out, no-one was running identd anymore and the identd case was not revisited.

It took a while for the next step; Francois JEANMOUGIN (masquerade the client out through the VIP) realised that you could NAT out the connection through the VIP. This solution wasn't often needed, since you could usually pull data or resolve hostnames, no matter what IP you used to make the call. We forgot about this trick and Graeme Fowler had to reinvent it. (We still hadn't "got it".)

The ftp-data connection, in the standard two port ftp service, requires similar handling, but for LVS-NAT ftp has its own helper, while for LVS-DR/Tun ftp is handled by persistence. Again we regarded this as another special case.

However some server processes on the realserver also make calls to the internet, e.g. an MTA which receives e-mail on the VIP, and which forwards the e-mail, must forward it from the VIP. When there are multiple VIPs, each with its instance of the server process, client calls, from each instance of the server process, must be NAT'ed out through the appropriate VIP.

David M was the first to describe a working multiple VIP/multiple client setup, masquerading clients on realservers through multiple VIPs, which showed the generalisation that we'd been missing: clients running on the realservers, which are calling on behalf of a server process listening on the VIP (or RIP for LVS-NAT), have to call from the VIP.

Thus an MTA on the realservers listening on the VIP, when it connects to another MTA, has to connect from the VIP. In an LVS'ed DNS, when named makes a connect to other machines, these calls must come from the VIP. In contrast, the client call for name resolution for the MTA client, doesn't have to come from the VIP, since the name resolution is not being LVS'ed. The connect request from a database client running on the realserver, which accesses a database on LAN, doesn't have to come from the VIP, since the database call is not being LVS'ed.

If you're unsure as to whether the call needs to come from the VIP, think of the standalone server; which IP does the client call need to come from?

After seeing David's solution, I scanned for unsolved problems on the mailing list, to find postings about server setups that worked on a standalone server, but which didn't work in an LVS. These setups were behind a director using NAT rules, where the client process emerged with src_addr!=VIP, but which required src_addr=VIP. (No we didn't fix the problem, presumably the poster(s) went to a commercial solution.)

The lesson from this is to nat your realserver client processes from the VIP, unless you're sure that it's not needed. The rest of this section is just amplification of this statement. If you understand David M's posting on masquerading clients on realservers through multiple VIPs then you're done here.

11.2. Masquerading clients on realservers to the outside world (SNAT)

Note
also see reinject snat

Sometimes a client process on the realserver will need to contact the outside world, e.g.

  • the LVS'ed server process may need to run a client process to connect to another computer e.g. to access a database, or to initiate an smtp connection to the next MTA in the chain.
  • the LVS'ed process may make a callback to a process running on the LVS client (e.g. the ftp-data port with ftp)
  • A process independent of the LVS'ed service may need to periodically connect to an outside computer e.g. ftp to upload logs, or DNS (the realserver knows the CIP already, so this won't be for the LVS'ed service).

Clients on realservers can call from the RIP or VIP. By default, clients will call from the RIP, since it is the primary IP on the realserver. Often the client of the LVS or an outside machine will expect the call to come from the VIP, which is handled by NAT'ing the call. If the LVS has multiple VIPs, then the call must come from the correct VIP.

  • RIP

    Clients like telnet call from the RIP, as do the clients of some callbacks e.g. rshd. Some services, e.g. MTAs which receive e-mail on the VIP, will initiate sending e-mail from the RIP, this being the primary IP on the NIC.

    Usually the RIP is a private IP and will not be routable. If the resources needed by the client are local e.g. a local nameserver with its own connection to the internet, or a database server, then a non-routable RIP is fine. If you need to route packets from a routable IP, you could make the RIPs routable, but from the security point of view, you don't want to make your realservers publicly accessible, so making the RIP routable is not generally a good idea.

  • VIP

    Some clients are associated with a service listening on the VIP and make callbacks from the VIP to the LVS client. The instances of this that we know about (authd/identd and the ftp-data connection) are discussed elsewhere in this HOWTO.

    The general solution for callbacks from the VIP is to write a helper module for the director. If you don't have one, then you're stuck - in this case look at the section on authd/identd for attempts at solutions. A possible solution is to use persistence with port=0 as can be done for ftp (port=0 forwards all ports, increasing security problems and should not be used if at all possible).

To handle calls from the RIP, you can NAT the connections out through any available box: for LVS-NAT, the director is available; for LVS-DR/LVS-Tun both the director and the default gw box are possibilities (although you may not have access to the default gw box). In the case of LVS-NAT, the director is already the default gw for packets from the RIP (since you need to route the replies from the LVS'ed service through the director). In the case of LVS-DR/LVS-Tun, the default gw for packets from the VIP is through a router that is not the director: the default gw for packets from the RIP is not part of the LVS setup, but will probably also be the same router box. In this case, the packets from the RIP will need to be routed instead to the director (you can use the iproute2 tools for this).

If you don't do anything special, the NAT'ed requests will come from the primary IP on the outside of the director (the VIP is usually a secondary IP, so that it can be moved on failover). Below we show how to make the call from the director's VIP. In the case of LVS-DR/LVS-Tun, the VIP on the outside of the director doesn't send any packets, and doesn't need a route (see routing for LVS-DR). If you NAT out through the VIP on an LVS-DR or LVS-Tun director, then you will need to put in a default gw for packets from the VIP (you normally don't have a default route for packets from the VIP for LVS-DR or LVS-Tun).

11.3. Masquerading clients on LVS-NAT realservers

Here's the command to run on a 2.2.x director to allow realserver1 to telnet to the outside world.

director:# ipchains -A forward -p tcp -j MASQ -s realserver1 telnet -d 0.0.0.0/0

With LVS-NAT and a single director, the VIP will be the primary IP on the outside of the director and the packets will have src_addr=VIP. Otherwise the packets will come from an IP which is not the VIP.

You may have to turn off icmp redirects, if you have a one network LVS-NAT.

director: #echo 0 > /proc/sys/net/ipv4/conf/all/send_redirects
director: #echo 0 > /proc/sys/net/ipv4/conf/eth0/send_redirects

After running this command you can telnet from the realservers. You can do this even if telnet is an LVS'ed service, since the telnet client and demon running on the realserver operate independently of each other.
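The ipchains command above is for 2.2.x kernels. For 2.4 and later kernels, a roughly equivalent iptables rule can be sketched as below. This is a minimal sketch, not a tested recipe: the function name masq_telnet_rule, the RIP 192.168.1.11 and the outside interface eth1 are placeholders chosen for illustration. The function only prints the rule (a dry run), so you can inspect it before running it as root on the director.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) an iptables rule that masquerades
# telnet connections initiated by one realserver.
# The function name, the RIP and the interface are illustrative placeholders.
masq_telnet_rule() (
    rip="$1"      # source address (RIP) of the realserver
    ext_if="$2"   # outside interface of the director
    echo "iptables -t nat -A POSTROUTING -s ${rip}/32 -o ${ext_if} -p tcp --dport 23 -j MASQUERADE"
)

masq_telnet_rule 192.168.1.11 eth1
```

MASQUERADE takes its source address from the outgoing interface, so as with the ipchains rule the packets will normally not come from a VIP that is a secondary IP.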

Here are the IP:port, seen by `netstat -an` on each machine

  • client on the internet telnet'ing to an LVS forwarding by LVS-NAT

    client                  director             realserver
    
    connection from client to LVS
    CIP:1041->VIP:23          -                  CIP:1041->RIP:23
    
  • the realserver connecting by masquerading through the director to the telnetd on the LVS client.

    client                  director             realserver
    
    telnet connection from realserver to telnetd on LVS client
    CIP:23<-DIP:61000         -                  CIP:23<-RIP:1030
    

    The masqueraded connection to the LVS client comes from the primary IP of the director (here the DIP) and not from the VIP, which in this setup is an alias (secondary IP) of the DIP.

    The masqueraded ports can be seen on the director with

    director:/etc/lvs# ipchains -M -L -n
    IP masquerading entries
    prot expire   source               destination          ports
    TCP  14:53.91 RIP                  CIP                  1030 (61000) -> 23
    

For both connections, the director doesn't have connections to any of its ports. In the case of LVS, the director is just forwarding packets like a router. In the masquerading case, the director rewrites the headers before forwarding the packets like a router.

Connections from clients start at high_port=1024. The masqueraded ports start at port=61000 (not 1024) (at least for kernel 2.2.x). The port number increments for each new connection in both cases. In the case where a machine is both connecting to the outside world (using ports starting at 1024) and masquerading connections from other machines (using ports starting at 61000), there is no port collision detection. This can be a problem if the machine is masquerading a large number of connections and the port range has been increased.

Note
The masqueraded ports start at (64k-4k)=61440 for 2.2.x kernels. 2.4.x kernels can use all ports for masquerading.
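Since there is no collision detection, it may be worth checking by hand whether the local port range and the masquerading port range overlap. Here is a minimal sketch of such a check. The helper name ranges_overlap and the port numbers in the example are assumptions for illustration only; substitute the ranges in use on your director.

```shell
#!/bin/sh
# ranges_overlap LO1 HI1 LO2 HI2
# Succeeds (exit status 0) if the two inclusive port ranges share a port.
# The helper name is hypothetical; it is not part of any LVS tool.
ranges_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Illustrative values only: a local range of 1024-4999 and a
# masquerading range of 61000-65095.
if ranges_overlap 1024 4999 61000 65095; then
    echo "overlap"
else
    echo "no overlap"
fi
```

On a running Linux box the local range can be read from /proc/sys/net/ipv4/ip_local_port_range.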

Peter Klapprodt peter (dot) klapprodt (at) ewido (dot) net 21 Jul 2005

Any ideas on how to get internet access working on the real servers (i.e. clients unrelated to the LVS services) using LVS-NAT? I've read something about virtual_routes in keepalived but couldn't find any detailed instructions yet.

graeme (at) graemef (dot) net

..in exactly the same way you would for an ordinary masqueraded network:

  • realservers use active director as default gateway
  • on director

    echo "1" > /proc/sys/net/ipv4/ip_forward
    
  • on director, set up masquerading:

    iptables -t nat -A POSTROUTING -s <privnet>/<netmask> -d <privnet>/<netmask> -j ACCEPT
    iptables -t nat -A POSTROUTING -s <privnet>/<netmask> -j MASQUERADE
    

and that's it! Any packet which returns to the director which is not hooked by LVS as part of an active connection will fall through to the nat POSTROUTING chain and get masqueraded.

PMilanese (at) nypl (dot) org 22 Jul 2005

Note
Do not use the static interface assignment for the gateway. Use the virtual (dynamic) interface (the DIP). If the directors fail over you need the gateway to move with the active director.

11.4. Masquerading clients on LVS-DR realservers

The realserver in LVS-DR has two IPs, the RIP and the VIP. The LVS'ed services are running on the VIP. Packets from LVS'ed services, returning from the realserver, have src_addr=VIP. The RIP is not directly involved in the LVS. Services may be running on the RIP too, e.g. telnetd which listens to 0.0.0.0, but services running on the RIP are of no interest to an LVS-DR. The director only needs the RIP to determine the target MAC address to forward packets from the clients destined for the VIP. Thus you are free to do whatever you like with the RIP without affecting the LVS.

Usually the RIP is on a private IP (e.g. 192.168.x.x) so as to not require an extra IP, and to shield the realserver from the internet. It would be unusual to run non-LVS'ed services on the realservers, as the RIP would have to be a public IP and the realservers would have to be firewalled. However it is reasonable to run clients on the realservers. A client session (e.g. telnet) initiated from the RIP would have to be NAT'ed out to the outside world. The NAT box could be the router or the director. Here's how to set up with the director doing the NAT'ing (the router setup would be the same).

11.4.1. Send client packets (src_addr=RIP) to the director and LVS packets (src_addr=VIP) to the router

This is not possible with the standard destination-based route command. You need the policy routing tools from iproute2.

Here's Julian's recipe (25 Sep 2000) for setting up NAT for clients on realservers in a LVS-DR LVS.

For the realserver(s), send all packets from the RIP network (RIPN) to the DIP (an IP on the director in the RIPN).

#create a rule with priority 100, which says that for any packet
#with src_addr in the RIP network, lookup the action to take in table 100.
realserver: #ip rule add prio 100 from RIPN/24 table 100

#route all packets in table 100 which go to 0/0
#(ie anywhere, the default route), via the DIP.
realserver: #ip route add table 100 0/0 via DIP dev eth0

#the result of this is that packets with src_addr=RIPnetwork
#and dst_addr=0/0 go via the DIP.

The director has to listen on the DIP (if it doesn't already), must not send ICMP redirects from the DIP ethernet device, and has to masquerade (all) packets from the RIPN.

director: #ifconfig eth0:1 DIP netmask 255.255.255.0
director: #echo 0 > /proc/sys/net/ipv4/conf/all/send_redirects
director: #echo 0 > /proc/sys/net/ipv4/conf/eth0/send_redirects
# for 2.2 kernels, all services
director: #ipchains -A forward -s RIPN/24 -j MASQ 
# for 2.2 kernels, telnet only
director: #ipchains -A forward -p tcp -j MASQ -s realserver1 telnet -d 0.0.0.0/0 
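The director commands above use ipchains, which is for 2.2 kernels. For 2.4 and later kernels the same recipe might be sketched as follows. This is a dry run: the functions only print the commands. The function names, the network 192.168.1.0/24, the DIP 192.168.1.9, the table number 100 and the device names are all placeholders chosen for illustration.

```shell
#!/bin/sh
# Dry-run sketch for 2.4 and later kernels: the commands are printed,
# not executed. All names, addresses and devices are placeholders.

# commands for the realserver: send packets from the RIP network via the DIP
realserver_cmds() (
    ripn="$1"; dip="$2"; dev="$3"; table="$4"
    echo "ip rule add prio 100 from ${ripn} table ${table}"
    echo "ip route add table ${table} default via ${dip} dev ${dev}"
)

# command for the director: masquerade everything from the RIP network
director_cmds() (
    ripn="$1"; ext_if="$2"
    echo "iptables -t nat -A POSTROUTING -s ${ripn} -o ${ext_if} -j MASQUERADE"
)

realserver_cmds 192.168.1.0/24 192.168.1.9 eth0 100
director_cmds 192.168.1.0/24 eth1
```

Once the printed commands look right, they can be piped to sh as root on the appropriate machine.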

11.4.2. add a default route for packets from the primary IP on the outside of the director

For LVS-DR, no default gw is needed for packets from the primary IP on the outside of the director or from the VIP (which will be an alias/secondary IP). For security reasons, none is installed. To allow masquerading of clients on the realservers, a default route will be needed for packets from the primary IP on the outside of the director (but not for packets from the VIP).

If you want to test this out first, just put in a default route for the director using the route command. If you like it you can add the more restrictive routes with iproute2 later.
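The more restrictive route can be sketched with iproute2 policy routing: a rule selects packets by source address and sends them to a table that holds only a default route. This is a minimal dry-run sketch. The function name, the primary IP 192.168.2.9, the gateway 192.168.2.254, the device eth1 and the table number 200 are assumptions for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print iproute2 commands that give packets from one source
# address (here the director's outside primary IP) their own default route.
# The function name and all values passed to it are placeholders.
src_default_route_cmds() (
    src_ip="$1"; gw="$2"; dev="$3"; table="$4"; prio="$5"
    echo "ip rule add prio ${prio} from ${src_ip} table ${table}"
    echo "ip route add default via ${gw} dev ${dev} table ${table}"
)

src_default_route_cmds 192.168.2.9 192.168.2.254 eth1 200 200
```

Packets from any other source address (including the VIP) do not match the rule and so still have no default route.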

11.5. Masquerading clients on LVS-Tun realservers

With LVS-Tun, the director is on a different network (possibly in a different location), you don't have a two way ipip connection back to the director (although you can add one), and you don't have a route from the RIP to the DIP (although you can add this too). If you handle these problems, then you can use the director to NAT out connections from the realservers. However it would probably be simpler to NAT out through the local router.

11.6. Masquerading clients through the VIP on the director

The recipes above for masquerading clients, have the packets coming out from the primary IP on the outside of the director. This will not usually be the VIP, which is a secondary IP (so that it can be moved easily on failover). Here we show how to masquerade out from the VIP.

11.6.1. Masquerading through a single VIP

Francois JEANMOUGIN Francois (dot) JEANMOUGIN (at) 123multimedia (dot) com 19 Aug 2004

When masquerading clients on realservers out through the director, how do I make the src_addr=VIP?

"C. R. Oldham" cro (at) ncacasi (dot) org 25 Aug 2004

You can do this with policy-based routing in the 2.6 series of kernels. On my Debian realservers I have this in /etc/network/interfaces

auto eth0 eth1
iface eth0 inet dhcp

#Define an interface eth1 that uses inet protocols and has a static address
iface eth1 inet static

   #Give the interface the address of 192.168.0.2
   address 192.168.0.2

   #And a netmask of 255.255.255.0
   netmask 255.255.255.0

   #When the interface is brought up, execute 'ip route' adding an entry for
   #192.168.0.0 (via eth1, with preferred src address 192.168.0.2) to the
   #routing table called 'lvs' (an iproute2 routing table, not an iptables table)
   up ip route add 192.168.0.0 dev eth1 src 192.168.0.2 table lvs

   #When the interface is brought up, set the default route for the table lvs 
   #to 192.168.0.1 (which is my lvs director).
   up ip route add default via 192.168.0.1 table lvs

   #Add another routing rule so packets going from 192.168.0.2 are also
   #processed by table lvs.
   up ip rule add from 192.168.0.2 table lvs

   #When the interface is brought down delete the routing rules.
   #these rules lie dormant till the interface is brought down.	
   down ip rule delete from 192.168.0.2 table lvs
   down ip route delete 192.168.0.0 dev eth1 src 192.168.0.2 table lvs

I have a table "lvs" in iproute2/rt_tables

#
# reserved values
#
255     local
254     main
253     default
0       unspec
#
# local
#
1       inr.ruhep
80      lvs

It took me a long time and lots of googling to figure this out but it works great.

Francois JEANMOUGIN

Just use snat!

director:# /sbin/iptables -t nat -A POSTROUTING -o eth1 -j SNAT --to $VIP

It is pretty simple. The VIP does not have to be up on the system; the rule stays there unused. In case of a director switch, even if vrrp adds the VIP as a secondary (or alias) interface, the outgoing packets will have the VIP as the source address. Using iptables with the SNAT method lets you use vrrp for director failover without any other configuration and scripts.

Tested and approved (my VIP is a secondary interface now again on the directors). I think you can use several SNAT rules if you want to mix several natted virtual_servers, using a -s (IIRC) option (that part I didn't test).

P.S.: Yes, I find the "--to" option confusing too.

Joe - It took a long time for someone to realise how to make the packets come from the VIP, rather than the primary IP on the outside of the director. The same problem came up again, but I'd forgotten that it had been solved, so it was invented again.

Kristoffer Egefelt

If I send a mail from a realserver to my gmail account, the outgoing packets have the primary IP of the director as src_addr. I would like the packets to come instead from the VIP.

Graeme Fowler graeme (at) graemef (dot) net 22 May 2006

You want a machine (the realserver) behind a masquerading server (the director) to appear to have a fixed IP address when making outbound connections to the internet. Simply have a SNAT rule on your director's external interface such that packets going out from the realserver get mapped to the VIP; assuming here that the external interface is eth0:

iptables -t nat -I POSTROUTING -o eth0 \
                 -s $REALSERVER_IP \
                 -d 0/0 \
                 -j SNAT --to-source $VIRTUAL_IP

I've used this many times to do a many-to-one mapping for realservers so that when they initiate external connections, they appear to come from the same IP.

Since this is outbound data from a high port on the VIP, and not from a port controlled by ipvsadm, the ip_vs code on the director will ignore these packets and they will be reverse SNAT'ed and passed to the realserver. This works for outbound communication from the realservers; it's extremely unlikely that they'll use a well-known (and often privileged) service port as the source for a new TCP session to somewhere external.

In context, an example mail server cluster will generally have one or more of ports 25, 465 and 587 bound to the VIP on the external side of the director. No well-written MTA will initiate a connection to an external host using those ports as source. The same goes for webservers, DB servers and a whole host of others.

That means the LVS doesn't have to be considered, as the netfilter conntrack code will work perfectly well.

There is, however, an exception - DNS servers can be configured to use UDP/53 as a source port for queries; in my experience explicitly turning this off means a tiny proportion of queries will fail. Leaving it turned on behind a director means that, well, anything could happen... so making use of a forwarder here is a good solution. Besides, in DNS operation having a query come from a reversible IP which maps to a forward name lookup is less important than it is for web or email connections.

Brad Dameron brad (at) seatab (dot) com 19 May 2006

You can use iptables to push packets from certain realservers out certain IP's. Here is my /etc/init.d/ipvs_firewall startup script. This script also allows your real servers to connect to the outside world through the LVS server. This is a SuSE start script so it will need to be modified a little to work with RedHat, etc.

Chris Newland chrisn (at) allipo (dot) com 11 Jul 2006

I use LVS-NAT and SNAT by using the following iptables rule:

iptables -t nat -A POSTROUTING -s 10.0.0.0/255.255.255.0 -o eth0 -j SNAT \
--to-source x.x.x.x <public IP of your director>

My realservers only have non-routable IP addresses (10.0.0.*). The realservers can all connect to servers on the internet and when they do, the IP source address is that of the director.

11.6.2. Masquerading through multiple VIPs

David M northridgeaustin (at) gmail (dot) com 14 Dec 2006

We have an LVS-NAT which works fine for other services (e.g. http). We also LVS sendmail. The MTA listens for connections on the RIP (and works fine), but when it initiates a connection (which it does from the RIP), this occurs independently of the LVS. Outgoing connections from RIPs get routed out the default gateway for LVS-NAT, where they're NAT'ed by iptables rules on the director.

We have three sendmail realservers, each with 30 private (172.16.0.0/24) RIPs, each RIP with an instance of sendmail (30 instances/realserver; 90 private RIPs total). On the director, there are 30 public VIPs which are balanced by the three realservers. On each realserver then, MTA connections can be initiated from 30 RIPs, and all are sent to the same default gateway (the DIP). The director needs to know through which VIP the connection needs to be NAT'ed out. The director then needs 90 rules (one for each RIP).

We have three realservers (RS1, RS2, RS3), and we are associating RIPs with VIPs. Here's the subset for VIP_01

#RIP on RS1 that services VIP_01, connections come out from VIP_01
$RIP_RS1_VIP_01 --> $VIP_01  
$RIP_RS2_VIP_01 --> $VIP_01
$RIP_RS3_VIP_01 --> $VIP_01

#iptables rules
$IPT -t nat -A POSTROUTING -s $RIP_RS1_VIP_01 -o $EXT_INTER -j SNAT --to-source $VIP_01
$IPT -t nat -A POSTROUTING -s $RIP_RS2_VIP_01 -o $EXT_INTER -j SNAT --to-source $VIP_01
$IPT -t nat -A POSTROUTING -s $RIP_RS3_VIP_01 -o $EXT_INTER -j SNAT --to-source $VIP_01
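With 90 RIPs, writing the rules by hand is error-prone. Here is a minimal sketch of generating them from a table of "RIP VIP" pairs. It is a dry run that prints the rules rather than running them. The function name gen_snat_rules, the table format, the interface eth0 and all addresses are made up for illustration and are not from David's posting.

```shell
#!/bin/sh
# Dry-run sketch: read "RIP VIP" pairs on stdin and print one SNAT rule per
# pair. Blank lines and lines starting with '#' are skipped.
# The function name and the input format are hypothetical.
gen_snat_rules() (
    ext_if="$1"   # outside interface of the director
    while read -r rip vip; do
        case "$rip" in ''|'#'*) continue ;; esac
        echo "iptables -t nat -A POSTROUTING -s ${rip} -o ${ext_if} -j SNAT --to-source ${vip}"
    done
)

gen_snat_rules eth0 <<'EOF'
# RIP          VIP   (addresses are made up for illustration)
172.16.0.11    192.0.2.1
172.16.0.41    192.0.2.1
172.16.0.71    192.0.2.1
172.16.0.12    192.0.2.2
EOF
```

Keeping the mapping in one table means the same file could also be used to check that every RIP has exactly one VIP.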

Rob ipvsuser (at) itsbeen (dot) sent (dot) com 15 Dec 2006

Well, the way I set things up is different (possibly better) - My goal is to make it easy to config/manage/troubleshoot, secure, fast and low load on the director(s):

  • I use OpenBSD and pf to separate public and private IP spaces
  • Use LVS-DR for all the lvs work (not sure if you can do this or if you need to use nat for some other reason)

By separating the NATing from the load balancing it seems to simplify the configuration of both and I feel it is easier to write pf rules than iptables (YMMV). In pf for each of the 30 email servers you need 2 rules:

Outgoing: nat pass on $ext_if inet proto tcp from 172.16.1.1 to port 25 -> px.py.pz.1
Incoming: rdr pass on $ext_if inet proto tcp from any to px.py.pz.1 port 25 -> 172.16.1.1 port 25

The above will send incoming connections to the correct VIP and keep the outgoing connections/replies coming from the correct public IP.

For the LVS config:

-A -t 172.16.1.1:25 -s nq
-a -t 172.16.1.1:25 -r 172.16.1.101:25 -g -w 100
-a -t 172.16.1.1:25 -r 172.16.1.102:25 -g -w 100
-a -t 172.16.1.1:25 -r 172.16.1.103:25 -g -w 100

No special routing set up on the director or real servers, all machines have the OpenBSD firewall as their gateway. Low load on the director since it is DR. Then to cheat on the arp issue, I hardcode the MAC address of the director into the arp table on the OpenBSD firewall for each of the VIPs (and run arpwatch and set the Linux machines' arp sysconfig params).

One of the cool things you can do with a set up like this is use the excellent table handling in pf. I have about 85,000 IPs that I know are spammers and I don't want them using any resources on my MTA boxes so I redirect all of them to OpenBSD's spamd which tarpits them at extremely low cost:

table <spammers> persist file "/etc/spammers.txt"  {}
rdr pass on $ext_if inet proto tcp from {<spammers>} to any port 25 -> 127.0.0.1 port 8027

This means that the MTA boxes can service real mail more quickly since slots are not being used by spammers. I do similar things for bogons http://www.cymru.com/Bogons/ and ssh brute force attackers. I haven't found a reasonable way to work with any sizable tables in iptables.

11.7. 3-Tier LVS

Some services need resources on other machines, e.g. DNS, databases. A squid realserver gets its content from machines on the internet and to do this, the squid demon will run a client process which makes a connection from RIP to 0/0:80. These client packets need to be routed and to do so the RIP must first be on a public IP (or at least routable locally).

Sorting out the routing requirements for setting up a 3-Tier LVS was prompted by Jezz Palmer (Mar 2002) who found that his squid didn't work when set up by the configure script, but did when he put in a default route for the squid realserver. Jezz ran the tcpdumps, and ran and debugged the scripts for me.

11.8. Routes needed for 3-Tier LVS

Figuring out the iptables and iproute2 commands was helped by Horms, Ratz, Julian and Peter Mueller.

Here is the standard LVS-DR test setup with 2-NIC director and only 1 realserver. The router for the realservers has the LVS client. The routes necessary for a normal LVS are in lower case (e.g. from 0/0 to VIP). Note (see discussion of routes for LVS-DR) that there is no route for packets from the VIP on the director (to anywhere) and no routes for packets from the SERVER_GW to RIP,VIP on the realserver.

        ____________
       |            |
       |   client   |SERVER_GW-------------
       |____________|                     | ^
             CIP                          | |
              | from 0/0 (CIP) to VIP     | from VIP to 0/0 (CIP) via SERVER_GW
              |  |                        |
              |  v                        | ^
             VIP                          | |
        ____________                      | FROM RIP TO 0/0:PORT (CIP) VIA SERVER_GW
       |            |                     | FROM 0/0:PORT (CIP) TO RIP
       |  director  |                     | |
       |____________|                     | v
             DIP                          |
              |                           |
              |----------------------------
              |
           RIP,VIP
        _____________
       |             |
       | realserver  |
       |_____________|

In UPPER CASE are the routes which need to be added to turn the LVS into a 3-Tier LVS (e.g. FROM 0/0:PORT to RIP) where "PORT" is the port for the client running on the RIP. Note that the gw for 0/0:PORT (here SERVER_GW) can be another router - it does not have to be the SERVER_GW. Note also that the dst_addr does not have to be 0/0 - a more restrictive dst_addr could be used if the IPs of the 3rd tier machines are known ahead of time (e.g. DNS servers, database servers).

In the original LVS-DR setup (1999, or configure scripts up to version 0.8.x) the routes for the realserver were

from RIP to RIP_network via eth0
default gw via SERVER_GW

In LVSs set up by the configure script 0.9.x, packets from the VIP are sent to the default gw. Packets from the RIP to 0/0 are sent via the DIP (where they are filtered i.e. DROP'ed or REJECT'ed)

from RIP to RIP_network via eth0
from VIP to 0/0 via SERVER_GW
from RIP to 0/0 via DIP

In LVSs set up by the configure script v 0.10.x and later, selected packets from the RIP are sent to the 3_TIER_GW (which may be the same as the SERVER_GW).

from RIP to RIP_network via eth0
from VIP to 0/0 via SERVER_GW
from RIP to selected_IPs:selected_ports via 3_TIER_GW
from RIP to ! RIP_network prohibit

11.9. Setting up routes using iptables and iproute2

The problem then becomes one of routing packets from RIP to 0/0:80 (if the realserver is a squid) while making sure that all other packets from the RIP to other ports on 0/0 are DROP'ed or REJECT'ed. For 2.2 kernels running ipchains there is no way of doing this, and all packets to 0/0 have to be routed. For 2.4 kernels, iptables allows marking (fwmark) by dport (or sport). After marking, packets can be routed by iproute2.
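The mark-then-route idea can be sketched as a small generator that takes a list of allowed destinations and prints the matching iptables and iproute2 commands. This is a dry run, not the configure script's code. The function name three_tier_cmds, the "IP:port" entry format (with an IP of 0 standing for anywhere), the table number 201 and the addresses are assumptions for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that mark selected packets from the RIP
# and route the marked packets through their own table.
# The function name and the "IP:port" entry format are hypothetical.
three_tier_cmds() (
    rip="$1"; gw="$2"; dev="$3"; table="$4"; shift 4
    for entry in "$@"; do
        ip="${entry%%:*}"       # text before the first ':'
        port="${entry#*:}"      # text after the first ':' (may be lo:hi)
        [ "$ip" = "0" ] && ip="0/0"
        echo "iptables -t mangle -A OUTPUT -p tcp -s ${rip}/32 -d ${ip} --dport ${port} -j MARK --set-mark 1"
    done
    # marked packets from the RIP look up their route in the separate table
    echo "ip rule add prio 99 from ${rip} fwmark 1 table ${table}"
    echo "ip route add default via ${gw} dev ${dev} table ${table}"
)

three_tier_cmds 192.168.1.11 192.168.1.254 eth0 201 0:80 192.168.2.254:1024:65535
```

Packets that are not marked never reach the separate table, so what happens to them is decided by the remaining rules and routes on the realserver.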

The configure script (v 0.10.x or later) will set this up for you. (May 2002, it's being tested as we speak, coming Real Soon Now). Here's a standalone version of the code in the configure script that marks the packets.

#!/bin/bash

#**************************
#NOTE: ADD THE LINE
#201    3_TIER
#to /etc/iproute2/rt_tables
#**************************

#---------------------------
#user modify section
RIP="192.168.1.11"
VIP="192.168.2.110"
#realserver will be allowed to connect to 0/0:OUTSIDE_PORT
#The port can be a number (eg 80) or a name in /etc/services (eg http).
#for a squid the port is http/80
#OUTSIDE_PORT="telnet"
#OUTSIDE_PORTS="192.168.2.254:telnet 0:80 192.168.2.254:1024:65535 0:auth"

#gw for packets coming from clients on realserver to 0/0:OUTSIDE_PORT
#(probably will be same as SERVER_GW in lvs_xxx.conf)
OUTSIDE_PORT_GW="192.168.1.254"

#from lvs_xxx.conf file
DIP="192.168.1.9"

#device carrying RIP
RIP_DEV="eth0"

DEBUG=Y                 #Y||N

#end user modify.
#---------------------------

#don't modify this.
#note mangling can only be done on the OUTPUT and the PREROUTING chains.
#CHAIN=PREROUTING       #for packets coming in, not what we want here.
CHAIN=OUTPUT            #for altering locally-generated packets before routing
OUTSIDE_PORT_CHAIN="3-Tier_rules"

#original code was
#iptables -N $OUTSIDE_PORT_CHAIN
#following a set of postings by
#Justin Albstmeijer justin (at) VLAMea (dot) nl in Oct, Nov 2003
#pointing out problems he was having, Ratz said this line should be
iptables -t mangle -N $OUTSIDE_PORT_CHAIN
#the old code is in the configure-lvs script, which I guess I'll
#fix sometime.

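#the chain was created in the mangle table, so it has to be addressed with -t mangle
#(the MARK target is only valid in the mangle table).
#flush the mangle table first: iptables -F -t mangle empties every chain in the table,
#including $OUTSIDE_PORT_CHAIN, so the MARK rule must be added after the flush.
iptables -F -t mangle
iptables -t mangle -A $OUTSIDE_PORT_CHAIN -j MARK --set-mark 1

#---------------------------

#now stuff happens
#iptables section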

#mark packets from RIP to outside service
#note: you need -p protocol if you are using --dport
#iptables -t mangle -A ${CHAIN} -p tcp -s ${RIP}/32 -d 0/0 --dport ${OUTSIDE_PORT} -j MARK --set-mark 1
#for each OUTSIDE_PORT
OUTSIDE_IP="192.168.2.254"
OUTSIDE_PORT="telnet"
iptables -t mangle -A ${CHAIN} -p tcp -s ${RIP}/32 -d $OUTSIDE_IP --dport ${OUTSIDE_PORT} -j $OUTSIDE_PORT_CHAIN
OUTSIDE_IP=0
OUTSIDE_PORT="auth"
iptables -t mangle -A ${CHAIN} -p tcp -s ${RIP}/32 -d $OUTSIDE_IP --dport ${OUTSIDE_PORT} -j $OUTSIDE_PORT_CHAIN

#my test setup requires passing auth packets for telnet, or else telnet is delayed
#iptables -t mangle -A ${CHAIN} -p tcp -s ${RIP}/32 -d 0/0 --sport auth -j MARK --set-mark 1
#you can't do an ip rule on fwmark ! 1, so mark the unwanted packets too.
#iptables -t mangle -A ${CHAIN} -p tcp -s ${RIP}/32 -d 0/0 --dport ! ${OUTSIDE_PORT} -j MARK --set-mark 2

if [ "$DEBUG" = "Y" ]
then
        rm /var/log/debug
        kill -HUP `cat /var/run/syslogd.pid`
        iptables -t mangle -A ${CHAIN} -m mark --mark 1 -j LOG --log-level DEBUG --log-prefix "fwmark 1: "
fi

#show iptables
iptables -L -t mangle

#-----------------------------
#ip section

#packets from $RIP with fwmark 1, lookup table 3_TIER
ip rule add prio 99 from ${RIP} fwmark 1 table 3_TIER
#in table 3_TIER, add entry that all packets go via $(OUTSIDE_PORT_GW)
ip route add default via ${OUTSIDE_PORT_GW} dev ${RIP_DEV} table 3_TIER
#stop disallowed packets.
ip rule add prio 101 from ${RIP} fwmark 2 prohibit

#This may not be needed if the -t mangle is included above
#when `iptables -t mangle -N $OUTSIDE_PORT_CHAIN` is run.
#I haven't checked it yet.
#
#not sure why I need this one.
#There are some theories, none of which I can say for sure is it.
#me - so there is a route for packets to 0/0 when the routing table needs
#to check if packets can get there (the routing table doesn't know about the fwmark)
#Julian - so the client can get its src_addr.
#apparently clients are bound to 0.0.0.0,
#in which case they get their src_addr from the routing table.
#However (hopefully) all packets from $RIP to 0/0 to outside
#will have been stopped by the prohibit rule.
ip route add default from ${RIP} via ${DIP} table main

#show everything
ip rule show
ip route show table 3_TIER
ip route show table main

#Here's the output

#realserver:/etc/rc.d# ip rule show
#0:     from all lookup local
#100:     from 192.168.2.110 lookup VIP
#99:     from 192.168.1.11 fwmark        1 lookup 3_TIER
#100:    from 192.168.1.11 to 192.168.1.0/24 lookup RIP
#100:    from 192.168.1.11 lookup RIP
#101:    from 192.168.1.11 lookup main prohibit
#32766:  from all lookup main

#realserver:/etc/rc.d# ip route show table 3_TIER
#default via 192.168.1.254 dev eth0

#realserver:/etc/rc.d# ip route show table main
#192.168.2.110 dev lo  scope link  src 192.168.2.110
#192.168.1.0/24 dev eth0  scope link
#192.168.1.0/24 dev eth0  proto kernel  scope link  src 192.168.1.11
#127.0.0.0/8 dev lo  scope link
#default via 192.168.1.9 dev eth0

#-------------------------------------------------
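The standalone script above hard-codes two destinations (telnet to 192.168.2.254 and auth to 0/0) under the comment "for each OUTSIDE_PORT". As a rough sketch only, the same marking rules could be generated by looping over a list in the "IP:port" format of the commented-out OUTSIDE_PORTS line. The function name emit_mark_rules is made up for this example and is not part of the configure script; the commands are printed rather than run, so you can look at them before piping them to a shell as root.

```shell
#!/bin/bash
# Sketch only: prints the marking commands instead of running them.
# emit_mark_rules is a hypothetical helper, not part of the configure script.

RIP="192.168.1.11"
CHAIN="OUTPUT"
OUTSIDE_PORT_CHAIN="3-Tier_rules"
# same "IP:port" format as the commented-out OUTSIDE_PORTS example in the script.
# An IP of 0 is taken to mean 0/0; a port may be a range such as 1024:65535.
OUTSIDE_PORTS="192.168.2.254:telnet 0:80 192.168.2.254:1024:65535 0:auth"

emit_mark_rules() {
        local entry ip port
        for entry in $OUTSIDE_PORTS
        do
                ip="${entry%%:*}"       # text before the first colon
                port="${entry#*:}"      # everything after it (may contain a colon)
                if [ "$ip" = "0" ]
                then
                        ip="0/0"
                fi
                echo "iptables -t mangle -A ${CHAIN} -p tcp -s ${RIP}/32 -d ${ip} --dport ${port} -j ${OUTSIDE_PORT_CHAIN}"
        done
}

emit_mark_rules
```

Each printed line has the same shape as the two hand-written rules in the script. Splitting at the first colon only is what lets a port range like 1024:65535 pass through intact.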

Francois flafolie (at) aic (dot) fr Apr 26 2007

It seems I need the following rules to make my setup work. The ip rules must use the VIP, not the RIP, as the source address.

ip rule add from 10.0.22.171 table ftp_table
ip rule add from 10.0.23.100 table http_table

Here's the original problem I posted. I have installed and configured keepalived (v1.1.13).

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.0.23.100:http wlc persistent 600
  -> 192.168.15.11:http           Masq    100    0          0
TCP  10.0.22.171:ftp wlc persistent 600
  -> 192.168.15.10:ftp            Masq    100    0          0

I'm trying to manage different services on different VLANs on my loadbalancer.

eth0.26 : vlan 10.0.22.0/24 for ftp
eth0.28 : vlan 10.0.23.0/24 for http

The problem is that I can configure only one default route on my loadbalancer. For example, if my default route is 10.0.23.1, the request and reply for http (vlan 10.0.23.0) both travel on the correct vlan. For ftp, however, the request arrives on the correct vlan (10.0.22.0) but the reply leaves on vlan 10.0.23.0 (my firewall allows that for testing) instead of 10.0.22.0.

I have tried to define some ip rules on my loadbalancer to say that if the source IP address is 192.168.15.10, packets should be forwarded to the 10.0.22.0 network, but it doesn't seem to work. LVS apparently doesn't leave the routing decisions to the operating system after its own operations... Here are my ip rules:

ip rule add from 192.168.15.10 table ftp_table
ip rule add from 192.168.15.11 table http_table

ip route add default via 10.0.22.1 dev eth0.26 table ftp_table
ip route add default via 10.0.23.1 dev eth0.28 table http_table
ip route flush cache

I also tried the following, with no better result:

ip route add default scope global nexthop via 10.0.22.1 dev eth0.26 weight 1 \
        nexthop via 10.0.23.1 dev eth0.28 weight 1
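Putting the pieces of this post together: the rules that worked select the routing table by VIP, and the per-VLAN default routes come from the original problem description. The following is only a sketch of how they might be combined. Pairing the VIP rules with those default routes is an assumption, since the post does not show the final routing tables. The helpers run and add_vip_route and the DRY_RUN switch are invented for this example, and ftp_table and http_table are assumed to be listed in /etc/iproute2/rt_tables already.

```shell
#!/bin/bash
# Sketch only. With DRY_RUN=Y (the default here) commands are printed, not executed.
# run, add_vip_route and DRY_RUN are hypothetical names used for illustration.
DRY_RUN="${DRY_RUN:-Y}"

run() {
        if [ "$DRY_RUN" = "Y" ]
        then
                echo "$*"
        else
                "$@"
        fi
}

# add_vip_route VIP GATEWAY DEVICE TABLE
# packets with source address VIP look up TABLE, whose only entry
# is a default route via GATEWAY on DEVICE.
add_vip_route() {
        run ip rule add from "$1" table "$4"
        run ip route add default via "$2" dev "$3" table "$4"
}

add_vip_route 10.0.22.171 10.0.22.1 eth0.26 ftp_table
add_vip_route 10.0.23.100 10.0.23.1 eth0.28 http_table
run ip route flush cache
```

Run it once as is to inspect the commands, then as root with DRY_RUN=N to apply them.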

11.10. from the mailing list

TC Lewis has ntp clients running on LVS-DR realservers, NAT'ed out through the director rather than routed directly as described here. An LVS-DR director normally does not have a default route; one would have to be added before packets can be NAT'ed through the director. You may be able to NAT through the router instead.