[rescue] web server loadbalancing...

Greg A. Woods rescue at sunhelp.org
Thu Aug 2 22:15:06 CDT 2001


[ On Thursday, August 2, 2001 at 20:04:24 (-0400), s at avoidant.org wrote: ]
> Subject: Re: [rescue] web server loadbalancing...
>
> Once upon a time, I worked for a major financial firm. I had a server
> that HAD to be 100% available, but wasn't given the budget to buy the
> hardware I really could have used to make it so. SO... I took a second,
> identical server and had it mirror the data on the first in real time.
> Then I wrote a script that checked if THE server was up, every thirty
> seconds or so. If it wasn't, said script copied important files into the
> right places and restarted networking. Total downtime was never over 30
> seconds, which was acceptable. The only real problem was that the
> servers had to be on the same subnet, obviously.
> 
> Not so difficult after all.

Yes, for a backup fail-over scenario (as opposed to load balancing),
such an approach is quite trivial.

You can even do tricks like making the service address(es) aliases on
the real machine's interfaces (i.e. keep a unique IP# for each machine),
and then you don't even have to restart networking: just flip a couple
of ifconfigs and maybe publish the ARP entries.  The downtime will only
be the delay you program into your checks, especially if you force a
packet for each new address out to the router with the new machine's MAC
-- that'll have the router automatically flip where it sends the packets
for the service address(es) without waiting for ARP timeouts, etc.
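
As a rough sketch of the checking side (not the scripts we actually ran),
something like the following Python would do.  The names service_is_up,
watch and poke_router are made up for illustration, as are the
failures_needed threshold and the use of UDP port 9 (discard).  The
take_over callback is where the ifconfig alias commands for your
particular OS would go; they are deliberately left out here because the
syntax differs between systems.

```python
import socket
import time


def service_is_up(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(check, take_over, interval=30.0, failures_needed=2, sleep=time.sleep):
    """Poll check() and call take_over() once it has failed
    failures_needed times in a row, then return.

    The threshold is an assumption added here to avoid failing over
    on a single dropped probe; set it to 1 to act on the first failure.
    """
    failures = 0
    while True:
        if check():
            failures = 0
        else:
            failures += 1
            if failures >= failures_needed:
                take_over()
                return
        sleep(interval)


def poke_router(service_addr, router_addr, port=9):
    """Send one UDP datagram to the router, sourced from the service address.

    Assumption: service_addr is already configured as an alias on this
    host, otherwise bind() fails.  Whether a single datagram is enough
    to make a given router update its ARP entry is also an assumption;
    test it against your own router.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind((service_addr, 0))
        s.sendto(b"\0", (router_addr, port))
```

On the backup machine this might be driven as
watch(lambda: service_is_up("192.0.2.10", 80), take_over), with take_over
adding the alias and then calling poke_router("192.0.2.10", "192.0.2.1").
Those addresses are placeholders.  The worst-case downtime is then
roughly interval times failures_needed.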

You can also get a bit smarter so that the failed machine simply takes
over as the backup server after it recovers (i.e. so it rsyncs from the
new main server).  All you need to do is check on boot whether the
service(s) are already up: if they are, become the backup, else become
the primary.  Add a hard-coded delay early in the startup script for the
machine that you want to be the normal backup server so that recovery
from a catastrophic failure of both machines doesn't have them both
coming up as the primary.
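
The boot-time decision is small enough to show in a few lines.  This is
only a sketch under assumptions: choose_role and its parameters are
invented names, and service_is_up stands for whatever check you already
use to see whether the service address is answering.

```python
import time


def choose_role(service_is_up, startup_delay=0.0, sleep=time.sleep):
    """Decide at boot whether this machine is the 'primary' or the 'backup'.

    startup_delay should be nonzero only on the machine that is normally
    the backup, so that when both machines boot together the other one
    has time to claim the service first.
    """
    if startup_delay > 0:
        sleep(startup_delay)
    # If something is already answering on the service address, another
    # machine is the primary, so this one becomes the backup.
    return "backup" if service_is_up() else "primary"
```

The normal primary would call choose_role(check) with no delay, and the
normal backup something like choose_role(check, startup_delay=60).  The
60 seconds is a placeholder and needs to be longer than the primary
takes to bring the service up.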

We've done this a couple of times too and it works very well.

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods at acm.org>     <woods at robohack.ca>
Planix, Inc. <woods at planix.com>;   Secrets of the Weird <woods at weird.com>
