[Mailman-Users] Load-balancing Mailman in LVS cluster

Brad Knowles brad.knowles at skynet.be
Wed Jun 30 02:02:38 CEST 2004


At 9:32 AM +1000 2004-06-30, Guy Waugh wrote:

>  The system I'm building already has apache on each of the two application
>  servers in the cluster, and the web docroot is NFS-shared between the two
>  from the third server I mentioned above, so there shouldn't be any dramas
>  with the web archives that Mailman generates (unless there are file locking
>  issues with these...?).

	I wouldn't expect the web archives would have problems with locking, no.

	But keep in mind that this is a very small part of what Mailman does.

>                           Similarly, sendmail is a standalone app on
>  both servers, so actually sending mail shouldn't be a problem. Mailman
>  will be sending mail to other servers outside the cluster (i.e. no user
>  accounts exist within the cluster). So, my only problem (I think) with
>  this is going to be with Mailman...

	That seems likely.

	The other pieces of this puzzle are pretty well understood on the 
scalability side of the picture, and you can pull out a whole host of 
known workable solutions, depending on your particular needs.

>  I wasn't aware of those, so thanks for letting me know. We do run RHEL3,
>  so GFS would be an option, but for US$2,200, I think I'd have an uphill
>  battle justifying it. I see that NFS has an option of 'noac' (no attribute
>  caching) which sounds potentially useful for me - I don't know whether
>  that directly relates to file locking, though.

	You need noac when sharing filesystems like this for other 
reasons, but it has nothing to do with file locking.

	The problem is that locking is handled outside of the NFS 
protocol per se.  You have lock manager daemons running on both the 
server and the client, and while NFS is supposedly stateless, they 
are not.  And it is not uncommon for the server and client lock 
manager daemons to get out-of-sync in a busy environment.  In 
addition to handling locking, these daemons also handle mount 
requests.

	NFSv4 is being re-written to become more stateful and to bring 
the management of locks inside the base protocol, so that you don't 
have to worry about lock managers that lose their minds or simply 
roll over and die.  Today you can end up with filesystems that can 
still be read from and written to because the NFS side of the server 
is working fine, but which cannot handle locking or be mounted or 
unmounted because the lock manager daemon has died.


	Scaling NFS servers in a write-intensive environment is a very 
hard task.  You end up doing all sorts of crazy things to avoid any 
kind of lock creation (much less contention).
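
	One well-known way to avoid creating NFS locks at all is the 
hard-link trick described in the open(2) man page: each process 
writes a uniquely named file and then tries to link() it to the 
shared lock name, so the lock manager daemons are never involved.  
The Python below is only a minimal sketch of that idea, not code 
from Mailman or from this thread; the function names, the lock path 
and the timeout values are all hypothetical, and it does nothing 
about stale locks left behind by a crashed process.

```python
import os
import socket
import time
import uuid


def acquire_link_lock(lock_path, timeout=10.0, poll=0.1):
    """Try to take lock_path via a hard link.

    Returns the temporary file name on success, or None on timeout.
    (Hypothetical helper, for illustration only.)
    """
    # A name unique to this host, process and attempt, created in the
    # same directory as the lock so that link() stays on one filesystem.
    tmp_path = "%s.%s.%d.%s" % (
        lock_path, socket.gethostname(), os.getpid(), uuid.uuid4().hex)
    with open(tmp_path, "w") as f:
        f.write(tmp_path)

    deadline = time.time() + timeout
    while True:
        try:
            os.link(tmp_path, lock_path)
        except OSError:
            pass  # someone else holds the lock, or the reply was lost
        # Over NFS the result of link() can be misreported, so the link
        # count of our own file is what decides whether we hold the lock.
        if os.stat(tmp_path).st_nlink == 2:
            return tmp_path
        if time.time() >= deadline:
            os.unlink(tmp_path)
            return None
        time.sleep(poll)


def release_link_lock(lock_path, tmp_path):
    """Drop the lock taken by acquire_link_lock()."""
    os.unlink(lock_path)
    os.unlink(tmp_path)
```

	A caller would wrap each update to shared state in 
acquire_link_lock() and release_link_lock(), and treat a None 
return as "try again later".  A real deployment would also need a 
lock lifetime and a way to break stale locks, which this sketch 
leaves out.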

	Proper cluster-aware filesystems avoid these kinds of issues, and 
make it much easier to scale the systems involved.  However, as you 
noted, they are expensive.  The question becomes how much is your 
time worth, and how much do you lose when everything goes 
Tango-Uniform (T**ts-Up)?  Here, you've got to look not only at your 
direct loss of revenue, but also the cost of lost opportunities.


	There's a reason why cluster filesystems are so expensive -- this 
is hard to get right.  Moreover, people will pay big money for those 
applications which *do* get it right.  They've done the cost/benefit 
analysis and they figured out that if it takes one of their engineers 
an extra month to build a home-grown system, its MTBF is 1/100th 
what the commercial product's would be, and its time/cost to repair 
is higher due to the custom nature of the solution, then the 
commercial stuff pays for itself in the first outage -- or the first 
outage that they avoid.

	I *still* haven't seen anything to compare with the VAXcluster 
solutions in this field that were created something like fifteen or 
twenty years ago.  Some of those systems are still running, for 
exactly that reason.


	Don't get me wrong, NFS is great.  But if you're trying to build 
a scalable network solution, it can be a very poor choice, depending 
on the application.

-- 
Brad Knowles, <brad.knowles at skynet.be>

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
     -Benjamin Franklin, Historical Review of Pennsylvania.

   SAGE member since 1995.  See <http://www.sage.org/> for more info.



