[Mailman-Developers] race condition in locking ?

Harald Meland Harald.Meland@usit.uio.no
07 Feb 2000 22:01:36 +0100


[Thomas Wouters]

> Well, attached you'll find the first change, the cvs diff to LockFile.py.

Sorry for not having been around here earlier, but: I've already been
running a non-backwards LockFile.py in our Mailman installation here
at uio.no for over a month.

My current CVS checkout is available ("live" :) under <URL:
http://www.uio.no/~hmeland/tmp/mailman-userdb/>, with <URL:
http://www.uio.no/~hmeland/tmp/mailman-userdb/Mailman/LockFile.py>
being of special interest in this discussion...

> I kept the API the same with one teensy exception. With the new
> locking scheme, it's not possible to 'steal' a lock, only to break
> the lock and then enter the lock() cycle to try and grab it.

Umm, I'm not sure that would go down very well with all of (current)
Mailman, as it would mean you'd have to reload any data that might
have changed on disk (marshals etc.) if someone else got the lock
before you...

I've been taking the approach that the parts of Mailman that actually
call steal() have to know what they're doing (and grepping seemed to
indicate that that was correct -- it's mostly used by forked-off
children to steal locks held by the parent), and thus steal() is
awfully blunt and just overwrites the current lock.
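
To make the difference concrete, here is a minimal sketch. It is not
the code from either LockFile.py; the SketchLock class, its method
names and the demo() helper are all made up for illustration, and it
assumes a lock is just a file whose contents name the owner. A blunt
steal() replaces the lock in one atomic step, while break-then-lock
leaves a gap in which someone else can grab it.

```python
import os
import tempfile


class SketchLock:
    """Toy file lock (hypothetical): the file's contents name the owner."""

    def __init__(self, path, owner):
        self.path = path
        self.owner = owner

    def lock(self):
        """Try once to take the lock; return True on success."""
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(self.path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        except FileExistsError:
            return False
        with os.fdopen(fd, "w") as f:
            f.write(self.owner)
        return True

    def holder(self):
        """Return the current owner's name, or None if unlocked."""
        try:
            with open(self.path) as f:
                return f.read()
        except FileNotFoundError:
            return None

    def break_lock(self):
        """Remove the lock without taking it."""
        try:
            os.unlink(self.path)
        except FileNotFoundError:
            pass

    def steal(self):
        """Blunt steal: overwrite whoever holds the lock, leaving no gap."""
        directory = os.path.dirname(self.path) or "."
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            f.write(self.owner)
        os.replace(tmp, self.path)  # atomic rename over the old lock file


def demo():
    path = os.path.join(tempfile.mkdtemp(), "list.lock")
    parent = SketchLock(path, "parent")
    child = SketchLock(path, "child")
    other = SketchLock(path, "other")

    parent.lock()
    child.steal()              # child now owns it; nobody could slip in
    after_steal = child.holder()

    child.break_lock()         # break-then-lock: the lock is free here...
    other.lock()               # ...so a third process can win the race
    child_got_it = child.lock()
    return after_steal, child_got_it, child.holder()
```

With break-then-lock, the child in demo() ends up without the lock and
would have to wait and then reload whatever "other" changed on disk.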

> I've tested this locker quite heavily, mailbombing my listserver
> with 100+ simultaneous messages to mailman-test (which generated
> 100+ bounces sent back to mailman, because my bouncing email address
> is still on the list) and it hasn't barfed yet ;)

Looks good -- I implemented my version of this because of the severe
locking problems we had when our Mailman server nearly ran out of
swap, as various log mails started arriving in hordes just after
midnight...

Combined with the mm_cfg.EX_TEMPFAIL trick seen in <URL:
http://www.uio.no/~hmeland/tmp/mailman-userdb/scripts/post> this has
kept our Sun SPARCstation 20 with 160MB RAM running our ~3,500 lists
quite smoothly for the last month.
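
For readers unfamiliar with the idea: EX_TEMPFAIL is exit status 75
from BSD sysexits.h, which MTAs such as sendmail treat as a temporary
failure, so the message stays queued and is retried later. The sketch
below is only an assumption about the general shape of such a trick,
not the contents of the scripts/post file linked above; try_lock(),
deliver() and the timeout values are hypothetical names chosen for
this example.

```python
import os
import time

EX_TEMPFAIL = 75  # from BSD sysexits.h; also available as os.EX_TEMPFAIL on Unix


def try_lock(path, timeout, poll=0.05):
    """Try to create the lock file until the timeout expires (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if someone else holds the lock.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
            os.close(fd)
            return True
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)


def deliver(lock_path, handle_message, timeout=1.0):
    """Return an exit status: 0 on success, EX_TEMPFAIL if the list is busy."""
    if not try_lock(lock_path, timeout):
        # Give up quickly instead of piling up waiting processes;
        # the MTA keeps the message and tries again later.
        return EX_TEMPFAIL
    try:
        handle_message()
    finally:
        os.unlink(lock_path)
    return 0
```

A delivery script would pass the result to sys.exit(), so that under
load the waiting happens in the MTA's queue rather than in a horde of
blocked Python processes eating swap.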

I'll try to have a look at the differences between the two versions
of this tomorrow.

Cheers,
-- 
Harald