[Mailman-Developers] .nfs* lock files

Harald Meland Harald.Meland@usit.uio.no
05 Jun 2000 00:02:00 +0200


[Barry A. Warsaw]

> >>>>> "HM" == Harald Meland <Harald.Meland@usit.uio.no> writes:
> 
>     HM> Cheap shot at making this post on-topic: Should there be a
>     HM> cron script (possibly cron/checkdbs will do) to warn the site
>     HM> admin about lists that have been corrupted?
> 
> I could probably elaborate on the mechanism in MailList.Load() --
> see the CVS tree -- where if config.db is missing, it'll fallback to
> config.db.last.  If it falls back, then it shutil.copy()'s
> config.db.last to config.db so the logic in Save() doesn't need to
> be changed.

Yup, nice work!

I'm wondering whether it would be possible to take things one step
further, though.

No matter how foolproof we try making Save(), the ultimate test is
really whether Load() succeeds.  Thus, why not let Load() be the one
to do the config.db -> config.db.last rotation?

I'm thinking of a mechanism along these lines:

  Save():
    1. Write new db to tempfile, e.g. config.db.<pid>.<host>
    2. If successful, rename() tempfile on top of config.db

  Load():
    1. Try loading config.db
    2. If successful, make config.db.last a hardlink to current
       config.db (overwriting previous config.db.last)
    3. If loading config.db failed, try loading config.db.last
    4. If loading of config.db.last was successful, make config.db a
       hardlink to config.db.last
    5. If both loads failed, give up.  Maybe notify someone?

There are some locking issues that would have to be resolved (Load()
can be used with unlocked lists), but I think this is doable.  If it
is, we'd gain a "last known good configuration" mechanism.
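
To make the steps above a bit more concrete, here is a rough sketch of
how they might look.  It is written for present-day Python 3 rather
than 1.5.2, it ignores the locking issues entirely, and every name in
it (save, load, read_db, hardlink_over, ConfigLoadError, LOAD_ERRORS)
is made up for illustration; none of this is actual Mailman code.

```python
import marshal
import os
import socket

# Errors that reading or unmarshalling a damaged file may raise.
LOAD_ERRORS = (OSError, EOFError, ValueError, TypeError, MemoryError)


class ConfigLoadError(Exception):
    """Neither config.db nor config.db.last could be loaded."""


def hardlink_over(src, dst):
    """Make dst a hard link to src, replacing any previous dst."""
    if os.path.exists(dst) and os.path.samefile(src, dst):
        return  # dst already names the same file
    tmp = "%s.link.%d" % (dst, os.getpid())
    os.link(src, tmp)
    os.replace(tmp, dst)  # atomic rename on POSIX


def read_db(path):
    """Read and unmarshal one db file."""
    with open(path, "rb") as f:
        return marshal.loads(f.read())


def save(listdir, data):
    db = os.path.join(listdir, "config.db")
    tmp = "%s.%d.%s" % (db, os.getpid(), socket.gethostname())
    with open(tmp, "wb") as f:          # Save step 1: write the tempfile
        f.write(marshal.dumps(data))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, db)                 # Save step 2: rename over config.db


def load(listdir):
    db = os.path.join(listdir, "config.db")
    last = db + ".last"
    try:
        data = read_db(db)              # Load step 1
    except LOAD_ERRORS:
        pass
    else:
        hardlink_over(db, last)         # Load step 2
        return data
    try:
        data = read_db(last)            # Load step 3
    except LOAD_ERRORS:
        raise ConfigLoadError(listdir)  # Load step 5
    hardlink_over(last, db)             # Load step 4
    return data
```

Because Save() always writes a fresh file and renames it into place,
config.db.last keeps pointing at the old contents until the next
successful Load() re-links it.  Writing to config.db in place would
defeat this, since both names could then refer to the same file.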

> We could probably do the same thing if the unmarshalling of
> config.db fails.  But why you'd get a MemoryError is beyond me,
> unless the corruption tickles a bug in Python.

The (current) representation of some marshalled objects is of the
form

  type identifier (a single character, e.g. 's' for string)
  size of the object (a 32-bit integer, e.g. '\003\000\000\000')
  contents of the object (e.g. 'foo')
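
The layout can be checked by taking a marshalled string apart by hand.
This is only a sketch for present-day Python 3, where the 's' type
code belongs to bytes objects; I ask for version 2 of the marshal
format explicitly, on the assumption that it still writes the plain
header described above.

```python
import marshal
import struct

# Marshal a three-byte string using format version 2.
blob = marshal.dumps(b"foo", 2)

type_code = blob[0:1]                     # type identifier
(size,) = struct.unpack("<i", blob[1:5])  # 32-bit little-endian size
payload = blob[5:5 + size]                # contents of the object

print(type_code, size, payload)
```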

By changing the size field in a marshalled string, it is quite easy to
produce a MemoryError, assuming your machine doesn't have huge amounts
of memory available (the example below asks for 1GB).  If it does,
executing the statements below might not be a very good idea :)

  $ python
  Python 1.5.2 (#0, Apr  3 2000, 14:46:48)  [GCC 2.95.2 20000313 (Debian GNU/Linux)] on linux2
  Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
  >>> import marshal
  >>> marshal.loads('s\000\000\000\100')
  Traceback (innermost last):
    File "<stdin>", line 1, in ?
  MemoryError
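
The 1GB figure comes straight from the size field, which can be
decoded without asking marshal to allocate anything.  The helper name
declared_size is hypothetical, and it only understands a top-level
marshalled string; this is a sketch in present-day Python 3.

```python
import struct


def declared_size(blob):
    """Return the size field of a marshalled string without loading it."""
    if len(blob) < 5 or blob[0:1] != b"s":
        raise ValueError("not a marshalled string header")
    (size,) = struct.unpack("<i", blob[1:5])
    return size


corrupt = b"s\x00\x00\x00\x40"  # same bytes as 's\000\000\000\100'
size = declared_size(corrupt)   # 0x40000000, i.e. 2**30 bytes
available = len(corrupt) - 5    # bytes actually present after the header

print(size, available)
```

The header claims 1073741824 bytes of contents while none follow it,
so a check like this could reject the data before any attempt to
unmarshal it.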

-- 
Harald