[Mailman-Users] Re: run away process - cannot flock

Paul H Byerly paul at thcwd.com
Mon Jul 21 18:17:33 CEST 2003


Brad Knowles wrote:
>At 10:17 PM -0500 2003/07/20, Paul H Byerly wrote:
>
> >       My main mail logs have this entry 11 times in one second.
> >
> >  Jul 18 22:30:31 svr01 sendmail[24527]: h6J3UVp24527:   7: fl=0x0,
> >  mode=100644: size=12288
> >  Jul 18 22:30:31 svr01 sendmail[24527]: h6J3UVp24527: SYSERR(root):
> >  cannot flock(/etc/mail/access.db, fd=7, type=1, omode=37777777777,
> >  euid=0): No locks available
>
>         As far as sendmail is concerned, this is a pretty serious
>problem.  This could be the fault of a kernel that is not configured
>sufficiently well to properly host the mailing list.  You may need to
>edit your kernel definitions, recompile and relink it, then reboot.
>This may need to be done several times, in order to find a suitable
>value for this number.  You may be lucky enough to find that this
>value can be tuned interactively, without actually rebuilding the
>kernel.

      That new server in October is looking better and better.  This server 
has been running e-mail for just over a year, and Mailman for a month, and 
these two incidents are the first time this has happened.

>         Or, this could be the fault of processes grabbing too many locks
>and not releasing them.  In that case, the only thing you can do
>immediately is to reboot the box, but the misbehaved programs should
>be found and fixed or you'll just have to go through this over and
>over again.

      The error message seems to support that.  Killing the PID does the 
trick, which sure beats rebooting.
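
      For anyone trying to track down which process is holding the locks
before reaching for kill, here is a rough sketch.  It assumes a Linux box
that exposes /proc/locks with the PID in the fifth column; the thread does
not say which OS svr01 runs, so treat that layout as an assumption.  The
function name count_locks_by_pid is made up for this example.

```python
from collections import Counter


def count_locks_by_pid(proc_locks_text):
    """Count held locks per PID from /proc/locks-style text.

    Assumed (Linux) line layout:
        id: class kind access pid major:minor:inode start end
    """
    counts = Counter()
    for line in proc_locks_text.splitlines():
        fields = line.split()
        # Lines with "->" describe waiters blocked on a lock, not holders.
        if len(fields) < 5 or fields[1] == "->":
            continue
        pid = fields[4]
        # Skip entries without a plain numeric PID.
        if pid.isdigit():
            counts[int(pid)] += 1
    return counts


if __name__ == "__main__":
    try:
        with open("/proc/locks") as fh:
            # Show the five processes holding the most locks.
            for pid, held in count_locks_by_pid(fh.read()).most_common(5):
                print(pid, held)
    except OSError:
        print("/proc/locks is not available on this system")
```

      A PID that sits at the top of that list and keeps climbing would be
the first one to look at.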


> >       Which seems to be the end of it - except that Mailman would
> >  not let go.  I let it run 15 minutes the second time, to see if it
> >  would die on its own.
>
>         Mailman should not let this go, but then it also should not be
>re-trying so quickly.  It should schedule a re-try later, and give
>the system time to recover before the next attempt.  This re-try
>timeout should be measured in terms of minutes or hours, not seconds
>or micro-seconds.

      Either a patch or 2.1.3 should take care of that - I'm waiting for 
some word on that fix.
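
      To make the retry idea concrete, here is a minimal sketch of a
schedule that starts at one minute and doubles up to an hour.  This is not
Mailman's code and not the pending fix; retry_delays, deliver_with_backoff
and all the default values are hypothetical names and numbers picked for
illustration.

```python
import time


def retry_delays(first_delay=60.0, factor=2.0, max_delay=3600.0, attempts=6):
    """Yield the wait in seconds before each retry, growing up to a cap."""
    delay = first_delay
    for _ in range(attempts):
        yield min(delay, max_delay)
        delay *= factor


def deliver_with_backoff(deliver, sleep=time.sleep, **schedule):
    """Call deliver() until it succeeds, waiting between failed attempts.

    Returns True on success, False once the retry schedule is used up.
    OSError stands in for whatever error the real delivery step raises.
    """
    delays = retry_delays(**schedule)
    while True:
        try:
            deliver()
            return True
        except OSError:
            delay = next(delays, None)
            if delay is None:
                return False  # out of retries; give up for now
            sleep(delay)  # give the system time to recover
```

      With the defaults, a delivery that keeps failing is retried after 1,
2, 4, 8, 16 and 32 minutes instead of hammering the MTA many times a second.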

<>< Paul 




