[Mailman-Users] mailman still wedges on FreeBSD
Justin Wells
jread at fever.semiotek.com
Tue Sep 21 19:45:53 CEST 1999
I've upgraded to the most recent snapshot, and it's a big improvement:
the wedged processes no longer cause a deadlock. They still get wedged,
and eat up my memory, but now the webpage continues to function, and
mail gets through.
This is much better, but I would like to have no wedged processes at all.
They still wedge in "select" state, along with a zombie, which is the
same as before. The difference now is that there are no longer an additional
10-15 processes wedged in "lockf" state, so it looks like at least
these wedged processes aren't holding onto the lock now.
There's nothing in the log indicating anything unusual at that time.
Here is what I see in my process table during a typical wedge:
fever:~$ ps -auxwww | grep python
daemon 16440 31.8 0.0 0 0 ?? Z - 0:00.00 (python)
daemon 17202 29.8 0.0 0 0 ?? Z - 0:00.00 (python)
daemon 16439 0.0 1.0 2952 1228 ?? I 11:48AM 0:00.31 /usr/local/bin/python /local/mailman/scripts/post webmacro
daemon 17201 0.0 1.0 2956 1232 ?? I 1:21PM 0:00.31 /usr/local/bin/python /local/mailman/scripts/post webmacro
There are also two zombies present, in addition to the wedged processes.
Here are the files the older wedged process (16439) has open:
bash-2.03# lsof -p 16439
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python 16439 daemon cwd VDIR 0,131072 512 2 /
python 16439 daemon rtd VDIR 0,131072 512 2 /
python 16439 daemon txt VREG 0,131077 426448 255570 /usr/local/bin/python1.5
python 16439 daemon txt VREG 0,131077 63652 182570 /usr/libexec/ld-elf.so.1
python 16439 daemon txt VREG 0,131077 151057 277905 /usr/lib/libreadline.so.3
python 16439 daemon txt VREG 0,131077 15084 277800 /usr/lib/libtermcap.so.2
python 16439 daemon txt VREG 0,131077 115780 277779 /usr/lib/libm.so.2
python 16439 daemon txt VREG 0,131077 12965 278069 /usr/lib/libdescrypt.so.2
python 16439 daemon txt VREG 0,131077 583043 277814 /usr/lib/libc_r.so.3
python 16439 daemon txt VREG 0,131077 13176 104686 /usr/local/lib/python1.5/lib-dynload/cStringIO.so
python 16439 daemon txt VREG 0,131077 49516 104687 /usr/local/lib/python1.5/lib-dynload/cPickle.so
python 16439 daemon 0u PIPE 0xc720d540 16384
python 16439 daemon 1w VREG 0,131076 0 158753 /var (/dev/wd0s1e)
python 16439 daemon 2w VREG 0,131076 0 158753 /var (/dev/wd0s1e)
python 16439 daemon 3u PIPE 0xc720dea0 16384 ->0xc720ee40
python 16439 daemon 4u PIPE 0xc720ee40 16384 ->0xc720dea0
python 16439 daemon 5r VREG 0,131079 3775 317843 /local/mailman/scripts/post
python 16439 daemon 6u VREG 0,131079 43749 24292 /local/mailman/logs/error
python 16439 daemon 7u VREG 0,131079 10390 24298 /local/mailman/logs/post
bash-2.03#
I would guess that what's happened here is that the process is
deadlocked, waiting for the zombie (16440) on its pipe.
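To illustrate the kind of hang I have in mind, here is a minimal sketch. It is not Mailman's code, and I am only assuming the mechanism applies here: a parent that keeps its own copy of a pipe's write end open never sees end-of-file on the read end, even after the child has exited and become a zombie. The function name pipe_eof_demo is made up for the example.

```python
import os
import select


def pipe_eof_demo(timeout=0.5):
    """Show that a pipe only reports EOF once every writer has closed it.

    Returns (readable_before, readable_after, data), where 'before' and
    'after' refer to the parent closing its own copy of the write end.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without writing. It stays a zombie
        # until the parent calls waitpid().
        os._exit(0)

    # The parent still holds the write end, so the read end never
    # becomes readable; without a timeout this select() would hang.
    ready_before, _, _ = select.select([r], [], [], timeout)

    # Once the last writer is gone, the read end reports EOF.
    os.close(w)
    ready_after, _, _ = select.select([r], [], [], timeout)
    data = os.read(r, 1024) if ready_after else None

    os.close(r)
    os.waitpid(pid, 0)  # reap the zombie
    return bool(ready_before), bool(ready_after), data
```

If the wedged process were stuck this way, it would sit in select() with a zombie child beside it, but that is only a guess until someone confirms where the process is actually blocked.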
The sequence of events was this:
After processing several messages to the list successfully, some
post at 11:48 wedged, apparently after delivering mail to part
(possibly all) of the subscribers. The next post to the list came
through at 1:21PM and also wedged, again after delivering mail to
part of the subscriber list.
I think that if I do nothing, all subsequent posts to the list will
leave a wedged process behind, after apparently working. If I wipe out
the wedged processes, subsequent posts will work without leaving anything
behind... for a while, and then it will happen again.
What can I do to investigate this and find out what is going on?
I'm assuming that no-one can tell what's happening just based on
this, or I would have got an answer last time.
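One check that might help is to confirm which process is the parent of each zombie, rather than guessing from adjacent PIDs. The sketch below is a hypothetical helper, not part of Mailman. It assumes the output of `ps -axo pid,ppid,stat,command` (header line included) is passed in as text, and it pairs every zombie with its parent PID.

```python
def zombie_parents(ps_text):
    """Map each zombie PID to its parent PID.

    ps_text is assumed to be the output of
    `ps -axo pid,ppid,stat,command`, including the header line.
    """
    pairs = {}
    for line in ps_text.splitlines()[1:]:  # skip the header
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # blank or malformed line
        pid, ppid, stat = fields[0], fields[1], fields[2]
        # A STAT value beginning with "Z" marks a zombie.
        if stat.startswith("Z"):
            pairs[int(pid)] = int(ppid)
    return pairs
```

If the parents reported for the zombies turn out to be the wedged "post" processes, that would support the idea that each one is stuck waiting on its own child.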
I'm running FreeBSD 3.2 with Python 1.5.2, with Mailman CVS updated
from the archive and reinstalled last night.
Justin
On Fri, Sep 17, 1999 at 03:34:27PM -0400, Barry A. Warsaw wrote:
>
> >>>>> "JW" == Justin Wells <jread at fever.semiotek.com> writes:
>
> JW> Sorry to be so persistent about this, but every 2nd day I've
> JW> had to unwedge mailman, and I still haven't got an answer from
> JW> this list.
>
> JW> platform: FreeBSD 3.2, Python 1.5.2, mailman 1.0
>
> JW> This sounds a lot like the locking problem described in
> JW> README.BSD, however my version of python is up to date.
>
> Try upgrading to the current CVS snapshot of Mailman. It uses a more
> portable (and consistent) locking mechanism.
>
> -Barry
>
> ------------------------------------------------------
> Mailman-Users maillist - Mailman-Users at python.org
> http://www.python.org/mailman/listinfo/mailman-users