[Mailman-Users] Mailman no longer working or working very slowly
Mark Sapiro
mark at msapiro.net
Thu Feb 18 00:46:28 CET 2010
Steven Jones wrote:
>
>Is there a way forward? I have to hand this on as I have to fix something else...will come back to it in about 4 hours from now....
>
OK. As you can see, I too have been offline for a while.
>
>-----Original Message-----
>From: Mark Sapiro [mailto:mark at msapiro.net]
>Sent: Thursday, 18 February 2010 3:56 a.m.
>To: Steven Jones; mailman-users at python.org
>Subject: Re: [Mailman-Users] Mailman no longer working or working very slowly
>
>Steven Jones wrote:
>>
>>Yesterday, after 5 years of operation, our Mailman application "died". It seems to be barely running, taking 3 or more hours to process lists. Previous load was <0.2; now it's around 1, with a python process absorbing 1 CPU constantly.
>
>
>Which python process, i.e. which qrunner?
>
>============
>how do I identify such a "qrunner?"
>============
    ps -fwu mailman

will list all the Mailman processes and the commands that invoked them,
which include the qrunner names.
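
If the ps listing is long, something like the following may help pick the runner names out of it. This is only a sketch in current Python 3 (not the Python 2.2 on that box), and it assumes the Mailman 2.1 style of command line, where each qrunner is started with a `--runner=NAME:slice:range` switch. The function name `runner_by_pid` and the example path in the comment are mine, not anything Mailman ships.

```python
import re

# Assumed shape of a Mailman 2.1 qrunner command line, for example:
#   /usr/bin/python /usr/lib/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s
# The paths are illustrative; only the --runner=NAME:slice:range part is used.
RUNNER_RE = re.compile(r"--runner=(\w+):\d+:\d+")


def runner_by_pid(ps_text):
    """Map PID -> runner name from `ps -fwu mailman` output.

    In `ps -f` format the PID is the second whitespace-separated column.
    Lines without a --runner switch (header, mailmanctl) are skipped.
    """
    runners = {}
    for line in ps_text.splitlines():
        match = RUNNER_RE.search(line)
        fields = line.split()
        if match and len(fields) > 1 and fields[1].isdigit():
            runners[int(fields[1])] = match.group(1)
    return runners
```

Feed it the saved output of the ps command, then look up the PID that top shows as eating the CPU.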
>Also look in Mailman's qfiles/* directories to see which one has a large
>number of messages.
>
>==========
>It's empty,
>
>[root at vuwunicosmtp004 mailman]# cd /var/spool/mailman/
>[root at vuwunicosmtp004 mailman]# ls -al
>total 12
>drwxr-xr-x 3 root root 4096 Sep 6 2005 .
>drwxr-xr-x 16 root root 4096 Feb 17 15:12 ..
>drwxrwsr-x 2 root mailman 4096 Mar 22 2007 qfiles
>[root at vuwunicosmtp004 mailman]# cd qfiles/
>[root at vuwunicosmtp004 qfiles]# ls -al
>total 8
>drwxrwsr-x 2 root mailman 4096 Mar 22 2007 .
>drwxr-xr-x 3 root root 4096 Sep 6 2005 ..
>[root at vuwunicosmtp004 qfiles]# ls -al
>total 8
>drwxrwsr-x 2 root mailman 4096 Mar 22 2007 .
>drwxr-xr-x 3 root root 4096 Sep 6 2005 ..
>[root at vuwunicosmtp004 qfiles]#
>
>==========
Then your qfiles are somewhere else. Even if the queues are empty,
there will still be one directory per queue:

    archive bad bounces commands in news out retry shunt virgin

(well, maybe not 'bad', but all the others).
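
Once you find the real qfiles directory, a quick count per queue shows where things are piling up. A minimal sketch in current Python 3; `queue_counts` is just a name I made up, and the directory you pass in is whatever your installation uses (the shunt path quoted in your error report suggests /var/mailman/qfiles, but I am only guessing from that).

```python
import os

# Queue names as listed above; 'bad' may be absent on some installs.
QUEUES = ["archive", "bad", "bounces", "commands", "in",
          "news", "out", "retry", "shunt", "virgin"]


def queue_counts(qfiles_dir):
    """Return {queue name: number of entries}, with None for a missing directory."""
    counts = {}
    for name in QUEUES:
        path = os.path.join(qfiles_dir, name)
        # A missing directory is reported as None rather than 0, so an
        # empty queue can be told apart from a wrong qfiles location.
        counts[name] = len(os.listdir(path)) if os.path.isdir(path) else None
    return counts
```

If nearly everything comes back as None, the path is wrong, which is what the listing above looks like.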
>
>>We are running on RHEL3-32bit and the errors are,
>>
>>============
>>
>>Error log for Mailman (vuwunicosmtp004.vuw.ac.nz:/var/log/mailman/error)
>>
>>says:
>>
>>
>>
>>RuntimeError: maximum recursion depth exceeded
>>
>>
>>
>>Feb 17 07:48:26 2010 (13368) SHUNTING: 1266281245.229378+bd3f1d42e27ad38cf532b809460a0b0a8aef00e7
>>
>>
>>
>>The last number is a message ID in Mailman queue
>>
>>/var/mailman/qfiles/shunt/1266281245.229378+bd3f1d42e27ad38cf532b809460a0b0a8aef00e7.pck
>>============
>
>
>How many of these are there?
>
>========
>seems a lot.....
>
>The error log is significantly bigger because of them,
>
>[root at vuwunicosmtp004 mailman]# pwd
>/var/log/mailman
>[root at vuwunicosmtp004 mailman]# ls -l
>total 81400
>8><-----
>-rw-rw-r-- 1 root mailman 79321195 Feb 18 08:05 error
>-rw-rw-r-- 1 root mailman 6997 Feb 13 16:42 error.1
>-rw-rw-r-- 1 root mailman 11798 Feb 7 04:02 error.2
>-rw-rw-r-- 1 root mailman 7142 Jan 31 03:19 error.3
>-rw-rw-r-- 1 root mailman 3313 Jan 24 01:54 error.4
>8><-----
>========
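
To put a number on "a lot", the SHUNTING lines in the error log can be counted directly. This is a rough sketch in current Python 3, assuming every such line looks like the two quoted in this thread (timestamp, PID in parentheses, then the queue entry name). `summarize_shunts` is a hypothetical helper, not a Mailman tool.

```python
import re

# Matches lines such as:
#   Feb 17 07:48:26 2010 (13368) SHUNTING: 1266281245.229378+bd3f...
SHUNT_RE = re.compile(r"\((\d+)\) SHUNTING: (\S+)")


def summarize_shunts(lines):
    """Return (total SHUNTING lines, distinct queue entry names, distinct PIDs)."""
    total, entries, pids = 0, set(), set()
    for line in lines:
        match = SHUNT_RE.search(line)
        if match:
            total += 1
            pids.add(int(match.group(1)))
            entries.add(match.group(2))
    return total, entries, pids
```

For example, `summarize_shunts(open("/var/log/mailman/error", errors="replace"))`. If the total is much larger than the number of distinct entries, the same messages are being shunted over and over; a single PID would mean one runner is doing all of it.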
>
>
>This may be unrelated. Is there a
>traceback with the above error? What is it?
>
>===========
>this?
>===========
>File "/usr/lib64/python2.2/copy.py", line 186, in deepcopy
> y = copierfunction(x, memo)
> File "/usr/lib64/python2.2/copy.py", line 283, in _deepcopy_inst
> state = deepcopy(state, memo)
> File "/usr/lib64/python2.2/copy.py", line 186, in deepcopy
> y = copierfunction(x, memo)
> File "/usr/lib64/python2.2/copy.py", line 246, in _deepcopy_dict
> y[deepcopy(key, memo)] = deepcopy(x[key], memo)
> File "/usr/lib64/python2.2/copy.py", line 186, in deepcopy
> y = copierfunction(x, memo)
> File "/usr/lib64/python2.2/copy.py", line 219, in _deepcopy_list
> y.append(deepcopy(a, memo))
[...]
>RuntimeError: maximum recursion depth exceeded
>
>Feb 18 08:05:23 2010 (22075) SHUNTING: 1266281265.9402289+108ada57b8e680da231d7a75a9bf50e08bbce3fe
>[root at vuwunicosmtp004 mailman]#
Yes, that. Something is really hosed. Have you tried just restarting
Mailman?
What's at the start of that traceback leading up to the first call to
deepcopy?
This could also be some kind of list object corruption leading to a
circular reference. Hard to say without more information.
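
For what it's worth, here is a small illustration of how copy.deepcopy can end in "maximum recursion depth exceeded". It is a sketch in current Python 3, not the Python 2.2 in your traceback, and it says nothing about what is actually inside your shunted messages. deepcopy keeps already-copied objects in its memo dict, so the sketch uses very deep nesting rather than a simple cycle to trigger the error. The helper names are made up for the example.

```python
import copy
import sys


def nested_list(depth):
    """Build [[[...[]...]]] with `depth` levels, iteratively (no recursion)."""
    outer = []
    for _ in range(depth):
        outer = [outer]
    return outer


def deepcopy_overflows(obj):
    """Return True if copy.deepcopy(obj) runs out of recursion depth."""
    try:
        copy.deepcopy(obj)
    except RecursionError:  # a subclass of RuntimeError in Python 3
        return True
    return False
```

A list nested a handful of levels copies fine, while one nested several times deeper than sys.getrecursionlimit() does not, because deepcopy descends one or more stack frames per level just as in the repeating deepcopy / _deepcopy_list frames of the traceback.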
--
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan