[Mailman-Users] Messages got stuck in "in" queue due to one "bad" message

Xueshan Feng sfeng at stanford.edu
Tue Feb 20 23:41:21 CET 2007


Hello,

We are running mailman 2.1.9. Recently we had a problem that mailman's
Incoming qrunner died hard, which caused messages accumulated under
qfiles/in directory.  Restarting mailman didn't help. We finally
identified a message on top of the queue, moved it aside, restarted
mailman again, then mail will start flow. 

Putting that message back to the in queue will trigger the problem. The
sender actually tried to send it again, and it caused the same problem.

Before I got into the details of mailman logs, I'd like to ask:

1. What's the best way to detect "stalled" condition like this? We do
monitor qrunner processes but in this case it didn't help because the
qrunners were still running. Now I put in a script to monitor the number
of files in "in" queue, and if it reaches a threshold, sends alert. 

2. Why this message didn't get moved to shunt directory?

3. Can the mailman recover itself without human intervention? 

Details are followed:

The following error message will repeat a couple of times in error log
when this message was processed: 


Feb 20 07:55:53 2007 qrunner(25722): Traceback (most recent call last):
Feb 20 07:55:53 2007 qrunner(25722):   File
"/var/lib/mailman/bin/qrunner", line 278, in ?
Feb 20 07:55:53 2007 qrunner(25722):      main()
Feb 20 07:55:53 2007 qrunner(25722):   File
"/var/lib/mailman/bin/qrunner", line 238, in main
Feb 20 07:55:53 2007 qrunner(25722):      qrunner.run()
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/mailman/Mailman/Queue/Runner.py", line 71, in run
Feb 20 07:55:53 2007 qrunner(25722):      filecnt = self._oneloop()
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/mailman/Mailman/Queue/Runner.py", line 100, in _oneloop
Feb 20 07:55:53 2007 qrunner(25722):      msg, msgdata =
self._switchboard.dequeue(filebase)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/mailman/Mailman/Queue/Switchboard.py", line 164, in dequeue
Feb 20 07:55:53 2007 qrunner(25722):      msg =
email.message_from_string(msg, Message.Message)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/__init__.py", line 52, in message_from_string
Feb 20 07:55:53 2007 qrunner(25722):      return Parser(_class,
strict=strict).parsestr(s)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Parser.py", line 75, in parsestr
Feb 20 07:55:53 2007 qrunner(25722):      return
self.parse(StringIO(text), headersonly=headersonly)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Parser.py", line 64, in parse
Feb 20 07:55:53 2007 qrunner(25722):      self._parsebody(root, fp,
firstbodyline)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Parser.py", line 240, in _parsebody
Feb 20 07:55:53 2007 qrunner(25722):      msgobj = self.parsestr(part)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Parser.py", line 75, in parsestr
Feb 20 07:55:53 2007 qrunner(25722):      return
self.parse(StringIO(text), headersonly=headersonly)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Parser.py", line 64, in parse
Feb 20 07:55:53 2007 qrunner(25722):      self._parsebody(root, fp,
firstbodyline)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Parser.py", line 146, in _parsebody
Feb 20 07:55:53 2007 qrunner(25722):      boundary =
container.get_boundary()
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Message.py", line 743, in get_boundary
Feb 20 07:55:53 2007 qrunner(25722):      boundary =
self.get_param('boundary', missing)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Message.py", line 608, in get_param
Feb 20 07:55:53 2007 qrunner(25722):      for k, v in
self._get_params_preserve(failobj, header):
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Message.py", line 555, in _get_params_preserve
Feb 20 07:55:53 2007 qrunner(25722):      params =
Utils.decode_params(params)
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Utils.py", line 337, in decode_params
Feb 20 07:55:53 2007 qrunner(25722):      charset, language, value =
decode_rfc2231(EMPTYSTRING.join(value))
Feb 20 07:55:53 2007 qrunner(25722):   File
"/usr/lib/python2.3/email/Utils.py", line 284, in decode_rfc2231
Feb 20 07:55:53 2007 qrunner(25722):      charset, language, s = parts
Feb 20 07:55:53 2007 qrunner(25722): ValueError :  unpack list of wrong
size

The qrunner log shows:
Feb 20 07:55:52 2007 (27933) Master qrunner detected subprocess exit
(pid: 27941, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:53 2007 (25722) IncomingRunner qrunner started.
Feb 20 07:55:53 2007 (27933) Master qrunner detected subprocess exit
(pid: 25722, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:53 2007 (25723) IncomingRunner qrunner started.
Feb 20 07:55:53 2007 (27933) Master qrunner detected subprocess exit
(pid: 25723, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:53 2007 (25724) IncomingRunner qrunner started.
Feb 20 07:55:53 2007 (27933) Master qrunner detected subprocess exit
(pid: 25724, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:53 2007 (25725) IncomingRunner qrunner started.
Feb 20 07:55:53 2007 (27933) Master qrunner detected subprocess exit
(pid: 25725, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:53 2007 (25726) IncomingRunner qrunner started.
Feb 20 07:55:54 2007 (27933) Master qrunner detected subprocess exit
(pid: 25726, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:54 2007 (25727) IncomingRunner qrunner started.
Feb 20 07:55:54 2007 (27933) Master qrunner detected subprocess exit
(pid: 25727, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:54 2007 (25728) IncomingRunner qrunner started.
Feb 20 07:55:54 2007 (27933) Master qrunner detected subprocess exit
(pid: 25728, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:54 2007 (25729) IncomingRunner qrunner started.
Feb 20 07:55:54 2007 (27933) Master qrunner detected subprocess exit
(pid: 25729, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:54 2007 (25731) IncomingRunner qrunner started.
Feb 20 07:55:55 2007 (27933) Master qrunner detected subprocess exit
(pid: 25731, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]
Feb 20 07:55:55 2007 (25734) IncomingRunner qrunner started.
Feb 20 07:55:55 2007 (27933) Master qrunner detected subprocess exit
(pid: 25734, sig: None, sts: 1, class: IncomingRunner, slice: 1/4)
[restarting]


Xueshan
--




More information about the Mailman-Users mailing list