[Mailman-Users] Troublesome Mailman Errors

George Booth G.Booth at usm.edu
Wed Jun 20 20:36:39 CEST 2007


Mark Sapiro wrote:

>There is a VERY large message in the lists/<listname>/digest.mbox file
>for the list that is being posted to at the time that these errors
>occur.

>This is causing Mailman (IncomingRunner) to grow large in parsing this
>message to the point it is denied additional memory by the OS.

>Find the offending list and either move the digest.mbox aside or edit
>the file and remove the huge message. You should be able to find the
>digest.mbox by just looking for the huge one.

There was indeed a very large digest.mbox for one of the lists (not for the
one we've been having issues with, though), so I removed the digest.mbox
(that list is set up to be used as a departmental email address, so there
are only 2 recipients on the list and they weren't digesting anyway), and
that removed the MemoryErrors I'd been seeing. BTW, I should mention that
my version of Mailman is 2.1.5.

However, now I'm getting different errors. When I restarted Mailman after
removing the digest.mbox file, the errors were:

Jun 20 05:36:40 2007 (28020) Cannot connect to SMTP server localhost on port smtp
Jun 20 06:00:13 2007 (20879) Cannot connect to SMTP server localhost on port smtp
Jun 20 08:02:55 2007 (23371) Cannot connect to SMTP server localhost on port smtp

Each instance above is after a restart of the Mailman service.

The server admins checked smtp on the box, did some things that they didn't
share with me, and now I'm getting:

Jun 20 09:17:20 2007 (31631) Uncaught runner exception: SMTP instance has no attribute 'sock'
Jun 20 09:17:20 2007 (31631) Traceback (most recent call last):
  File "/var/mailman/Mailman/Queue/Runner.py", line 111, in _oneloop
    self._onefile(msg, msgdata)
  File "/var/mailman/Mailman/Queue/Runner.py", line 167, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/var/mailman/Mailman/Queue/OutgoingRunner.py", line 73, in _dispose
    self._func(mlist, msg, msgdata)
  File "/var/mailman/Mailman/Handlers/SMTPDirect.py", line 163, in process
    conn.quit()
  File "/var/mailman/Mailman/Handlers/SMTPDirect.py", line 90, in quit
    self.__conn.quit()
  File "/usr/lib/python2.3/smtplib.py", line 708, in quit
    self.docmd("quit")
  File "/usr/lib/python2.3/smtplib.py", line 369, in docmd
    self.putcmd(cmd,args)
  File "/usr/lib/python2.3/smtplib.py", line 325, in putcmd
    self.send(str)
  File "/usr/lib/python2.3/smtplib.py", line 310, in send
    if self.sock:
AttributeError: SMTP instance has no attribute 'sock'

Jun 20 09:17:20 2007 (31631) SHUNTING: 1182344407.694294+91d21113f1e46e0b450308adbe350c1c50faeff1

At this point, I have 1500+ emails in the outgoing queue, which won't go
anywhere. I've done more checking online and found several things to try,
including testing Python's SMTP connection and changing SMTPHOST from
'localhost' to the actual IP of the box in mm_cfg.py, just in case, with no
change.
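
For anyone repeating the Python SMTP test mentioned above, here is a minimal
sketch of one way to do it. It is written for a current Python 3 interpreter
run by hand on the mail host, not for the Python 2.3 that Mailman 2.1.5 uses
here, and the function name check_smtp, the default port 25, and the 10-second
timeout are assumptions of mine, not anything taken from Mailman.

```python
import smtplib


def check_smtp(host="localhost", port=25, timeout=10.0):
    """Try to open an SMTP session and send NOOP.

    Returns (ok, detail). The name and defaults are illustrative only.
    """
    try:
        # Passing a host makes the constructor connect immediately.
        conn = smtplib.SMTP(host, port, timeout=timeout)
    except (OSError, smtplib.SMTPException) as exc:
        return False, "connect failed: %s" % exc
    try:
        code, reply = conn.noop()
        text = reply.decode("ascii", "replace")
        return code == 250, "NOOP replied %d %s" % (code, text)
    except (OSError, smtplib.SMTPException) as exc:
        return False, "session failed: %s" % exc
    finally:
        try:
            conn.quit()
        except (OSError, smtplib.SMTPException):
            pass  # The server may already have dropped the connection.


if __name__ == "__main__":
    print(check_smtp("localhost", 25))
```

Running it once against 'localhost' and once against the box's IP would show
whether the two SMTPHOST values behave differently outside of Mailman.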

My mailman/smtp log shows some messages going through:

Jun 20 09:41:03 2007 (15107) <200706201149.l5KBnHH0029291 at mail.usm.edu> smtp for 2 recips, completed in 0.009 seconds
Jun 20 09:41:03 2007 (15107) <200706201150.l5KBoNUw006869 at auxprd.usm.edu> smtp for 6 recips, completed in 0.005 seconds
Jun 20 09:41:03 2007 (15107) <mailman.9.1182340507.23369.sharecare at usm.edu> smtp for 1 recips, completed in 0.003 seconds
Jun 20 09:41:03 2007 (15107) <mailman.10.1182340507.23369.sharecare at usm.edu> smtp for 1 recips, completed in 0.009 seconds
Jun 20 09:41:03 2007 (15107) <mailman.11.1182340592.23369.rutabaga at usm.edu> smtp for 1 recips, completed in 0.003 seconds
Jun 20 09:41:03 2007 (15107) <mailman.12.1182340592.23369.rutabaga at usm.edu> smtp for 1 recips, completed in 0.008 seconds
Jun 20 09:41:06 2007 (15547) <200706201200.l5KC00cm000949 at www.usm.edu> smtp for 1 recips, completed in 0.733 seconds
Jun 20 11:10:39 2007 (28494) <001601c25c6a$f7e82470$001ad6b4 at casa9cfb6824b4> smtp for 3 recips, completed in 3599.674 seconds
Jun 20 12:10:39 2007 (28494) <001501c7b31a$e9beb6a0$000a4dd4 at ntserver> smtp for 1 recips, completed in 3599.660 seconds
Jun 20 13:10:39 2007 (28494) <2154.82.128.1.35.1181656670.squirrel at 72.36.213.90> smtp for 1 recips, completed in 3599.643 seconds
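
The last three entries above each took just under 3600 seconds, i.e. about
an hour per message. A rough sketch for pulling such entries out of the smtp
log follows. It assumes one entry per line, as in the log file itself rather
than in this wrapped email, and the names SMTP_LINE and slow_deliveries and
the 60-second threshold are made up for illustration.

```python
import re

# Matches entries of the form shown in the smtp log excerpt above.
SMTP_LINE = re.compile(
    r"^(?P<stamp>\w{3} +\d+ [\d:]+ \d{4}) \((?P<pid>\d+)\) "
    r"(?P<msgid><.*>) smtp for (?P<recips>\d+) recips, "
    r"completed in (?P<secs>\d+(?:\.\d+)?) seconds$"
)


def slow_deliveries(lines, threshold=60.0):
    """Return (timestamp, message-id, seconds) for entries at or over threshold."""
    slow = []
    for line in lines:
        match = SMTP_LINE.match(line.strip())
        if match and float(match.group("secs")) >= threshold:
            slow.append(
                (match.group("stamp"), match.group("msgid"),
                 float(match.group("secs")))
            )
    return slow


# Two entries copied from the excerpt above, joined onto single lines.
sample = [
    "Jun 20 09:41:06 2007 (15547) <200706201200.l5KC00cm000949 at www.usm.edu>"
    " smtp for 1 recips, completed in 0.733 seconds",
    "Jun 20 11:10:39 2007 (28494) <001601c25c6a$f7e82470$001ad6b4 at casa9cfb6824b4>"
    " smtp for 3 recips, completed in 3599.674 seconds",
]

for stamp, msgid, secs in slow_deliveries(sample):
    print(stamp, msgid, secs)
```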

But my post log shows failures:

Jun 20 09:41:03 2007 (15107) post to george from mail at usm.edu, size=74664, message-id=<200706201149.l5KBnHH0029291 at mail.usm.edu>, 2 failures
Jun 20 09:41:03 2007 (15107) post to appsupport from appsup at auxprd.usm.edu, size=2249, message-id=<200706201150.l5KBoNUw006869 at auxprd.usm.edu>, 6 failures
Jun 20 09:41:03 2007 (15107) post to sharecare from sharecare-bounces at usm.edu, size=1010, message-id=<mailman.9.1182340507.23369.sharecare at usm.edu>, 1 failures
Jun 20 09:41:03 2007 (15107) post to sharecare from sharecare-owner at usm.edu, size=5000, message-id=<mailman.10.1182340507.23369.sharecare at usm.edu>, 1 failures
Jun 20 09:41:03 2007 (15107) post to rutabaga from rutabaga-bounces at usm.edu, size=1024, message-id=<mailman.11.1182340592.23369.rutabaga at usm.edu>, 1 failures
Jun 20 09:41:03 2007 (15107) post to rutabaga from rutabaga-owner at usm.edu, size=3794, message-id=<mailman.12.1182340592.23369.rutabaga at usm.edu>, 1 failures
Jun 20 09:41:06 2007 (15547) post to system-reports from root at www.usm.edu, size=2264, message-id=<200706201200.l5KC00cm000949 at www.usm.edu>, 1 failures
Jun 20 11:10:39 2007 (28494) post to itc from dlamongst at link2rewards.com, size=2524, message-id=<001601c25c6a$f7e82470$001ad6b4 at casa9cfb6824b4>, 3 failures
Jun 20 12:10:39 2007 (28494) post to hr from scredential at adore.dk, size=3975, message-id=<001501c7b31a$e9beb6a0$000a4dd4 at ntserver>, 1 failures
Jun 20 13:10:39 2007 (28494) post to med.relations from edward_moore22 at yahoo.co.uk, size=4429, message-id=<2154.82.128.1.35.1181656670.squirrel at 72.36.213.90>, 1 failures

That is also not many attempts, given the number of messages waiting in the
outgoing queue.
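
To compare attempts against the queue size, the failure counts in the post
log can be totalled per list. The sketch below only understands the
"N failures" entries in the form shown above, assumes one entry per line,
and uses the invented names POST_FAILURE and failures_by_list.

```python
import re

# Matches the "... N failures" entries from the post log excerpt above.
POST_FAILURE = re.compile(
    r"post to (?P<list>\S+) from (?P<sender>.+?), size=(?P<size>\d+), "
    r"message-id=(?P<msgid><.*>), (?P<count>\d+) failures$"
)


def failures_by_list(lines):
    """Sum the failed-recipient counts per list name."""
    totals = {}
    for line in lines:
        match = POST_FAILURE.search(line.strip())
        if match:
            name = match.group("list")
            totals[name] = totals.get(name, 0) + int(match.group("count"))
    return totals


# Three entries copied from the excerpt above, joined onto single lines.
sample = [
    "Jun 20 09:41:03 2007 (15107) post to george from mail at usm.edu, "
    "size=74664, message-id=<200706201149.l5KBnHH0029291 at mail.usm.edu>, "
    "2 failures",
    "Jun 20 09:41:03 2007 (15107) post to sharecare from "
    "sharecare-bounces at usm.edu, size=1010, "
    "message-id=<mailman.9.1182340507.23369.sharecare at usm.edu>, 1 failures",
    "Jun 20 09:41:03 2007 (15107) post to sharecare from "
    "sharecare-owner at usm.edu, size=5000, "
    "message-id=<mailman.10.1182340507.23369.sharecare at usm.edu>, 1 failures",
]

print(failures_by_list(sample))
```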

The smtp-failure log doesn't show anything useful that I can see:

Jun 20 09:59:57 2007 (23726) Low level smtp error: [Errno 32] Broken pipe, msgid: <001601c25c6a$f7e82470$001ad6b4 at casa9cfb6824b4>
Jun 20 10:10:40 2007 (28494) Low level smtp error: [Errno 32] Broken pipe, msgid: <001601c25c6a$f7e82470$001ad6b4 at casa9cfb6824b4>
Jun 20 11:10:39 2007 (28494) delivery to Steven.Blesse at usm.edu failed with code -1: [Errno 32] Broken pipe
Jun 20 11:10:39 2007 (28494) delivery to Shelton.Houston at usm.edu failed with code -1: [Errno 32] Broken pipe
Jun 20 11:10:39 2007 (28494) delivery to Christopher.Herrod at usm.edu failed with code -1: [Errno 32] Broken pipe
Jun 20 11:10:40 2007 (28494) Low level smtp error: [Errno 32] Broken pipe, msgid: <001501c7b31a$e9beb6a0$000a4dd4 at ntserver>
Jun 20 12:10:39 2007 (28494) delivery to Amy.Byxbe at usm.edu failed with code -1: [Errno 32] Broken pipe
Jun 20 12:10:39 2007 (28494) Low level smtp error: [Errno 32] Broken pipe, msgid: <2154.82.128.1.35.1181656670.squirrel at 72.36.213.90>
Jun 20 13:10:39 2007 (28494) delivery to Tyshuna.Dyess at usm.edu failed with code -1: [Errno 32] Broken pipe
Jun 20 13:10:39 2007 (28494) Low level smtp error: [Errno 32] Broken pipe, msgid: <200706200100.l5K101Nx032748 at auxprd.usm.edu>

At this point, I'm at a loss. I've gone through the steps outlined in the FAQ
entry "4.78. Troubleshooting: No mail going out to lists members", and all
seems well there. The only other thing I can think of is that under my
/var/mailman/locks directory, I have the following:

[root at mail1 locks]# ls -al
total 8
drwxrwsr-x   2 root    mailman 4096 Jun 20 13:30 .
drwxrwsr-x  21 root    mailman 4096 Jun 19 20:02 ..
-rw-rw-r--   2 mailman mailman   53 Jun 21  2007 master-qrunner
-rw-rw-r--   2 mailman mailman   53 Jun 21  2007 master-qrunner.mail1.usm.edu.28475
-rw-rw-r--   1 mailman mailman   53 Jan 18  2006 master-qrunner.mail.usm.edu.1757.1
[root at mail1 locks]#

I don't know if these are required or if I should delete them, or if they
even have anything whatsoever to do with the issue at hand. I did find it
odd that two of them seem to be dated as June 21, 2007, which is tomorrow
(assuming I read that correctly), although a check of the "date" command on
the server shows the correct date and time.
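
One way to see whether a lock file still belongs to a running process is to
read the host name and PID out of its file name and check that PID on the
named host. The sketch below assumes the names follow the pattern
<lock>.<hostname>.<pid> seen in the listing above and deliberately refuses to
guess when a name ends in more than one numeric field, as the 2006 file does.
It is POSIX-only Python 3, and parse_lock_name and pid_alive are names
invented for this example, not part of Mailman.

```python
import os


def parse_lock_name(name):
    """Split '<lock>.<hostname>.<pid>' into (lock, hostname, pid).

    Returns None when the name has no host/pid part or is ambiguous.
    """
    parts = name.split(".")
    if len(parts) < 3:
        return None
    trailing_numeric = 0
    for part in reversed(parts[1:]):
        if part.isdigit():
            trailing_numeric += 1
        else:
            break
    if trailing_numeric != 1:
        return None  # e.g. '...1757.1': unclear which field is the PID
    return parts[0], ".".join(parts[1:-1]), int(parts[-1])


def pid_alive(pid):
    """Report whether a process with this PID exists on the local host.

    POSIX only: signal 0 checks for existence without sending a signal.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True


if __name__ == "__main__":
    for entry in sorted(os.listdir("/var/mailman/locks")):
        parsed = parse_lock_name(entry)
        if parsed is None:
            print(entry, "-> not parsed")
        else:
            lock, host, pid = parsed
            print(entry, "->", host, pid, "alive" if pid_alive(pid) else "gone")
```

The PID check only means something when run on the host named in the file, so
a name mentioning mail.usm.edu tells you nothing when checked from mail1.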

Once again, any help would be greatly appreciated.

Thanks,

George Booth

<>-<>-<>-<>-<>-<>-<>-<>-<>
George Booth     iTech Applications Administrator
G.Booth at usm.edu
University of Southern Mississippi




