[Mailman-Users] Further to "Need Help with Mailman Mail Delivery"

Chuck Weinstock weinstock at conjelco.com
Tue Sep 27 10:54:14 EDT 2016


Mark,

I referred to https://wiki.list.org/x/17891756 before I even contacted the list.

All of the cgi wrappers are suid. check_perms run as root finds no problems. 

One thing I noticed is that there was no locks directory anywhere in the installation. Is this normal? (Places I looked: /var/lib/mailman /usr/lib/mailman and /etc/mailman.)
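A minimal sketch for checking several candidate locations for Mailman lock files. The first three paths are just the directories named above with a hypothetical "locks" subdirectory appended; /var/lock/mailman is an assumption about where a distribution package might keep locks, not something confirmed for this install. The helper name find_lock_dirs is made up for illustration.

```python
import glob
import os

# Hypothetical candidates: the first three are the directories named above
# with "locks" appended; /var/lock/mailman is an assumption, not confirmed
# for this installation.
CANDIDATE_DIRS = [
    "/var/lib/mailman/locks",
    "/usr/lib/mailman/locks",
    "/etc/mailman/locks",
    "/var/lock/mailman",
]


def find_lock_dirs(candidates):
    """Return {directory: sorted lock-file names} for each candidate that exists."""
    found = {}
    for path in candidates:
        if os.path.isdir(path):
            # Match both "<list>.lock" and per-process names that extend it.
            matches = glob.glob(os.path.join(path, "*.lock*"))
            found[path] = sorted(os.path.basename(p) for p in matches)
    return found


if __name__ == "__main__":
    for directory, names in find_lock_dirs(CANDIDATE_DIRS).items():
        print(directory, names)
```

An existing directory with an empty list just means no locks are held at the moment; a directory missing from the output does not exist at that path.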

Also, /var/log/mailman/lock was empty before, but it now shows:

Sep 27 10:51:45 2016 (12700) fttc.lock lifetime has expired, breaking
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/bin/qrunner", line 278, in <module>
Sep 27 10:51:45 2016 (12700)     main()
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/bin/qrunner", line 238, in main
Sep 27 10:51:45 2016 (12700)     qrunner.run()
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 70, in run
Sep 27 10:51:45 2016 (12700)     filecnt = self._oneloop()
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 119, in _oneloop
Sep 27 10:51:45 2016 (12700)     self._onefile(msg, msgdata)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 190, in _onefile
Sep 27 10:51:45 2016 (12700)     keepqueued = self._dispose(mlist, msg, msgdata)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/Queue/IncomingRunner.py", line 115, in _dispose
Sep 27 10:51:45 2016 (12700)     mlist.Lock(timeout=mm_cfg.LIST_LOCK_TIMEOUT)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/MailList.py", line 161, in Lock
Sep 27 10:51:45 2016 (12700)     self.__lock.lock(timeout)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/LockFile.py", line 306, in lock
Sep 27 10:51:45 2016 (12700)     important=True)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/LockFile.py", line 416, in __writelog
Sep 27 10:51:45 2016 (12700)     traceback.print_stack(file=logf)

(The lock named there, fttc.lock, belongs to the list in question.)
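To see how often each list's lock is being broken, the log can be scanned for lines shaped like the first entry above. This is a rough sketch that assumes every lock-break entry has exactly that format; the function name count_broken_locks is hypothetical, and the log path is passed in by the caller rather than assumed.

```python
import re
from collections import Counter

# Matches lines shaped like the first log entry above, e.g.
# "Sep 27 10:51:45 2016 (12700) fttc.lock lifetime has expired, breaking"
# (assumption: all lock-break entries share this format).
BREAK_RE = re.compile(
    r"^(?P<when>\w{3} +\d+ \d\d:\d\d:\d\d \d{4}) \((?P<pid>\d+)\) "
    r"(?P<lock>\S+) lifetime has expired, breaking$"
)


def count_broken_locks(lines):
    """Count lock-break entries per lock name, ignoring traceback lines."""
    counts = Counter()
    for line in lines:
        match = BREAK_RE.match(line.rstrip())
        if match:
            counts[match.group("lock")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_breaks.py <path-to-lock-log>
    with open(sys.argv[1]) as log_file:
        for lock_name, count in count_broken_locks(log_file).most_common():
            print(lock_name, count)
```

If one list's lock is broken repeatedly, that points at a process holding the lock for longer than its lifetime.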

Thanks again,

Chuck

> On Sep 27, 2016, at 10:29 AM, Mark Sapiro <mark at msapiro.net> wrote:
> 
> On 09/27/2016 06:55 AM, Chuck Weinstock wrote:
>> Whoops. The reinstalled Mailman stopped working with the same problem
>> overnight. Two of the eight qrunners crashed.
>> 
>> I have 3-4 lists and one of them will not open in the web admin
>> interface. It times out as per the apache log:
>> 
>> [Tue Sep 27 09:45:53.591373 2016] [cgi:warn] [pid 2483] [client
>> 128.237.211.152:49581] AH01220: Timeout waiting for output from CGI
>> script /usr/lib/mailman/cgi-bin/admin, referer:
>> http://www.conjel.co/mailman/admin/fttc
>> [Tue Sep 27 09:45:53.592426 2016] [cgi:error] [pid 2483] [client
>> 128.237.211.152:49581] Script timed out before returning headers: admin,
>> referer: http://www.conjel.co/mailman/admin/fttc
>> [Tue Sep 27 09:46:53.639699 2016] [cgi:warn] [pid 2483] [client
>> 128.237.211.152:49581] AH01220: Timeout waiting for output from CGI
>> script /usr/lib/mailman/cgi-bin/admin, referer:
>> http://www.conjel.co/mailman/admin/fttc
>> [Tue Sep 27 09:46:53.640524 2016] [reqtimeout:info] [pid 2483] [client
>> 128.237.211.152:49581] AH01382: Request body read timeout
> 
> 
> The CGIs are timing out. This is normally caused by a locked list.
> 
> 
>> Here is the access log from the same time frame:
>> 
>> 128.237.211.152 - - [27/Sep/2016:09:44:51 -0400] "GET
>> /mailman/admin/fttc HTTP/1.1" 200 2078
>> 128.237.211.152 - - [27/Sep/2016:09:44:53 -0400] "POST
>> /mailman/admin/fttc HTTP/1.1" 504 247
>> 
>> Here is the qrunner log (from earlier when the two qrunners stopped):
>> 
>> Sep 27 06:09:59 2016 (7136) Master qrunner detected subprocess exit
>> (pid: 1194, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
> 
> sig: 9 is a SIGKILL. This seems to say that something external is
> killing the runner.
> 
> This is likely the same or a similar underlying cause as the CGI
> timeouts, but is different as the CGIs are independent of the qrunners.
> 
> 
>> 
>> Finally this is the only error in the Mailman error file since the
>> reinstall last night.
>> 
>> Sep 26 20:59:51 2016 admin(8885):
>> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 
>> admin(8885): [----- Mailman Version: 2.1.15 -----] 
>> admin(8885): [----- Traceback ------] 
>> admin(8885): Traceback (most recent call last):
>> admin(8885):   File "/usr/lib/mailman/scripts/driver", line 112, in run_main
>> admin(8885):     main()
>> admin(8885):   File "/usr/lib/mailman/Mailman/Cgi/admindb.py", line 198,
>> in main
>> admin(8885):     mlist.Save()
>> admin(8885):   File "/usr/lib/mailman/Mailman/MailList.py", line 578, in
>> Save
>> admin(8885):     self.__save(dict)
>> admin(8885):   File "/usr/lib/mailman/Mailman/MailList.py", line 555, in
>> __save
>> admin(8885):     os.link(fname, fname_last)
>> admin(8885): OSError: [Errno 1] Operation not permitted
> 
> 
> This is a permission or security manager (SELinux, apparmor, ?) issue.
> 
> First try running Mailman's 'bin/check_perms -f' as root. If that fixes
> things, it may help. Also, see <https://wiki.list.org/x/17891756>.
> 
> Note that Mailman's CGI wrappers must be group mailman and SETGID. In
> particular, these files must not be on a file system mounted with 'nosuid'.
> 
> If none of this helps, try disabling SELinux.
> 
> The qrunners being SIGKILLed is still a bit mysterious, but that could
> be related to a permissions or SELinux issue.
> 
> -- 
> Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
> San Francisco Bay Area, California    better use your sense - B. Dylan


