[Bug 1832740] [NEW] init script / mailmanctl fails to stop mailman 2, reports success

Ian Kelling ian at iankelling.org
Thu Jun 13 12:10:03 EDT 2019


Public bug reported:

# systemctl restart mailman

Jun 13 11:43:27 lists.gnu.org systemd[1]: Stopping LSB: Mailman Master Queue Runner...
Jun 13 11:43:27 lists.gnu.org mailman[31096]:  * Stopping Mailman master qrunner mailmanctl
Jun 13 11:43:27 lists.gnu.org systemd[1]: Stopped LSB: Mailman Master Queue Runner.
Jun 13 11:43:28 lists.gnu.org mailman[31096]:    ...done.
Jun 13 11:43:27 lists.gnu.org systemd[1]: Starting LSB: Mailman Master Queue Runner...
Jun 13 11:43:31 lists.gnu.org mailman[31153]:  * Starting Mailman master qrunner mailmanctl
Jun 13 11:43:31 lists.gnu.org mailman[31153]: The master qrunner lock could not be acquired because it appears as if another
Jun 13 11:43:31 lists.gnu.org mailman[31153]: master qrunner is already running.
Jun 13 11:43:31 lists.gnu.org mailman[31153]:    ...done.

At this point, ps -ef | grep mailman shows 4 mailman processes remain:

/usr/bin/python /usr/lib/mailman/bin/mailmanctl -s -q start
and 3 qrunners, like this:
/usr/bin/python /var/lib/mailman/bin/qrunner --runner=OutgoingRunner:1:4 -s

The qrunner log does show all the pids getting the TERM signal from mailmanctl:
Jun 13 11:43:27 2019 (21946) OutgoingRunner qrunner caught SIGTERM.  Stopping.

But only 1 actually stopped. I manually send the remaining qrunners kill signals over and
over and wait; about 5 minutes later they finally terminate, and mailmanctl with them.
Then I run systemctl restart mailman again, and this time it really starts:


Jun 13 11:48:51 lists.gnu.org systemd[1]: Stopping LSB: Mailman Master Queue Runner...
Jun 13 11:48:51 lists.gnu.org mailman[10762]:  * Stopping Mailman master qrunner mailmanctl
Jun 13 11:48:51 lists.gnu.org mailman[10762]: PID unreadable in: /var/run/mailman/mailman.pid
Jun 13 11:48:51 lists.gnu.org mailman[10762]: [Errno 2] No such file or directory: '/var/run/mailman/mailman.pid'
Jun 13 11:48:51 lists.gnu.org mailman[10762]: Is qrunner even running?
Jun 13 11:48:51 lists.gnu.org mailman[10762]:    ...done.
Jun 13 11:48:51 lists.gnu.org systemd[1]: Stopped LSB: Mailman Master Queue Runner.
Jun 13 11:48:51 lists.gnu.org systemd[1]: Starting LSB: Mailman Master Queue Runner...
Jun 13 11:48:55 lists.gnu.org mailman[10775]:  * Starting Mailman master qrunner mailmanctl
Jun 13 11:48:55 lists.gnu.org mailman[10775]:    ...done.
Jun 13 11:48:55 lists.gnu.org systemd[1]: Started LSB: Mailman Master Queue Runner
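
The manual wait between the two restarts above could be scripted roughly as
follows. This is only a minimal sketch, not part of Mailman or its init script:
the function names pid_alive and wait_for_exit, the 300 second timeout and the
1 second poll interval are all assumptions chosen for illustration, and it
assumes Python 3 on a POSIX system. It only watches the PIDs it is given (for
example, ones collected from ps before the stop); it does not send any signals
itself.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def wait_for_exit(pids, timeout=300.0, poll=1.0):
    """Poll until every PID in pids has exited or timeout seconds pass.

    Returns the list of PIDs that are still alive when the wait ends.
    (timeout and poll are hypothetical defaults, not Mailman settings.)
    """
    deadline = time.monotonic() + timeout
    remaining = [pid for pid in pids if pid_alive(pid)]
    while remaining and time.monotonic() < deadline:
        time.sleep(poll)
        remaining = [pid for pid in remaining if pid_alive(pid)]
    return remaining
```

For instance, wait_for_exit([21946]) would return an empty list once the
qrunner with PID 21946 from the log above is gone, and [21946] if it is still
running when the timeout expires. Whether it is safe to start Mailman before
the list is empty is exactly the open question in this report.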


I'm using mailman 2.1.23-1+deb9u4+8.0trisquel1 on Trisquel 8, which has Python 2.7.12.

I really need a fix or workaround for this bug. Waiting 5 minutes to restart
mailman is no good: I run a lot of very active lists on lists.gnu.org.
Can I use kill -9? Can I start mailman while the old qrunners are still exiting?
How can I help debug this to find a fix?

** Affects: mailman
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Mailman
Coders, which is subscribed to GNU Mailman.
https://bugs.launchpad.net/bugs/1832740

Title:
  init script / mailmanctl fails to stop mailman 2, reports success

To manage notifications about this bug go to:
https://bugs.launchpad.net/mailman/+bug/1832740/+subscriptions

