[Mailman-Users] mailman keeps going down

Mark Sapiro msapiro at value.net
Sun Oct 30 18:24:59 CET 2005


Manuel Kissoyan wrote:
>
>Wondering where we can find a log that tells us why Mailman keeps going down. I searched the logs and found:
>
>Oct 29 00:44:22 2005 (2108) Master qrunner detected subprocess exit
>(pid: 22079, sig: None, sts: 1, class: VirginRunner, slice: 1/1) [restarting]
>Oct 29 00:44:22 2005 (2108) Qrunner VirginRunner reached maximum restart limit of 10, not restarting.
>Oct 29 01:47:15 2005 (5861) RetryRunner qrunner caught SIGTERM.  Stopping.
>Oct 29 01:47:15 2005 (5856) CommandRunner qrunner caught SIGTERM.  Stopping.
>Oct 29 01:47:15 2005 (5858) NewsRunner qrunner caught SIGTERM.  Stopping.
>Oct 29 01:47:15 2005 (5861) RetryRunner qrunner exiting.
>Oct 29 01:47:15 2005 (5856) CommandRunner qrunner exiting.
>Oct 29 01:47:15 2005 (5858) NewsRunner qrunner exiting.
>Oct 29 01:47:15 2005 (22117) VirginRunner qrunner caught SIGTERM.  Stopping.
>Oct 29 01:47:15 2005 (22117) VirginRunner qrunner exiting.
>Oct 29 01:47:15 2005 (29169) OutgoingRunner qrunner caught SIGTERM.  Stopping.
>Oct 29 01:47:15 2005 (29169) OutgoingRunner qrunner exiting.
>Oct 29 20:51:03 2005 (11801) ArchRunner qrunner caught SIGTERM.  Stopping.
>Oct 29 20:51:03 2005 (11492) IncomingRunner qrunner caught SIGTERM.  Stopping.
>Oct 29 20:51:03 2005 (11801) ArchRunner qrunner exiting.
>Oct 29 20:51:03 2005 (21057) BounceRunner qrunner caught SIGTERM.  Stopping.
>Oct 29 20:51:03 2005 (11492) IncomingRunner qrunner exiting.
>Oct 29 20:51:03 2005 (21057) BounceRunner qrunner exiting.
>Oct 29 20:51:03 2005 (2108) Master watcher caught SIGTERM.  Exiting.
>Oct 29 20:51:04 2005 (2108) Master qrunner detected subprocess exit
>(pid: 11492, sig: None, sts: 15, class: IncomingRunner, slice: 1/1) 
>Oct 29 20:51:04 2005 (2108) Master qrunner detected subprocess exit
>(pid: 21057, sig: None, sts: 15, class: BounceRunner, slice: 1/1) 
>Oct 29 20:51:04 2005 (2108) Master qrunner detected subprocess exit
>(pid: 11801, sig: None, sts: 15, class: ArchRunner, slice: 1/1) 
>
>
>Is there any other way to see why this is going down?

The info you found in the qrunner log is probably all there is. At
20:51:03, the master caught a SIGTERM. This could be the result of a
"bin/mailmanctl stop" command or something else. It may be the reason
the Incoming, Bounce and Arch qrunners were sent SIGTERM at 20:51:03,
but probably not, because in that case the "Master watcher caught
SIGTERM" entry would normally come first. It is certainly not the
reason the other runners were sent SIGTERM at 01:47:15.
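
The ordering argument can be checked mechanically. Here is a minimal
sketch in Python, assuming the qrunner log keeps the line format shown
in the quoted excerpt. The names sigterm_batches and master_first are
made up for illustration and are not part of Mailman.

```python
import re
from itertools import groupby

# Matches lines such as
# "Oct 29 20:51:03 2005 (2108) Master watcher caught SIGTERM.  Exiting."
# (format assumed from the quoted excerpt).
CAUGHT = re.compile(
    r"^(?P<stamp>\w{3} +\d{1,2} [\d:]{8} \d{4}) \((?P<pid>\d+)\) "
    r"(?P<who>.+?) caught SIGTERM"
)

def sigterm_batches(lines):
    """Group 'caught SIGTERM' entries by timestamp, keeping file order.

    Returns a list of (stamp, [who, ...]) tuples.
    """
    hits = []
    for line in lines:
        # Tolerate mail-style quoting ("> ") in front of a pasted line.
        m = CAUGHT.match(line.lstrip("> "))
        if m:
            hits.append((m.group("stamp"), m.group("who")))
    return [(stamp, [who for _, who in group])
            for stamp, group in groupby(hits, key=lambda h: h[0])]

def master_first(batch):
    """True if the master watcher logged its SIGTERM before any runner."""
    _stamp, who = batch
    return who[0].startswith("Master")
```

Run against the excerpt above, this gives two batches: the 01:47:15
batch has no master entry at all, and in the 20:51:03 batch the master
watcher is logged after the Arch, Incoming and Bounce runners. Since
several processes write to the same log within one second, file order
is only a hint, not proof.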

There may be entries in Mailman's error log about this, but probably
not. Look for error log entries at these times. Most likely, these
SIGTERM signals were generated externally, either automatically by the
OS because of some OS limit or other condition, or by some other
process (maybe manually invoked).
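
To look for error log entries at these times, something like the
following Python sketch may help. It assumes the error log stamps its
entries the same way the qrunner log does ("Oct 29 01:47:15 2005
(pid) ..."), which you should verify on your installation. The
function names and the two-minute window are arbitrary choices for
illustration.

```python
import re
from datetime import datetime, timedelta

# Timestamp layout as it appears in the quoted qrunner log,
# e.g. "Oct 29 01:47:15 2005 (5861) ...".
STAMP = re.compile(r"^(\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) \((\d+)\) (.*)$")

def parse(line):
    """Return (datetime, pid, message), or None for unstamped lines."""
    m = STAMP.match(line.lstrip("> "))
    if not m:
        return None
    # Collapse runs of spaces so single-digit days parse cleanly.
    stamp = " ".join(m.group(1).split())
    when = datetime.strptime(stamp, "%b %d %H:%M:%S %Y")
    return when, int(m.group(2)), m.group(3)

def sigterm_times(qrunner_lines):
    """Distinct times at which any process logged 'caught SIGTERM'."""
    times = set()
    for line in qrunner_lines:
        rec = parse(line)
        if rec and "caught SIGTERM" in rec[2]:
            times.add(rec[0])
    return sorted(times)

def entries_near(log_lines, times, window=timedelta(minutes=2)):
    """Lines from another log stamped within `window` of any given time."""
    hits = []
    for line in log_lines:
        rec = parse(line)
        if rec and any(abs(rec[0] - t) <= window for t in times):
            hits.append(line.rstrip("\n"))
    return hits
```

Feed sigterm_times() the lines of the qrunner log, then pass the result
and the lines of the error log (or any other log with the same stamp
format) to entries_near(). An empty result is consistent with the
signals having come from outside Mailman.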

There may be further information in some OS log.

-- 
Mark Sapiro <msapiro at value.net>       The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



