[Mailman-Developers] Re: Slow Performance on semi-large lists

D.J. Atkinson dj@pcisys.net
Thu, 14 Dec 2000 10:23:13 -0700 (MST)


Thanks David,

From what I've seen of how Mailman's qrunner works, this would help my
situation tremendously.

Since this is going to the developers list anyway, what do you all think of
adding the "filebase" to the log lines of the smtp-failure log and/or the
smtp log?  I know this would increase the size of the logs, so maybe it
could be an option/flag set in Defaults.py/mm_cfg.py.  That would have been
very helpful in tracking down the queued files that were sucking all the
time out of my qrunner runs.
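
Something along these lines is what I have in mind (the flag name and the
helper are made up for illustration, not existing Mailman code):

    # Sketch only: LOG_SMTP_FILEBASE is a hypothetical flag that would live
    # in Defaults.py/mm_cfg.py, and log_smtp() stands in for wherever the
    # delivery code writes its smtp/smtp-failure entries today.
    import time

    LOG_SMTP_FILEBASE = 1

    def log_smtp(logfp, listname, nrecips, secs, filebase):
        entry = '%s %s: %d recips, %.3f seconds' % (
            time.ctime(time.time()), listname, nrecips, secs)
        if LOG_SMTP_FILEBASE:
            # the extra field that makes a slow queue entry traceable
            entry = entry + ' [filebase %s]' % filebase
        logfp.write(entry + '\n')

    # e.g.: log_smtp(open('logs/smtp', 'a'), 'mylist', 523, 412.7, '976826512.4')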

Regards,

DJ

On Wed, 13 Dec 2000, David Champion wrote:

>
>I shifted this to mailman-developers because I want to talk about
>changes in qrunner that D.J. Atkinson brought up.
>
>
>On 2000.12.13, in <Pine.SOL.4.05.10012131455360.22660-100000@babu.pcisys.net>,
>	"D.J. Atkinson" <dj@pcisys.net> wrote:
>>
>> I posted a message over the weekend where I saw qrunner only processing
>> part of the queue.  It turned out that there were three messages in the
>> queue with three unresolvable names each (all three messages were to the
>> same list).  Each of these queued files took 400 seconds to time out, by
>> which time we were past the default max qrunner process length (15
>> minutes), and qrunner exited.
>>
>> I've of course now increased the process length to 30 minutes, and
>> everything seems to be OK.  But that's only temporary, I'm sure.  As list
>> volume builds, it will become a problem again.  It would be great if there
>> were a more graceful way of dealing with this than currently exists.
>
>How about altering qrunner's algorithm to split the queue on timeout,
>appending the head of the queue to the tail?
>
>Suppose one run goes like this:
>
>A - fails
>B - succeeds
>C - fails
>D - fails/unprocessed; qrunner times out
>E - unprocessed
>F - unprocessed
>
>With this change, your next queue runner will process this queue:
>
>E
>F
>A
>C
>D
>
>Eventually (ahem) the queue will contain only those batches which are
>hard to deliver, and they'll be re-ordered with each run to give equal
>attempts over time.
>
>Actually, that's not quite true: if the queue is reduced to containing
>only A, C, and D, and qrunner always times out on D, then D will never
>get the same time as A and C.  Leaving D at the head of the queue (that
>is, splitting the queue ahead of the current batch, rather than behind
>it) solves that problem, until D contains enough bad or slow addresses
>to stall the queue even though it's first.  Two solutions to this:
>1) never stop qrunner during the first queued batch (always wait for it
>to exit); or 2) split the queue ahead of or behind the current batch at
>random.
>
>Does this seem to anyone else to solve the problem?  I haven't looked
>at the code yet, so this is just cursory thought.
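>
>In pseudocode, this is roughly what I mean (none of it is the real
>qrunner code; deliver() is just a stand-in for whatever pushes one
>queued batch out):
>
>    import random, time
>
>    def run_queue(queue, max_seconds, deliver):
>        # queue: list of queued filebases, in processing order
>        start = time.time()
>        failed = []                    # attempted this run, not delivered
>        for i in range(len(queue)):
>            if time.time() - start > max_seconds:
>                if random.random() < 0.5:
>                    # split ahead of the current batch: it keeps the head
>                    return queue[i:] + failed
>                # split behind it: the current batch goes to the very end
>                return queue[i+1:] + failed + [queue[i]]
>            if not deliver(queue[i]):
>                failed.append(queue[i])
>        return failed                  # only hard-to-deliver batches remain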
>
>--
> -D.	dgc@uchicago.edu	NSIT	University of Chicago
>

--
       o o o o o o o . . .                                  _______
      o         _____ _____        ____________________ ____] D D [_||___
   ._][__n__n___|DD[ [     \_____  |  D.J. Atkinson   | | dj@pcisys.net |
  >(____________|__|_[___________]_|__________________|_|_______________|
  _/oo OOOO OOOO oo` 'ooooo ooooo` 'o!o            o!o` 'o!o         o!o`
-+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+-
Visit my web page at http://www.pcisys.net/~dj