[Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)
Richard Barrett
r.barrett at openinfo.co.uk
Mon Nov 3 23:39:42 CET 2003
Scott
Further to my earlier post on this topic, I have taken a look at the
pipermail archiver code.
I concluded that there is a bug (or is it a feature?) which bloats the
size of the -article file in the pipermail "database" for each list.
This bloat will affect archiving performance, particularly for lists
with large amounts of traffic and/or those that receive large text
postings.
I think the bug has been around for a number of releases and it would
explain why I had previously found shortening the archive period
improved matters.
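A rough way to see whether this bloat is present on a given list is to
look at how large the -article files in the list's pipermail database
directory have grown. The sketch below is illustrative only: the function
name is made up, and the directory you pass in (typically something like
archives/private/<listname>/database under the Mailman install) is an
assumption that depends on your installation.

```python
import os


def article_file_sizes(database_dir):
    """Return (filename, size_in_bytes) for every *-article file, largest first.

    database_dir is assumed to be a list's pipermail "database" directory;
    the exact location varies between installations.
    """
    sizes = []
    for name in os.listdir(database_dir):
        path = os.path.join(database_dir, name)
        # Only the per-period article stores are of interest here.
        if name.endswith("-article") and os.path.isfile(path):
            sizes.append((name, os.path.getsize(path)))
    # Largest files first, so any bloated period stands out.
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes
```

Calling article_file_sizes() on the database directory and printing the
result before and after applying the patch should give a feel for how
much difference it makes on a busy list.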
This may or may not be part of the problem you reported. I have posted
a patch to correct this problem here which you might like to try if you
are feeling particularly brave:
http://www.openinfo.co.uk/mailman/patches/835332/index.html
and here:
http://sourceforge.net/tracker/?func=detail&aid=835332&group_id=103&atid=100103
Feedback either +ve or -ve would be appreciated if you try the patch.
Richard
On Friday, October 31, 2003, at 08:52 pm, Scott Lambert wrote:
> On Fri, Oct 31, 2003 at 09:40:11AM -0500, Jon Carnes wrote:
>> On Fri, 2003-10-31 at 09:26, Jay West wrote:
>>> I'm using Mailman 2.1.2 on FreeBSD v4.8-Release, built using the
>>> port. MTA is sendmail 8.12.8p1
>>>
>>> Very frequently I will see the ArchRunner process using 99+ % of
>>> cpu. I have searched the archives and found lots of messages about
>>> qrunners using large percentages of cpu, but they all seem to talk
>>> about the fixes being related to actual mail processing (sendmail),
>>> not archRunner. I am assuming that if the problem was mail delivery
>>> or reception I would be seeing the large cpu use on a different
>>> qrunner process. My issue is specific to the archrunner process
>>> which I don't find much on in the archives/faq.
>>>
>> Well you've pegged it. That was a bug in version 2.1.2 which is fixed
>> in 2.1.3. The patch for 2.1.2 should still be available - you could
>> probably patch your running system and just leave it at that (an
>> upgrade will bring the patch in anyway).
>
> I still see this problem with Mailman 2.1.3 for a high-volume list.
>
>   PID USERNAME PRI NICE  SIZE   RES STATE C   TIME   WCPU    CPU COMMAND
> 66428 mailman   64    0  168M  147M CPU1  0 376.7H 99.02% 99.02% python2.3
>
> That's the archiver process. There are 1318 messages in the
> archive queue...
>
> 12:00:28 Fri Oct 31 # truss -p 66428
> break(0x114f6000) = 0 (0x0)
> break(0x1302c000) = 0 (0x0)
> break(0x114f8000) = 0 (0x0)
> break(0x13030000) = 0 (0x0)
> break(0x114fa000) = 0 (0x0)
> break(0x13034000) = 0 (0x0)
> break(0x114fc000) = 0 (0x0)
> break(0x13038000) = 0 (0x0)
> break(0x114fe000) = 0 (0x0)
> break(0x1303c000) = 0 (0x0)
> break(0x11500000) = 0 (0x0)
> break(0x13040000) = 0 (0x0)
> break(0x11502000) = 0 (0x0)
> break(0x13044000) = 0 (0x0)
> break(0x11504000) = 0 (0x0)
> break(0x13048000) = 0 (0x0)
> break(0x11506000) = 0 (0x0)
> break(0x1304c000) = 0 (0x0)
>
> Once I kill off the mailman queue runners and clean up the several lock
> files for this mailing list, it runs just fine and manages to empty the
> archive queue.
>
> Two days worth of mailman cron jobs were still stuck in the process list.
>
> Supposition: Maybe they were blocked by the list's lockfile?
>
> So, it seems that the archRunner process went off the deep end
> somewhere between two and three days ago.
>
> I have the htdig patches for 2.1.3 installed. Which might be germane...
>
> --
> Scott Lambert                    KC5MLE                    Unix SysAdmin
> lambert at lambertfam.org
>
>
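For anyone wanting to check the two symptoms Scott describes (a backlog
in the archive queue and leftover lock files for the list), here is a
minimal Python sketch. The directory locations you pass in, the ".pck"
queue-file suffix and the "<listname>.lock" naming are assumptions about
a typical Mailman 2.1 install, and the function names are made up for
illustration. It only reports what it finds and deletes nothing.

```python
import os


def queue_depth(queue_dir, suffix=".pck"):
    """Count the entries in queue_dir whose names end with suffix.

    The ".pck" default is an assumption; adjust it to whatever your
    Mailman version actually writes into its qfiles directories.
    """
    return sum(1 for name in os.listdir(queue_dir) if name.endswith(suffix))


def list_lock_files(lock_dir, listname):
    """Return (filename, mtime) for lock files that appear to belong to
    listname, ordered by modification time (earliest first).

    Assumes lock files are named "<listname>.lock" or start with
    "<listname>.lock." followed by further components.
    """
    base = listname + ".lock"
    found = []
    for name in os.listdir(lock_dir):
        if name == base or name.startswith(base + "."):
            mtime = os.path.getmtime(os.path.join(lock_dir, name))
            found.append((name, mtime))
    found.sort(key=lambda item: item[1])
    return found
```

Stop the queue runners before removing any lock file by hand, as Scott
did, and treat the output as a starting point for investigation rather
than a list of files that are safe to delete.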
-----------------------------------------------------------------------
Richard Barrett http://www.openinfo.co.uk