[Mailman-Developers] Listing Lists Faster in 2.0?
Ted Cabeen
secabeen@pobox.com
Tue, 04 Apr 2000 09:46:43 -0500
In message <38E9F8D2.10F251D5@uchicago.edu>, Roberto Ullfig writes:
>Roberto Ullfig wrote:
>>
>> "Barry A. Warsaw" wrote:
>> >
>> > >>>>> "RU" == Roberto Ullfig <rullfig@uchicago.edu> writes:
>> >
>> > RU> So, in 1.0rc2, displaying the list of lists for 529 lists
>> > RU> requires 529**2 = 279841 system stat calls and takes over one
>> > RU> and a half minutes on our Ultra-2 2x296 processor system! Is
>> > RU> this because of Python, Mailman, or both? Has this been
>> > RU> "fixed" in 2.0? You really should only need to make one stat
>> > RU> call per list.
>> >
>> > Uh, it's because of Mailman :)
>> >
>> > I implemented a list_lists script which does on the command line what
>> > listinfo.py does in HTML (see attached). Here's what truss -c gives
>> > me:
<Snip big truss output>
>> > Getting the list of list names, requires at least a listdir() and an
>> > exists() for every directory found there.
>> >
>> > Nothing about this will change for 2.0.
>> >
>> > -Barry
>>
>> Thanks for the script.
>>
>> Now this is the truss output for the listinfo that is called by
>> driver:
<Snip more truss output>
Here's what is happening. When listinfo runs and has to get the list of
advertised addresses, it starts by getting a list of mailing lists on the
server using Utils.list_names(). Utils.list_names() requires two stat calls
for each list on the machine every time it is called. This is
understandable and isn't going to change. It then proceeds to open every one
of those lists to check on the advertised flag. Again, no problem. The
problem is that when Mailman opens a list in the MailList constructor
__init__, it checks to make sure that the list exists by running
Utils.list_names() and seeing if the requested list name is there. Therefore
every request to open a list requires two stat calls on every list on the
system. So when we sequentially open every list on the system in
listinfo.py, we get a squaring effect on stat calls in the list directory.
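
To make the squaring effect concrete, here is a rough model of it, not
Mailman's actual code. list_names() and open_list() below are stand-ins for
Utils.list_names() and the MailList constructor, the StatCounter class and
the config.db file name are assumptions made for illustration, and the two
counted checks per list directory mirror the two stat calls described above.

```python
import os
import tempfile


class StatCounter:
    """Counts filesystem checks so the growth is visible."""

    def __init__(self):
        self.calls = 0

    def isdir(self, path):
        self.calls += 1
        return os.path.isdir(path)

    def exists(self, path):
        self.calls += 1
        return os.path.exists(path)


def list_names(root, counter):
    # Stand-in for Utils.list_names(): two checks per list directory.
    names = []
    for name in os.listdir(root):
        path = os.path.join(root, name)
        # config.db is an assumed marker file for "this directory is a list".
        if counter.isdir(path) and counter.exists(os.path.join(path, "config.db")):
            names.append(name)
    return names


def open_list(root, name, counter):
    # Stand-in for the MailList constructor: it rescans every list
    # just to confirm that the requested name exists.
    if name not in list_names(root, counter):
        raise ValueError("no such list: %s" % name)
    return name


def count_checks_for_overview(n):
    """Build n fake lists, open each one, and return the number of checks."""
    counter = StatCounter()
    with tempfile.TemporaryDirectory() as root:
        for i in range(n):
            d = os.path.join(root, "list%d" % i)
            os.mkdir(d)
            open(os.path.join(d, "config.db"), "w").close()
        for name in list_names(root, counter):
            open_list(root, name, counter)
    return counter.calls


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, count_checks_for_overview(n))
```

In this model the overview costs 2*n checks for the first scan plus 2*n for
each of the n opens, or 2*n*(n+1) in total, so doubling the number of lists
roughly quadruples the work.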
Solutions are reasonably easy to code. The first that comes to mind is an
optional argument to the constructor that indicates that the name has already
been checked and that checking it again is not necessary. Other solutions
include caching the list of lists on the server, but this means there is a
delay between when a list is created and when it becomes accessible.
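
A minimal sketch of the optional-argument idea, again as a model rather than
a patch against Mailman. MailListSketch, the name_verified parameter, the
counter dictionary, and the config.db file name are all hypothetical; the
real constructor signature may call for something different.

```python
import os
import tempfile


def list_names(root, counter):
    # Stand-in for Utils.list_names(): two counted checks per list directory.
    names = []
    for name in os.listdir(root):
        path = os.path.join(root, name)
        counter["checks"] += 1
        if not os.path.isdir(path):
            continue
        counter["checks"] += 1
        # config.db is an assumed marker file for "this directory is a list".
        if os.path.exists(os.path.join(path, "config.db")):
            names.append(name)
    return names


class MailListSketch:
    # Hypothetical stand-in for MailList. name_verified is an invented
    # parameter: callers that already hold a name returned by list_names()
    # pass True to skip the second scan.
    def __init__(self, root, name, counter, name_verified=False):
        if not name_verified and name not in list_names(root, counter):
            raise ValueError("no such list: %s" % name)
        self.name = name


def checks_for_overview(n, name_verified):
    """Build n fake lists, open each one, and return the number of checks."""
    counter = {"checks": 0}
    with tempfile.TemporaryDirectory() as root:
        for i in range(n):
            d = os.path.join(root, "list%d" % i)
            os.mkdir(d)
            open(os.path.join(d, "config.db"), "w").close()
        for name in list_names(root, counter):
            MailListSketch(root, name, counter, name_verified=name_verified)
    return counter["checks"]


if __name__ == "__main__":
    print(checks_for_overview(20, name_verified=False))
    print(checks_for_overview(20, name_verified=True))
```

With the flag set, the model does one scan of 2*n checks and nothing more.
The default stays False so that any other caller that passes in an unchecked
name still gets the existence test.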
I can code something up for you if necessary, but it seems like a reasonably
simple patch. Do either you or Barry need a patch? Let me know if you do.
--
Ted Cabeen http://www.pobox.com/~secabeen secabeen@pobox.com
Check Website or finger for PGP Public Key secabeen@midway.uchicago.edu
"I have taken all knowledge to be my province." -F. Bacon cococabeen@aol.com
"Human kind cannot bear very much reality."-T.S.Eliot 73126.626@compuserve.com