[Mailman-Users] Migration from Majordomo

Andreas Kostyrka andreas at mtg.co.at
Fri Nov 26 13:46:03 CET 1999


On Fri, Nov 26, 1999 at 11:34:06AM +0100, Hrvoje Niksic wrote:
> claw at kanga.nu writes:
> 
> > On 25 Nov 1999 10:13:09 +0100 
> > Hrvoje Niksic <hniksic at iskon.hr> wrote:
> > 
> > > Speed also sounds like a potential problem.  Python is a great
> > > language, but the current implementation is horribly slow, even
> > > compared to other interpreters.  
> > 
> > You may wish to re-examine this factoid.
> 
> Sorry; I really didn't mean to sound so harsh.
> 
> I like Python _a lot_; however several times I got burnt with the
> inefficiency of its current implementation, esp. compared to other
> advanced interpreters like CLISP or Perl.  If you are interested, we

Well, this is simply not true anymore. Python 1.3 was slow. 1.4 was
better. 1.5 is microtuned to a level where Guido refuses patches to
microtune it further. Having deployed vertical applications with
customers since the 1.3 days, I can confirm that since 1.5 the
performance of Python is no longer a real problem.

Another point is that one must know the strengths and weaknesses of
the language. Python still doesn't like method/function invocations,
and appending to a huge string directly is rather inefficient, as is
true for almost any language with read-only strings. And yes, these
weaknesses one usually learns the hard way.
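
To illustrate the string point, here is a minimal sketch. It uses
present-day Python syntax rather than what ran on 1.5, and the function
names are made up for the example:

```python
def build_by_concat(pieces):
    # Strings are immutable, so each += may have to copy everything
    # accumulated so far into a new string object.
    result = ""
    for piece in pieces:
        result += piece
    return result


def build_by_join(pieces):
    # Hand all the pieces over at once and build the final string
    # in a single join.
    return "".join(pieces)


pieces = ["line %d\n" % i for i in range(1000)]
# Both approaches give the same text; only the amount of copying differs.
assert build_by_concat(pieces) == build_by_join(pieces)
```

How big the difference is depends on the interpreter version and on the
size of the string, so measure before rewriting anything.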

> can discuss my findings in private email, as it's clearly off-topic on
> this list.
> 
> When I employ a Python script with large datasets, I feel I have to
> wait for my computer rather than the other way around.  This makes me

Well, when you employ any program with a large dataset, you're waiting.

My experience shows that it's almost always the algorithm, and only
seldom the language, that makes a solution slow. (Well, certain
operations are slow in interpreted languages, and if you need just
these in an inner loop, ...)
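
A small sketch of what I mean by "the algorithm", in present-day Python
syntax. The task and the function names are hypothetical, picked only
for illustration:

```python
def common_slow(a, b):
    # "x in b" scans the list b from the start for every element of a,
    # so the work grows with len(a) * len(b).
    return [x for x in a if x in b]


def common_fast(a, b):
    # Build a hash-based set once; each lookup is then constant time
    # on average, so the work grows with len(a) + len(b).
    lookup = set(b)
    return [x for x in a if x in lookup]


a = list(range(0, 2000, 2))
b = list(range(0, 2000, 3))
# Same result, very different amount of work on large inputs.
assert common_slow(a, b) == common_fast(a, b)
```

Same language, same interpreter; the second version simply does less
work.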

Actually, in the context of Mailman, Python is probably faster than
Perl as such, since it has a shorter startup time than Perl. Perl's
startup cost makes it almost unusable for small tasks such as simple
CGI scripts. (You need mod_perl much more than, for example, mod_py ;) )

> wonder about using Python for "heavy" stuff like large mailing lists.

Well, I've written a quick & dirty text retrieval system in Python.
I've narrowed it down to three performance "problems":

-) HTML parsing: This is slow because I just plugged in the standard
   Python HTML formatter and process the result as ASCII text. Deriving
   my own SGML parser that just parses the text and ignores the tags
   would probably be somewhat faster. But for a quick and dirty
   solution I've deemed it good enough.
-) During the database merge phase, gdbm and/or the disc are the
   bottleneck. So python can drive my SCSI RAID array to full usage ;)
-) update/del operations are a bit slow, but that is a design flaw
   in my database schema.
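
For the first point, a parser that keeps only the text and ignores the
tags could look roughly like the sketch below. This is not the code from
my system: it uses the html.parser module of present-day Python as a
stand-in for the SGML parser mentioned above, and the names
TextOnlyParser and extract_text are invented for the example.

```python
from html.parser import HTMLParser


class TextOnlyParser(HTMLParser):
    """Collects character data and ignores all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for the text between tags; no tag handlers are
        # defined, so markup is skipped without further processing.
        self.chunks.append(data)

    def text(self):
        # Join the chunks with spaces and collapse runs of whitespace.
        # Note: script/style contents are kept as text in this sketch.
        return " ".join(" ".join(self.chunks).split())


def extract_text(html):
    parser = TextOnlyParser()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = "<h1>Hello</h1><p>big   <b>world</b></p>"
assert extract_text(sample) == "Hello big world"
```

Joining the chunks with a space keeps words from adjacent elements
apart, at the price of splitting a word that is interrupted by inline
markup. For indexing purposes that trade-off seemed acceptable here.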

So basically, I've done a full text database, and python stood up to
indexing 75000 files, containing about 1.8GB of text.

The above-mentioned performance bottlenecks could probably be solved
by better programming and by not using gdbm. (The update capability of
the database was important to me.)

Andreas
-- 
Andreas Kostyrka                     | andreas at mtg.co.at
phone: +43/1/7070750                 | phone: +43/676/4091256   
MTG Handelsges.m.b.H.                | fax:   +43/1/7065299
Raiffeisenstr. 16/9                  | 2320 Zwoelfaxing AUSTRIA        




