Help with script with performance problems

Aahz aahz at pythoncraft.com
Sun Nov 23 14:18:05 EST 2003


In article <a938c44d.0311222335.79a7a545 at posting.google.com>,
Dennis Roberts <googlegroups at spacerodent.org> wrote:
>
>I have a script to parse a dns querylog and generate some statistics. 
>For a 750MB file a perl script using the same methods (splits) can
>parse the file in 3 minutes.  My python script takes 25 minutes.  It
>is enough of a difference that unless I can figure out what I did
>wrong or a better way of doing it I might not be able to use python
>(since most of what I do is parsing various logs).  The main reason to
>try python is I had to look at some early scripts I wrote in perl and
>had no idea what the hell I was thinking or what the script even did! 
>After some googling and reading Eric Raymond's essay on python I jumped
>in:)  Here is my script.  I am looking for constructive comments -
>please don't bash my newbie code.

If you haven't yet, make sure you upgrade to Python 2.3; there are a lot
of speed enhancements.  Also, it allows you to switch to idioms that work
more like Perl's, such as iterating directly over the file object:

    for line in f:
        fields = line.split()
        ...

Generally speaking, contrary to what another poster suggested, string
methods will almost always be faster than regexes (assuming that a
string method does what you want directly, of course; using multiple
string methods may or may not be faster than regexes).
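
If you want to check this on your own data rather than take my word for
it, something like the following rough sketch would do. The sample line
and the function names are made up for illustration (the real querylog
format isn't shown here), and the timings will vary with the machine and
the Python version:

```python
import re
import timeit

# Hypothetical log line; the real querylog format is not shown in the thread.
SAMPLE = "23-Nov-2003 14:18:05.123 client 10.0.0.1#1234: query: example.com IN A"

# Precompiled so the regex version is not penalized for recompiling.
WHITESPACE = re.compile(r"\s+")


def split_with_method(line):
    """Split on runs of whitespace using the built-in string method."""
    return line.split()


def split_with_regex(line):
    """Split on runs of whitespace using a precompiled regex."""
    return WHITESPACE.split(line)


def compare(repeat=100000):
    """Return (method_seconds, regex_seconds) for `repeat` calls each."""
    method_time = timeit.timeit(lambda: split_with_method(SAMPLE), number=repeat)
    regex_time = timeit.timeit(lambda: split_with_regex(SAMPLE), number=repeat)
    return method_time, regex_time


if __name__ == "__main__":
    method_time, regex_time = compare()
    print("str.split: %.3fs  re.split: %.3fs" % (method_time, regex_time))
```

Both functions return the same fields for this sample line, so the only
difference being measured is the splitting mechanism itself.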
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.



