speed problems

Thu Jun 3 10:45:09 EDT 2004

On 2004-06-03, ^ <axel at axel.truedestiny.net> wrote:
> Here are both scripts, could you please have a look and tell me where I
> should look for optimizations?

    Well, I see one major difference and one place I'd do something
differently.

Perl:
> my ($gzip) = "/usr/bin/gzip";
> my ($bzip2)= "/usr/bin/bzip2";

    First off, you're using external programs here for decompression.  This is
a trade-off between making a system call and using an internal implementation.
Maybe Python's implementation is slower?  I don't know, I'm just pointing out
that it is a difference.  Personally, when programming tools like this I try
to keep everything internal because I've had endless system calls kill the
run time.  However, with the few files you're iterating over, the cost might
be the other way 'round.  :)
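
    For what it's worth, here is a minimal sketch of keeping the
decompression inside Python with the standard gzip and bz2 modules.  The
helper name open_log, the choice by file extension, and the UTF-8 decoding
are my assumptions, not something from your script, and I haven't measured
whether this beats the external programs:

```python
import bz2
import gzip


def open_log(path):
    """Open a plain, gzip, or bzip2 log file as text (hypothetical helper).

    The format is guessed from the file extension only.
    """
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")
```

    The returned object can be iterated line by line the same way in all
three cases, so the scanning loop doesn't need to care how the file was
stored.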

Python:
>     for line in lf.readlines():
>       if string.count( line, "INFECTED" ):
>         vname = re.compile( "INFECTED \((.*)\)" ).search( line ).group(1)

    If I read this correctly, you're compiling this regex every time you go
through the for loop, so it is compiled again for every line that contains
"INFECTED".  You might want to compile the regex once, outside the loop, and
only use the compiled version inside the loop.
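
    Roughly what I mean, as a sketch rather than a drop-in fix.  The names
INFECTED_RE and virus_names are made up for illustration; the pattern is the
one from your script:

```python
import re

# Compiled once, outside the loop.  The raw string keeps the backslashes
# intact for the regex engine.
INFECTED_RE = re.compile(r"INFECTED \((.*)\)")


def virus_names(lines):
    """Yield the virus name from every line that carries an INFECTED tag."""
    for line in lines:
        match = INFECTED_RE.search(line)
        if match:  # lines without the tag simply don't match
            yield match.group(1)
```

    Here lines can be any iterable of strings, an open file included.  The
search itself also replaces the separate string.count() test, since a line
without "INFECTED" just fails to match.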

    I *think* that Perl caches compiled regexes, which is why it doesn't have
two different ways of calling a regex, while Python, in giving you two
different calls, will compile it every time if you expressly call for a
compile.  Again, just a guess based on how I presume the languages work and
how I'd write them differently.
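
    Rather than rely on my guess, it's easy enough to time it.  A rough
sketch with the standard timeit module follows; the sample line and the
function names are invented for the test, and the numbers will vary by
machine and Python version, so I'm not quoting any:

```python
import re
import timeit

PATTERN = r"INFECTED \((.*)\)"
LINE = "mail.txt: INFECTED (Example.Virus)"  # made-up sample line


def compile_each_time():
    # Asks for a compile on every call, as in the original loop body.
    return re.compile(PATTERN).search(LINE).group(1)


COMPILED = re.compile(PATTERN)  # compiled once, up front


def compile_once():
    return COMPILED.search(LINE).group(1)


def compare(number=100_000):
    """Return (seconds compiling each time, seconds with one compile)."""
    return (
        timeit.timeit(compile_each_time, number=number),
        timeit.timeit(compile_once, number=number),
    )


if __name__ == "__main__":
    each, once = compare()
    print(f"compile in loop: {each:.3f}s   compile once: {once:.3f}s")
```

    One caveat: Python's re module keeps its own cache of recently compiled
patterns, so the gap may well turn out smaller than I'd guess.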

-- 
         Steve C. Lamb         | I'm your priest, I'm your shrink, I'm your
       PGP Key: 8B6E99C5       | main connection to the switchboard of souls.
-------------------------------+---------------------------------------------