Newbie tip: Prg speed from ~5 hrs to < 1 minute

John Machin sjmachin at lexicon.net
Mon Apr 29 18:06:14 EDT 2002


TerryByrne1963 at yahoo.com (Terry Byrne) wrote in message news:<93d52e82.0204290618.349a0e5f at posting.google.com>...

[big snip]
> 
> At first I was just running the re.search on each line. Time: ~ 5 hrs
> for 60K line log file. Started testing with the string.find method to
> make sure a certain error code appeared on the line before running the
> "expensive" re.search, and that cut execution time from ~5 hrs to ~
> 1.5 hrs, good but "no cigar," as they say.
> 
> Then I realized that each line of content could be treated the same as
> you'd treat a switch() structure in C, or a case structure in Pascal:
> if one option is a "hit", then you needn't even check for all the
> other options. So I started pass-ing to the next line of the content
> whenever I got a "hit."

Please re-read your source of information on the pass statement.
"pass" does nothing. Zip. Zilch. It is only a place holder for cases
where you want to do nix but Python syntax insists on a statement
being present. Here are a couple of use cases:

class FubarError(Exception):
   pass

if a and b or c and d:
   pass
else:
   # Late at night, don't want to try inverting
   # the above in my head, don't like not (...)
   do_something()

You can experiment with pass's 'functionality', by putting something
like the following in a file and running it from the command-line:

for i in range(5):
   print i
   pass
   print 'Still here!'
   pass
   print 'Persistent little varmint ...'
   pass
   print 'Now do you believe Uncle John?'

Then change all occurrences of 'pass' to '# pass' and see what happens.
Then change all the occurrences of '# pass' to 'continue' and see what
happens.
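
For anyone trying this on a current interpreter, here is a minimal sketch of the same experiment in Python 3 syntax. The function name run and the shortened messages are made up for illustration; they are not part of the test file above. It collects the messages in a list instead of printing them, so the two behaviours can be compared directly:

```python
def run(use_continue):
    """Return the messages produced by a loop body using pass or continue."""
    seen = []
    for i in range(3):
        seen.append('start %d' % i)
        if use_continue:
            continue  # jumps straight to the next iteration
        else:
            pass      # does nothing; execution simply falls through
        seen.append('still here %d' % i)
    return seen

with_pass = run(False)      # every 'still here' message is present
with_continue = run(True)   # only the 'start' messages survive
```

With pass, the lines after it still run on every iteration; with continue, they never do.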

> That cut my time down to under a minute for a
> 60K-line log file.

Let's have a look at the following snippet of your code:

>      idx = aLine.find('Error Validator:2420')
>      if idx > -1:
>        LINKS_VNOQUERYHITS += 1
>        aLine += '<br>'
>        rfHandle.write(aLine)
>        aLine = None 
>        idx = -1
>        pass
>      idx = aLine.find('Error Validator:2412')
>      if idx > -1:

This just cannot possibly be working [or maybe this code isn't even
executed at all!!!]. You bind aLine to None then attempt to execute
idx = aLine.find('...') which will cause an exception:

AttributeError: 'NoneType' object has no attribute 'find'

... unless of course previously you have done

None = ""

which needless(?) to say is something you should *never* do.
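
The failure is easy to reproduce in isolation. A minimal sketch in Python 3 syntax, assuming nothing beyond the names in the quoted snippet (a_line stands in for aLine, and the message variable is only there so the result can be inspected):

```python
a_line = None  # what the quoted code does after writing the line out

try:
    # The very next statement in the quoted code calls .find on None.
    a_line.find('Error Validator:2412')
except AttributeError as exc:
    message = str(exc)
else:
    message = 'no exception raised'
```

If the real script never hits this exception, that suggests the branch is not being executed at all.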

If you can't find out why this is happening yourself, it's time to
head for the tutor list.

All those lines which bind idx to -1 are pointless. Try changing all
of them firstly to idx = -99999999 and secondly to idx = 99999999. If
either of those makes any difference at all (except a tiny increase in
the run-time) then something drastic has happened: Python is stuffed,
your copy of Python is stuffed, ...
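
For comparison, a rough sketch of how the quoted fragment might look with continue in place of pass and without the aLine = None and idx = -1 lines. This is only an illustration in Python 3 syntax, not your actual script: the function name scan, the two counter names and the sample log lines are all hypothetical; only the two error codes and the '<br>' suffix come from the quoted code.

```python
import io

def scan(lines, report):
    """Count lines per error code, writing each hit to the report."""
    hits_2420 = 0  # hypothetical stand-in for LINKS_VNOQUERYHITS
    hits_2412 = 0  # hypothetical counter for the second code
    for line in lines:
        if line.find('Error Validator:2420') > -1:
            hits_2420 += 1
            report.write(line + '<br>')
            continue  # a hit: skip the remaining checks for this line
        if line.find('Error Validator:2412') > -1:
            hits_2412 += 1
            report.write(line + '<br>')
            continue
    return hits_2420, hits_2412

# Made-up sample data, just to exercise the function.
sample = [
    'x Error Validator:2420 y',
    'nothing interesting here',
    'z Error Validator:2412',
]
out = io.StringIO()
totals = scan(sample, out)
```

Nothing needs to be reset after a hit, because each find result is tested immediately and never looked at again.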

HTH,
John


