.readline() - VERY SLOW compared to PERL
Duncan Booth
duncan at rcp.co.uk
Tue Nov 21 06:48:32 EST 2000
h_schneider at marketmix.com (Harald Schneider) wrote in
<8vd94d$mjo$06$1 at news.t-online.com>:
>Thanks for your reply.
>
>The alternate methods posted here (readlines with chunks) cut down the
>weak perfomance to nearly the
>results, you posted.
>
>For everyone interested, here are the scripts used for testing:
>
>
Just for interest, I wondered what difference, if any, it makes to the
Python scripts you posted if you put everything inside a function so that
it uses local variables in place of global variables.
I tried this and, although the times fluctuate substantially on different
runs, I got your 5.6 second script running on my test file (48Mb) in 12.66
seconds. My version with local variables took 10.12 seconds.
Not a major difference, but possibly worthwhile. Interestingly, most of the
speedup comes from using a local variable for string.split. Without this
optimisation it takes about 12.06 seconds. Also note that I used the -O
command line option as otherwise the times were all about 5 seconds slower.
Here is my version:
======================================
import sys, string, time

def run():
    print "Running..."
    dbname = 'test.dat'
    secStart = time.time()
    db = open(dbname, 'r')
    read = db.readlines          # bind methods to local names once
    split = string.split
    while 1:
        lines = read(250000)     # read in chunks of roughly 250000 bytes
        if not lines:
            break
        for dbline in lines:
            rs = split(dbline, ';')
            if rs[0] == 'TEST':
                print dbline + "\n"
                break
    print "DONE!\n"
    db.close()                   # close() needs parentheses to actually close the file
    print "Elapsed time: %f sec." % (time.time() - secStart)

if __name__ == '__main__':
    run()
========================================