[Tutor] Fastest way to iterate through a file

Jason Massey jason.massey at gmail.com
Tue Jun 26 17:55:33 CEST 2007


Also, since you're already writing your hits to a file, there's no need to
print them to the screen as well.  That should shave off some time,
especially if you have a lot of hits.
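
Something along these lines is what I have in mind. It's only a sketch: the function name and the `id_list`, `in_path` and `out_path` parameters are made up for illustration, since I haven't seen the full script. It writes each hit to the output file and returns a count instead of printing every line.

```python
def write_hits(id_list, in_path, out_path):
    """Copy every line of in_path that contains one of id_list to out_path.

    Returns the number of matching lines; nothing is printed per hit.
    All names here are hypothetical placeholders for the real script's.
    """
    hits = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            # Plain substring test against each id, as in the original loop.
            if any(id_ in line for id_ in id_list):
                dst.write(line)
                hits += 1
    return hits
```

If you still want feedback on the console, printing the returned count once at the end costs far less than one print per hit.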

On 6/26/07, Kent Johnson <kent37 at tds.net> wrote:
>
> Robert Hicks wrote:
> > idList only has about 129 id numbers in it.
>
> That is quite a few times to be searching each line of the file. Try
> using a regular expression search instead, like this:
>
> import re
> regex = re.compile('|'.join(idList))
> for line in f2:
>     if regex.search(line):
>         pass  # process a hit
>
> A simple test shows that to be about 25 times faster.
>
> Searching for each of 100 id strings in another string:
> In [6]: import timeit
> In [9]: setup = "import re; import string; ids=[str(i) for i in range(1000, 1100)];line=string.letters"
> In [10]: timeit.Timer('for i in ids: i in line', setup).timeit()
> Out[10]: 15.298269987106323
>
> Build a regular expression to match all the ids and use that to search:
> In [11]: setup2=setup + ";regex=re.compile('|'.join(ids))"
> In [12]: timeit.Timer('regex.search(line)', setup2).timeit()
> Out[12]: 0.58947491645812988
>
> In [15]: _10 / _12
> Out[15]: 25.95236804820507
>
> > I am running it straight from a Linux console. I thought about buffering
> > but I am not sure how Python handles that.
>
> I don't think the console should be buffered.
>
> > Do you know if Python has a "slower" startup time than Perl? That could
> > be part of it though I suspect the buffering thing more.
>
> I don't know if it is slower than Perl but it doesn't take a few seconds
> on my computer. How long does it take you to get to the interpreter
> prompt if you just start Python? You could put a simple print at the
> start of your program to see when it starts executing.
>
> Kent
>
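
For reference, the single-regex search quoted above could be sketched like this in current Python. The function names are my own, not from the thread, and the `re.escape` call is an addition on the assumption that an id might contain regex metacharacters; with purely numeric ids it changes nothing.

```python
import re

def compile_id_regex(id_list):
    """Build one pattern that matches any of the given id strings literally."""
    # re.escape is an added safeguard (not in the quoted code) for ids
    # that contain characters such as '.' or '+'.
    return re.compile("|".join(re.escape(i) for i in id_list))

def matching_lines(lines, id_list):
    """Return the lines that contain at least one id from id_list."""
    if not id_list:
        # An empty alternation would compile to '' and match every line.
        return []
    regex = compile_id_regex(id_list)
    return [line for line in lines if regex.search(line)]
```

The pattern is compiled once, outside the loop over lines, which is what makes this cheaper than testing every id against every line.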