[Tutor] Fastest way to iterate through a file

Kent Johnson kent37 at tds.net
Tue Jun 26 18:26:50 CEST 2007


Robert Hicks wrote:
> Kent Johnson wrote:
>> Robert Hicks wrote:
>>> idList only has about 129 id numbers in it.
>> That is quite a few times to be searching each line of the file. Try 
>> using a regular expression search instead, like this:
>>
>> import re
>> regex = re.compile('|'.join(idList))
>> for line in f2:
>>    if regex.search(line):
>>       pass  # process a hit
>>
> 
> Since I am printing to a file like so:
> 
> [id]: [line]
> 
> I don't see how I can get the id back out of the regex.search like I 
> could in my code.

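match = regex.search(line)
if match:
   idMatch = match.group()  # the id text that actually matched
   # emit "[id]: [line]"; rstrip avoids a doubled newline
   print('%s: %s' % (idMatch, line.rstrip('\n')))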

Note that this finds only the first id match on a line. If you need 
every match on a line, use the regex's findall() or finditer() method, 
as Dave suggests.
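
For the multiple-match case, here is a minimal sketch using finditer(). 
The ids and lines are made-up sample data (the real idList and file are 
not shown in this thread), and hits() is just an illustrative helper 
name. I have also wrapped each id in re.escape(), on the assumption that 
an id might contain characters that are special in a regex.

```python
import re

# Hypothetical sample data standing in for idList and the input file.
id_list = ["A100", "B200", "C300"]
lines = [
    "header without ids\n",
    "record A100 shipped\n",
    "B200 replaced by C300\n",
]

# re.escape guards against ids that contain regex metacharacters.
regex = re.compile("|".join(re.escape(i) for i in id_list))

def hits(line):
    """Return every id found in line, in left-to-right order."""
    return [m.group() for m in regex.finditer(line)]

for line in lines:
    for id_match in hits(line):
        # One output row per id found, in the "[id]: [line]" format.
        print("%s: %s" % (id_match, line.rstrip("\n")))
```

One thing to watch: alternatives are tried left to right, so if one id 
is a prefix of another (say 12 and 123), whichever is listed first wins. 
Sorting the ids longest-first before joining them, or anchoring with \b, 
is one way around that.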

re docs are here:
http://docs.python.org/lib/module-re.html

Kent

