Module RE, Have a couple questions

Francis Girard francis.girard at free.fr
Tue Mar 1 15:57:28 EST 2005


On Tuesday 1 March 2005 21:38, Marc Huffnagle wrote:
> My understanding of the second question was that he wanted to find lines
> which contained both words but, looking at it again, it could go either
> way.  If he wants to find lines that contain both of the words, in any
> order, then I don't think that it can be done without scanning the line
> twice (regex or not).

I don't know whether it is really faster, but here's a version that finds
both words on the same line, in either order. My understanding is that re
needs to scan the line only once. This might matter on very large inputs.

=== Begin SNAP
## rewords.py

import re
import sys

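def iWordsMatch(lines, word, word2):
  # Escape both words so regex metacharacters in them match literally.
  word, word2 = re.escape(word), re.escape(word2)
  # Accept the two words in either order on the same line.
  reWordOneTwo = re.compile(r"(%s.*%s)|(%s.*%s)" %
                            (word, word2, word2, word))
  # search() looks anywhere in the line, so no leading ".*" is needed.
  return (line for line in lines if reWordOneTwo.search(line))

# Print every line of this file that contains both "re" and "return".
for line in iWordsMatch(open("rewords.py"), "re", "return"):
  sys.stdout.write(line)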
=== End SNAP
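
Another way to express "both words, in any order" might be two lookahead
assertions anchored at the start of the line. What follows is only a sketch:
the name lines_with_both and the sample lines are made up for illustration,
both arguments are passed through re.escape so they are matched literally,
and I have not measured whether it is any faster. The regex engine may well
walk the line more than once internally.

```python
import re
import sys

def lines_with_both(lines, word1, word2):
    """Return an iterator over the lines containing both substrings, in either order."""
    # Both lookaheads are tried from the start of the line, so the
    # relative order of the two words does not matter.
    pattern = re.compile(r"(?=.*%s)(?=.*%s)" %
                         (re.escape(word1), re.escape(word2)))
    return (line for line in lines if pattern.match(line))

# Hypothetical sample input, standing in for the lines of a file.
sample = [
    "import re\n",
    "return re.compile(x)\n",
    "re before return\n",
    "nothing here\n",
]

for line in lines_with_both(sample, "re", "return"):
    sys.stdout.write(line)
```

Like the alternation version, this matches substrings rather than whole
words, so "re" is also found inside "return" or "here".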

Regards,

Francis Girard

  



