two idioms at one blow: line-reading and regexp-matching

Gustavo Cordova gcordova at hebmex.com
Wed Feb 20 11:25:21 EST 2002


>
> for line in file.xreadlines():
>     parse(line)
> 
> xreadlines() does not load the whole file into memory and is generally
> about as efficient as you can get.
> 
> The second item does, however, make re's more annoying in Python than
> they are in other languages where re's are native to the language.
>
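
For reference, here is a minimal sketch of the lazy line-reading idiom quoted above. It assumes a Python version where file objects are themselves iterators (2.2 and later), so xreadlines() is not needed. The names parse_all and parse are hypothetical and used only for illustration, and io.StringIO stands in for a real file.

```python
import io

def parse_all(stream, parse):
    """Call parse() on each line of stream, one line at a time."""
    results = []
    # Iterating over a file object yields lines lazily, so the whole
    # file is never read into memory in one go.
    for line in stream:
        results.append(parse(line))
    return results

# io.StringIO stands in for a real file opened with open(path).
sample = io.StringIO("first line\nsecond line\n")
lengths = parse_all(sample, len)
```

Each line keeps its trailing newline, just as it does with xreadlines().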

Hmmm... so let's make them more readable.



>>> class WrapRX:
... 	def __init__(self, regex, String):
... 		self.regex = regex
... 		self.String = String
... 		self.pos = 0
... 	def __getitem__(self,k):
... 		# Ignore k, go for the next match.
... 		match = self.regex.search(self.String, self.pos)
... 		if not match:
... 			raise IndexError("No more regex matches.")
... 		self.pos = match.end()
... 		return match
... 
>>> import sre
>>> some_string = "This is a lone string, with no friends or family."
>>> rx_vowels = sre.compile(r"[aeiou]+", sre.S)
>>> wrapped_re = WrapRX(rx_vowels,some_string)
>>> for match in wrapped_re:
... 	print "Found vowel: '%s'" % match.group(0)
... 	
Found vowel: 'i'
Found vowel: 'i'
Found vowel: 'a'
Found vowel: 'o'
Found vowel: 'e'
Found vowel: 'i'
Found vowel: 'i'
Found vowel: 'o'
Found vowel: 'ie'
Found vowel: 'o'
Found vowel: 'a'
Found vowel: 'i'
>>> 



Seems useful, *if* you wanna use it :-)
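
The same "give me the next match" loop can also be sketched as a generator, assuming a Python version with generators and the re module in place of sre. The helper name iter_matches is hypothetical. The extra step past an empty match is an addition that the wrapper class above does not have. For this pattern, the standard finditer() method returns the same matches.

```python
import re

def iter_matches(regex, text):
    """Yield successive non-overlapping matches of regex in text."""
    pos = 0
    while pos <= len(text):
        match = regex.search(text, pos)
        if match is None:
            break
        yield match
        # Resume after this match; step one extra character past an
        # empty match so the loop always makes progress.
        if match.end() == match.start():
            pos = match.end() + 1
        else:
            pos = match.end()

rx_vowels = re.compile(r"[aeiou]+")
some_string = "This is a lone string, with no friends or family."

found = [m.group(0) for m in iter_matches(rx_vowels, some_string)]
builtin = [m.group(0) for m in rx_vowels.finditer(some_string)]
```

Both lists hold the same twelve vowel runs printed in the session above.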

-gustavo



