a regular expression question

Nicola Paolucci durdn at yahoo.it
Sat Mar 22 07:15:18 EST 2003


Hi Luke,

Luke wrote:
 > <a href="foo1">1</a> abc <a href="foo2">2</a> def <a href="foo3">3</a>
 > ghi <a href="foo4">4</a> jkl
 >
 > If I use re2, it works, but obviously only gets the odds since there
 > is no overlapping.  Is there a way to modify re1 to get the text, or
 > is there a way to overlap with python's re engine somehow?
 > >>> re1 = re.compile("<a .*?>([0-9]+?)</a>(.*?)")
 > >>> matches = re.findall(re1,text)
 > >>> matches
 >
 > [('1', ''), ('2', ''), ('3', ''), ('4', '')]

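The trailing (.*?) in re1 is non-greedy and has nothing after it, so it
always settles for the empty string. Capturing everything up to the next
"<" with [^<]* instead worked for me: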
 >>> re1 = re.compile("<a[^>]+>([0-9]+?)</a>([^<]*)")
 >>> print re.findall(re1,text)
[('1', ' abc '), ('2', ' def '), ('3', ' ghi '), ('4', ' jkl')]
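
For anyone who wants to try this outside the interactive prompt, here is
a minimal self-contained sketch. The sample text is retyped from Luke's
message on a single line, which is an assumption about his real input,
and the names lazy, fixed and lookahead are only illustrative. The
lookahead variant is not from the thread; it is one possible way to keep
the original lazy group by telling it where to stop without consuming
the next tag.

```python
import re

# Sample text from the question, assumed to be on a single line.
text = ('<a href="foo1">1</a> abc <a href="foo2">2</a> def '
        '<a href="foo3">3</a> ghi <a href="foo4">4</a> jkl')

# Original pattern: the final lazy group has nothing after it,
# so it matches as little as possible, i.e. the empty string.
lazy = re.compile(r'<a .*?>([0-9]+?)</a>(.*?)')

# Pattern from the reply: take every character up to the next "<".
fixed = re.compile(r'<a[^>]+>([0-9]+?)</a>([^<]*)')

# Alternative sketch: keep the lazy group but stop it with a lookahead
# at the next "<a " or at the end of the string.
lookahead = re.compile(r'<a .*?>([0-9]+?)</a>(.*?)(?=<a |$)')

lazy_matches = lazy.findall(text)
fixed_matches = fixed.findall(text)
lookahead_matches = lookahead.findall(text)

print(lazy_matches)
print(fixed_matches)
print(lookahead_matches)
```

Note that [^<]* also matches newlines, while the lookahead version would
need re.DOTALL if the text between links spans several lines.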

Best regards,
	Nicola Paolucci




