Regex problem
Gustaf Liljegren
gustafl at algonet.se
Mon Oct 15 00:05:55 EDT 2001
[Had trouble with the news server tonight. Sorry if you see this message
more than once.]
I'm trying to match either of the HTML elements <a> or <area>, containing
an 'href' attribute. Here's the regex I've made:
>>> import re
>>> re_link = re.compile(r'<(area|a)[^>]+href=".*"[^>]*/?>', re.I | re.M)
Works fine when I try it on a matching string:
>>> s1 = '<a href="page.html">'
>>> re.match(re_link, s1).group()
'<a href="page.html">'
But as soon as I add a single space in front of the tag, it no longer matches:
>>> s2 = ' <a href="page.html">'
>>> re.match(re_link, s2).group()
Traceback (most recent call last):
  File "<pyshell#20>", line 1, in ?
    re.match(re_link, s2).group()
AttributeError: 'None' object has no attribute 'group'
>>>
Regexes don't always have to match from the beginning! What's wrong here?
Gustaf Liljegren
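
[Follow-up note: a minimal sketch of what is probably going on here, not a
definitive diagnosis. In Python's re module, match() only tries the pattern
at the very start of the string, while search() scans forward for the first
position where it matches. The names m1, m2, m3, match_s1, match_s2 and
search_s2 below are made up for illustration; the pattern and the test
strings are the ones from the post.]

```python
import re

# Same pattern as in the post, compiled once.
re_link = re.compile(r'<(area|a)[^>]+href=".*"[^>]*/?>', re.I | re.M)

s1 = '<a href="page.html">'
s2 = ' <a href="page.html">'

# match() only succeeds if the pattern matches starting at position 0.
m1 = re_link.match(s1)
m2 = re_link.match(s2)   # None: s2 starts with a space, not '<'

# search() scans forward and returns the first match anywhere in the string.
m3 = re_link.search(s2)

# Guard against None before calling .group() to avoid the AttributeError.
match_s1 = m1.group() if m1 else None
match_s2 = m2.group() if m2 else None
search_s2 = m3.group() if m3 else None

print(repr(match_s1))
print(repr(match_s2))
print(repr(search_s2))
```

[If that reading is right, swapping match() for search() should be enough
for the leading-space case. The re.M flag does not change this: match()
still anchors at the start of the string, not at the start of each line.]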