regular expressions: grabbing variables from multiple matches

Heather Lynn White hwhite at chiliad.com
Wed Jan 3 19:16:49 EST 2001


Suppose I have a regular expression to grab all variations on a meta tag,
and I want to extract the name and content values from each match.

I use the following re:

import re

MetaTag=re.compile(
r'''<\s*?(meta|META)\s+?(name|NAME)\s*?=\s*?"(?P<name>.*?)"\s*?(content|CONTENT)\s*?=\s*?"(?P<content>.*?)"\s*?>'''
)

Now suppose I have an HTML document and I want to iterate through all the
meta tags in that document. If I only wanted to catch one, I would say

# search() scans the whole body; match() would only try at position 0
matches=MetaTag.search(body)
if matches:
	flds=matches.groupdict()
	name=flds["name"]
	content=flds["content"]
	print name, content

But this does not work if I use findall instead to get multiple matches,
because findall returns a list of matched strings (or tuples of group
strings, when the pattern has groups) rather than a list of match objects,
unlike the other functions.  Is there a way to extract these variables as I
have done above, but for many matches?

-heather
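
For illustration, here is a minimal sketch of one way the loop asked about
above could look. It assumes a Python version whose re module provides
finditer, which yields one match object per match, so groupdict() can be
used just as in the single-match case. The pattern is a simplified stand-in
for MetaTag (the explicit name attribute and the IGNORECASE flag are
assumptions), and META_TAG, meta_fields, and the sample document are
hypothetical names and data made up for this example.

```python
import re

# Simplified stand-in for the MetaTag pattern; the explicit "name"
# attribute and the IGNORECASE flag are assumptions made for this sketch.
META_TAG = re.compile(
    r'<\s*meta\s+name\s*=\s*"(?P<name>.*?)"\s*content\s*=\s*"(?P<content>.*?)"\s*/?>',
    re.IGNORECASE,
)

def meta_fields(body):
    """Return a list of (name, content) pairs, one per meta tag in body."""
    pairs = []
    # finditer yields a match object for each non-overlapping match,
    # so the named groups are available through groupdict().
    for m in META_TAG.finditer(body):
        flds = m.groupdict()
        pairs.append((flds["name"], flds["content"]))
    return pairs

# Hypothetical sample document, used only to exercise the function.
sample = '''<html><head>
<meta name="author" content="someone">
<META NAME="keywords" CONTENT="python, regex">
</head></html>'''

fields = meta_fields(sample)
```

On a Python without finditer, a similar loop could presumably be built by
calling the compiled pattern's search(body, pos) repeatedly, starting each
search at the end() of the previous match.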












