regex: im getting better
Duncan Booth
duncan at NOSPAMrcp.co.uk
Thu Oct 3 04:26:20 EDT 2002
":B nerdy" <thoa0025 at mail.usyd.edu.au> wrote in
news:uDLm9.21074$kd3.60008 at news-server.bigpond.net.au:
> $pattern = '|<input(\s+([^=>]*)="([^"]*)")*>|ism';
>
> i'd like to match all the input tags's but also in a subexpression,
> i'd like to match each of the parameters in the format
> parameter_name="parameter_value"
> where parameter_name and parameter_value are strings
>
> my pattern doesnt work, it only matches the last parameter, whats
> wrong with my pattern? and can someone show me how one would match my
> description above?
>
> cheers
>
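On the question as asked: in most regex engines, Python's re module included, a capturing group that sits inside a repetition keeps only the text from its final iteration, which is why only the last parameter shows up. The sketch below is a rough translation of the pattern into Python 3 (the delimiters and the i/s flags are mapped by hand, and the names tag_re and attr_re are made up for illustration). It shows the behaviour and one common workaround: match the tag first, then scan its attribute text separately.

```python
import re

html = '<input x="1" y="2">\n<input p="q" r="s">'

# The group inside (...)* is overwritten on each iteration,
# so only the last attribute of each tag is left in groups 2 and 3.
repeated = re.compile(r'<input(\s+([^=>]*)="([^"]*)")*>', re.I | re.S)
last_only = [m.group(2, 3) for m in repeated.finditer(html)]

# Two-step alternative: grab each tag's attribute text,
# then collect every name="value" pair inside it.
tag_re = re.compile(r'<input\b([^>]*)>', re.I | re.S)
attr_re = re.compile(r'([^\s=>]+)="([^"]*)"')
all_attrs = [attr_re.findall(m.group(1)) for m in tag_re.finditer(html)]

print(last_only)
print(all_attrs)
```

This only handles double-quoted values and well-behaved markup, which is part of why a real parser is the safer choice.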
Personally I wouldn't even consider using regular expressions for a parsing
task like this. Try the code below instead:
import sgmllib

class MyParser(sgmllib.SGMLParser):
    def do_input(self, attributes):
        print "Input tag",attributes

if __name__=='__main__':
    data = '''
<html>
<body>
<input x="1" y="2">
<input p="q" r="s">
</body>
</html>
'''
    parser = MyParser()
    parser.feed(data)
    parser.close()
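The sgmllib module was removed in Python 3.0, so the code above only runs on Python 2. A rough Python 3 equivalent, assuming the standard library's html.parser is an acceptable substitute (the class name InputParser and the inputs list are illustrative, not part of the original), might look like this:

```python
from html.parser import HTMLParser

class InputParser(HTMLParser):
    """Collects the attributes of every <input> tag it sees."""

    def __init__(self):
        super().__init__()
        self.inputs = []  # one list of (name, value) pairs per tag

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this hook.
        if tag == "input":
            self.inputs.append(attrs)

data = '''
<html>
<body>
<input x="1" y="2">
<input p="q" r="s">
</body>
</html>
'''

parser = InputParser()
parser.feed(data)
parser.close()

for attrs in parser.inputs:
    print("Input tag", attrs)
```

Unlike SGMLParser, HTMLParser has no per-tag do_xxx hooks, so the tag name is checked inside handle_starttag instead.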
--
Duncan Booth duncan at rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?