help with cr in reg exp...

Peter Otten __peter__ at web.de
Sat Jan 17 06:35:32 EST 2004


GrelEns wrote:

> hello,
> 
> I had some trouble with re that I don't understand (this is just a silly
> example; to parse HTML I use sgmllib).
> Given this string:
> 
> >>> s = """<form name="test" method="post" action="test.php">
> <input type="text" name="title" size="1." value="test...">
> <br>
> <a href="help.php">help</a>
> </form>"""
> 
> why do I get:
> 
> >>> p = re.compile("(?=<form|<FORM).*(?=</form>|</FORM>)"); p.findall(s)
> []
> 
> while I was expecting this kind of behaviour:
> ['form name="test" method="post" action="test.php">\n<input type="text"
> name="title" size="1." value="test...">\n<br>\n<a
> href="help.php">help</a>']
> 
> which is nearly what I get with:
> >>> p = re.compile("(?=<form|<FORM).*(?=</form>|</FORM>)");
> p.findall(s.replace('\n', ''))
> ['<form name="test" method="post" action="test.php"><input type="text"
> name="title" size="1." value="test..."><br><a href="help.php">help</a> ']
> 
> It looks like \n isn't matched by . (dot) in my re, while I thought (and
> need) it should. I must be missing something.
> 
> thanks!

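Try re.compile(yourpattern, re.DOTALL). By default the dot matches any
character except a newline; with re.DOTALL it matches newlines as well.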
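
A minimal sketch of the difference, reusing the string and pattern from the
quoted message. The names pattern, without_dotall and with_dotall are only
illustrative and do not appear in the original post.

```python
import re

# The sample string from the question, newlines included.
s = """<form name="test" method="post" action="test.php">
<input type="text" name="title" size="1." value="test...">
<br>
<a href="help.php">help</a>
</form>"""

# The pattern from the question, unchanged.
pattern = "(?=<form|<FORM).*(?=</form>|</FORM>)"

# Default flags: "." stops at each newline, so ".*" cannot reach
# the closing tag on a later line and nothing matches.
without_dotall = re.compile(pattern).findall(s)

# re.DOTALL (also spelled re.S): "." matches newlines too,
# so ".*" can span the whole multi-line form.
with_dotall = re.compile(pattern, re.DOTALL).findall(s)

print(without_dotall)
print(with_dotall)
```

Both lookaheads are zero-width, so the match should start at the opening
"<form" and stop just before "</form>", keeping the newline that precedes the
closing tag.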

Peter



