Does Python mess with CRLFs?
Irmen de Jong
irmen.NOSPAM at xs4all.nl
Wed Nov 12 13:52:13 EST 2008
Gilles Ganault wrote:
> Hello
>
> I'm stuck at understanding why Python can't extract some bit from an
> HTML file using regexes, although I can find it just fine with
> UltraEdit.
>
> #BAD
> friends = re.compile('</td></tr></table>\r\n</div>\r\n',re.IGNORECASE
> | re.MULTILINE | re.DOTALL)
If you keep running into trouble and you're sure the newlines are the cause,
it may help to match any run of whitespace with \s* instead of a literal \r\n:
re.compile('</td></tr></table>\\s*</div>\\s*', .... )
Other than that, it's hard to say what isn't working as expected without knowing
the exact contents of the "content.html" file you're searching in.
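To illustrate the \s* suggestion above, here is a minimal sketch. The two sample strings are made up for the example (I don't know what your real file contains); they hold the same fragment, once with CRLF and once with LF line endings. The strict pattern only matches the CRLF version, while the \s* pattern matches both:

```python
import re

# Hypothetical sample data, not the real content.html: the same HTML
# fragment with Windows (CRLF) and with Unix (LF) line endings.
crlf_html = "<table><tr><td>x</td></tr></table>\r\n</div>\r\n"
lf_html = "<table><tr><td>x</td></tr></table>\n</div>\n"

# Pattern that insists on literal CRLF line endings.
strict = re.compile(r"</td></tr></table>\r\n</div>\r\n", re.IGNORECASE)

# Pattern that accepts any run of whitespace (including none) instead.
tolerant = re.compile(r"</td></tr></table>\s*</div>\s*", re.IGNORECASE)

strict_crlf = strict.search(crlf_html) is not None
strict_lf = strict.search(lf_html) is not None
tolerant_crlf = tolerant.search(crlf_html) is not None
tolerant_lf = tolerant.search(lf_html) is not None

print(strict_crlf, strict_lf)      # True False
print(tolerant_crlf, tolerant_lf)  # True True
```

If the strict pattern fails on your data, printing repr() of a slice of the string you actually search shows whether it contains \r\n or just \n.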
--irmen