[Tutor] regex problem
Alan Gauld
alan.gauld at freenet.co.uk
Wed Jan 5 23:13:55 CET 2005
> > Using regex to remove HTML is usually the wrong approach unless
>
> Thanks. This is one of those projects I've had in mind for a long
> time, decided it was a good way to learn some python.
It's a good way to write increasingly complex regexes! Basically,
because HTML is recursive in nature (elements nest inside other
elements), it is almost impossible to parse HTML files reliably
with regex. (The latest regex syntax can cope with recursion, but
it's horribly complicated.)
So unless you accept the limitations of the method, you may
well become more frustrated by the regex stuff than you
become experienced in Python.
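
To make the nesting problem concrete, here is a rough sketch rather than a
recommended solution. It assumes a current Python 3, where the standard
library parser lives in html.parser; the SAMPLE string and the OuterDivText
class are made up for illustration. The regex stops at the first closing
tag it sees, while a parser that counts nesting depth does not.

```python
import re
from html.parser import HTMLParser

# Hypothetical input: one div nested inside another.
SAMPLE = "<div>outer <div>inner</div> tail</div>"

# Non-greedy regex: the match ends at the first </div>, which closes
# the inner element, so the captured text is cut short.
match = re.search(r"<div>(.*?)</div>", SAMPLE)
regex_result = match.group(1)


class OuterDivText(HTMLParser):
    """Collects all text found inside div elements, at any depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many divs are currently open
        self.parts = []   # text fragments seen while inside a div

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


parser = OuterDivText()
parser.feed(SAMPLE)
parser.close()
parser_result = "".join(parser.parts)

print(repr(regex_result))   # 'outer <div>inner'
print(repr(parser_result))  # 'outer inner tail'
```

A greedy pattern would happen to work on this one string, but it would
then swallow everything between two sibling divs, so it only moves the
problem elsewhere.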
Alan G.
"When all you have is a hammer everything looks like a nail"