Regular Expression help

John Bokma john at castleamber.com
Thu Apr 27 18:50:37 EDT 2006


Edward Elliott <nobody at 127.0.0.1> wrote:

> johnzenger at gmail.com wrote:
>> If you are parsing HTML, it may make more sense to use a package
>> designed especially for that purpose, like Beautiful Soup.
> 
> I don't know Beautiful Soup, but one advantage regexes have over some
> parsers is handling malformed html.  Omitted closing tags can wreak
> havoc. Regexes can also help if you only want elements
> preceded/followed by a certain sibling or cousin in the parse tree. 
> It all depends on what you're trying to accomplish.  In general
> though, yes parsers are better suited to extracting from markup.

A parser can be written in such a way that it doesn't give up on malformed
HTML. That's probably easier than coming up with regexes that handle even
well-formed HTML. (And that's coming from a Perl programmer ;-) )
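
A minimal sketch of the point about forgiving parsers, assuming Python 3's
standard-library html.parser module (nobody in this thread mentioned it, and
on older Pythons the module is named HTMLParser). The LinkCollector class,
the extract_links helper and the sample input are hypothetical names made up
for illustration. The parser is event driven, so it reports each start tag
as it sees it and does not care whether earlier tags were ever closed:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, whether or not tags are closed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called once per start tag; tag names arrive lowercased and
        # attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return the href of every <a> start tag found in html."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.links


# Hypothetical malformed input: <p>, <b> and <li> are never closed,
# and the last <a> has no closing tag either.
broken = '<p>See <b><a href="http://example.com/one">one</a><li><a href="/two">two'
print(extract_links(broken))
```

This only shows that omitted closing tags need not derail a parser. It does
not rebuild a parse tree, so questions about siblings or cousins would still
need a tree-building library on top.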

-- 
John                               MexIT: http://johnbokma.com/mexit/
                           personal page:       http://johnbokma.com/
        Experienced programmer available:     http://castleamber.com/
            Happy Customers: http://castleamber.com/testimonials.html


