[2.5] Regex doesn't support MULTILINE?

Carsten Haese carsten at uniqsys.com
Sat Jul 21 22:40:44 EDT 2007


On Sat, 2007-07-21 at 19:22 -0700, Paul Rubin wrote:
> Carsten Haese <carsten at uniqsys.com> writes:
> > Use an actual HTML parser such as BeautifulSoup
> > (http://www.crummy.com/software/BeautifulSoup/) and your life will be
> > much easier.
> 
> BeautifulSoup is a lot simpler to use than RE's but a heck of a lot
> slower.  I ended up having to use RE's last time I had to scrape a lot
> of pages.

True, but the OP said "extract information from a web page", not "from a
lot of pages." Until BeautifulSoup is actually too slow for that job,
going straight to RE is premature optimization.
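
To make that concrete, here is a rough sketch (my own illustration, not code
from the OP). It uses the standard library's html.parser module (Python 3
naming) as a stand-in for BeautifulSoup so that it runs without installing
anything. PAGE, LinkCollector, links_with_parser and links_with_regex are
made-up names, and the regex is deliberately naive. It only assumes that the
markup may break a tag across lines.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample page: the second <a> tag is split across two lines.
PAGE = """<html><head><title>Example page</title></head>
<body><a href="/a">first</a> <a
   href="/b">second</a></body></html>"""


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, wherever the line breaks fall."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag just opened.
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")


def links_with_parser(html):
    """Extract link targets by actually parsing the markup."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def links_with_regex(html):
    """Naive regex: assumes exactly one space between 'a' and 'href'."""
    return re.findall(r'<a href="([^"]*)"', html)


if __name__ == "__main__":
    print(links_with_parser(PAGE))
    print(links_with_regex(PAGE))
```

The parser version finds both links, while the naive pattern misses the one
whose tag wraps onto a second line. The pattern can of course be patched for
this case, but every patch is one more thing to get right, which is the
reason to start with a parser and switch to RE only once speed is a measured
problem.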

-- 
Carsten Haese
http://informixdb.sourceforge.net




