Stripping scripts from HTML with regular expressions
Nikita the Spider
NikitaTheSpider at gmail.com
Thu Apr 10 11:12:36 EDT 2008
In article <mailman.161.1207771905.17997.python-list at python.org>,
"Reedick, Andrew" <jr9445 at ATT.COM> wrote:
> > -----Original Message-----
> > From: python-list-bounces+jr9445=att.com at python.org [mailto:python-
> > list-bounces+jr9445=att.com at python.org] On Behalf Of Michel Bouwmans
> > Sent: Wednesday, April 09, 2008 3:38 PM
> > To: python-list at python.org
> > Subject: Stripping scripts from HTML with regular expressions
> >
> > Hey everyone,
> >
> > I'm trying to strip all script-blocks from a HTML-file using regex.
> >
>
> [Insert obligatory comment about using a html specific parser
> (HTMLParser) instead of regexes.]
Yes, seconded. To the OP: use BeautifulSoup or HtmlData rather than regexes,
unless you like to reinvent wheels.
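
For what it's worth, here is a rough sketch of the parser-based approach, using only the standard library's HTMLParser (imported here from html.parser, as in Python 3). The names ScriptStripper and strip_scripts are made up for illustration, and this is not a complete HTML serializer: it re-emits tags, text, entities, comments and declarations, and drops script elements along with their contents. Treat it as a starting point, not a finished tool.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Re-emit HTML while dropping <script> elements and their contents.

    Hypothetical helper class for illustration only.
    """

    def __init__(self):
        # convert_charrefs=False so entities reach the handlers unconverted
        # and can be written back out exactly as they appeared.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        elif not self.in_script:
            # get_starttag_text() returns the tag's original source text.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>; a self-closing script is dropped.
        if tag != "script" and not self.in_script:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
        elif not self.in_script:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Script bodies arrive here as plain data, so skip them.
        if not self.in_script:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def strip_scripts(html):
    """Return html with every script block removed (illustrative helper)."""
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling strip_scripts(page_source) on a string of HTML returns the markup without the script blocks. The point of going through a parser is that it already knows where a script element starts and ends, which is the part a regex tends to get wrong on real-world pages.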
--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more