Reg strip_tags function in Python.

Dave Benjamin dave.benjamin at gmail.com
Sat May 7 12:36:30 EDT 2005


praba kar wrote:
> In PHP I can use the strip_tags() function to strip out
> all HTML tags. I want to know the equivalent of
> strip_tags() in Python.

Here's a simple function based on Python's HTMLParser class. If you need 
to reject only certain tags, you'll probably want to subclass 
HTMLParser, but for simply stripping all tags, this works:

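try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2

def strip_tags(html):
    result = []
    parser = HTMLParser()
    # Collect each run of text between tags; the tags themselves are ignored.
    parser.handle_data = result.append
    parser.feed(html)
    parser.close()  # flush any text still buffered in the parser
    return ''.join(result)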
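
If you do need to reject only certain tags, a subclass along these lines
might be a starting point. It is only a sketch, assuming Python 3 (where
the class lives in html.parser). TagFilter, strip_selected and the
"rejected" argument are names made up for illustration, and comments and
doctype declarations are simply dropped:

```python
from html.parser import HTMLParser  # Python 3 module path


class TagFilter(HTMLParser):
    """Hypothetical helper: drop only the tags listed in `rejected`."""

    def __init__(self, rejected):
        # convert_charrefs=False routes entities such as &amp; to the
        # handle_entityref/handle_charref hooks so they can be echoed back.
        HTMLParser.__init__(self, convert_charrefs=False)
        self.rejected = set(tag.lower() for tag in rejected)
        self.result = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.rejected:
            # get_starttag_text() returns the tag as it appeared in the input.
            self.result.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br/>: emit once, without a separate end tag.
        if tag not in self.rejected:
            self.result.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in self.rejected:
            self.result.append('</%s>' % tag)

    def handle_data(self, data):
        self.result.append(data)

    def handle_entityref(self, name):
        self.result.append('&%s;' % name)

    def handle_charref(self, name):
        self.result.append('&#%s;' % name)


def strip_selected(html, rejected):
    """Return `html` with only the `rejected` tags removed."""
    parser = TagFilter(rejected)
    parser.feed(html)
    parser.close()
    return ''.join(parser.result)
```

With this, strip_selected('<p>Hello <b>world</b></p>', ['b']) should give
back '<p>Hello world</p>': the text inside a rejected tag is kept, only
the tag markup goes away.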

See the docs for more details on HTMLParser:
http://docs.python.org/lib/module-HTMLParser.html

Dave


