Stripping HTML with RE

Steveo stephen_p_barrett at hotmail.com
Tue Nov 9 18:06:35 EST 2004


I am currently stripping HTML from a string with the following code. 
(I know it's not the best way to strip HTML but bear with me)

re.compile("<.*?>")

I wanted to allow all H1 and H2 tags, so I changed it to:

re.compile("<[^H1|^H2]*?>")

This seemed to work, but it also allowed the HTML tag (basically anything
with an H or a 1 or a 2). How can I get this to strip all tags except
H1 and H2? Any help you could give would be great.
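
To make the behaviour concrete, here is a small sketch that reproduces it and tries one possible alternative. The reason the second pattern lets the HTML tag through is that [^H1|^H2] is a character class: it matches any single character other than H, 1, |, ^ or 2, rather than "not H1 or H2". The negative-lookahead pattern, the names broken and keep_headings, and the sample string are all assumptions made for illustration, and this is still not a robust way to process HTML.

```python
import re

# The second pattern from above: [^H1|^H2] is a character class, so any tag
# containing H, 1, |, ^ or 2 can never match and is therefore left in place.
broken = re.compile(r"<[^H1|^H2]*?>")

# Assumed alternative: a negative lookahead that refuses to match a tag whose
# name is exactly h1 or h2 (opening or closing, in either letter case).
keep_headings = re.compile(r"<(?!/?h[12]\b)[^>]*>", re.IGNORECASE)

# Hypothetical input used only for this illustration.
sample = "<HTML><H1>Title</H1><p>Some <b>bold</b> text</p><H2>Sub</H2></HTML>"

broken_result = broken.sub("", sample)
fixed_result = keep_headings.sub("", sample)

print(broken_result)  # <HTML><H1>Title</H1>Some bold text<H2>Sub</H2></HTML>
print(fixed_result)   # <H1>Title</H1>Some bold text<H2>Sub</H2>
```

The \b after [12] is there so that a tag name such as h10 is not mistaken for h1 and is still stripped.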

Steve


