Handling bad tags with SGMLParser

Duncan Booth duncan at NOSPAMrcp.co.uk
Fri Mar 8 03:37:14 EST 2002


ken at ineffable.com (Ken Causey) wrote in 
news:6c80b2a1.0203070848.6a25b499 at posting.google.com:

> I'm running into a problem with sgmllib in Python 2.1.2 with tags of
> the form:
> 
> <![blah]> where 'blah' could be anything.
> 
This is mostly just a 'me too' post, but I have recently been running into 
the same problem. I have a script I use to clean up 'html' produced by 
Microsoft Word, and these days I have to preprocess the html to remove the 
bad directives before parsing it.
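
For illustration, the preprocessing step could look roughly like the sketch
below. It is not the actual script. It assumes the bad directives always have
the form <![...]> with no ']' or '>' inside the brackets, and the names
BAD_DIRECTIVE and strip_bad_directives are made up for this example. It is
written for a current Python rather than 2.1.

```python
import re

# Assumed shape of the bad tags: "<![", any text without ']' or '>', then "]>".
# Examples seen in Word output: <![if !supportLists]> and <![endif]>.
# CDATA sections (<![CDATA[ ... ]]>) end in "]]>" and are not matched.
BAD_DIRECTIVE = re.compile(r"<!\[[^\]>]*\]>")


def strip_bad_directives(html):
    """Return html with every <![...]> directive removed (hypothetical helper)."""
    return BAD_DIRECTIVE.sub("", html)


sample = "<p><![if !supportLists]>1.<![endif]>First item</p>"
cleaned = strip_bad_directives(sample)
print(cleaned)
```

The cleaned string can then be fed to the parser as usual. Ordinary comments
and normal tags are left alone, since the pattern only matches text that
starts with "<![".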


-- 
Duncan Booth                                             duncan at rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?


