Is this a bug in sgmllib.py?
Sean 'Shaleh' Perry
shalehperry at home.com
Thu Oct 18 01:39:24 EDT 2001
On 18-Oct-2001 limodou wrote:
> Sometimes I use Python to analyse an HTML document. But I found that if
> a tag starts with '<!' rather than '<!--', sgmllib will treat it as a
> 'special' pattern. That is mostly fine, but it occasionally fails, because
> some people use a '<!' tag as a comment. I fixed it by treating all
> '<!' tags as comments, but this loses declarations like DOCTYPE. Does
> anyone have any ideas?
at the start:

    special = re.compile('<![^<>]*>')

then later:

    match = special.match(rawdata, i)
    if match:
        if self.literal:
            self.handle_data(rawdata[i])
            i = i+1
            continue
        i = match.end(0)
        continue

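So, going by this code, a '<!...>' match is skipped unless self.literal is
set, in which case it is passed on through handle_data(). If you want to
handle <!DOCTYPE>, it needs to be done in a data handler.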
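
One way to keep declarations while still treating other '<!' tags as
comments might be to look at what follows the '<!'. This is only a minimal
sketch, assuming a declaration always has a letter right after '<!' (as in
<!DOCTYPE ...>). The `declaration` pattern and split_specials() are made-up
names for illustration, not part of sgmllib; only `special` is taken from
the code above.

```python
import re

# Same pattern as the one quoted from sgmllib.
special = re.compile('<![^<>]*>')

# Hypothetical helper pattern (not in sgmllib): assume a declaration is
# '<!' immediately followed by a letter, e.g. <!DOCTYPE ...>.
declaration = re.compile('<![A-Za-z]')


def split_specials(rawdata):
    """Sort every `special` match into declarations or comments."""
    declarations, comments = [], []
    for match in special.finditer(rawdata):
        text = match.group(0)
        if declaration.match(text):
            declarations.append(text)
        else:
            # Anything else starting with '<!' is treated as a comment here.
            comments.append(text)
    return declarations, comments


sample = ('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">'
          '<html><! a note ><p>hi</p></html>')
decls, notes = split_specials(sample)
```

With the sample above, the DOCTYPE ends up in decls and '<! a note >' in
notes. The same test could be applied at the point where sgmllib matches
`special`, but where exactly to hook it in depends on the sgmllib version.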
-----
We have buried the putrid corpse of Liberty. -- Benito Mussolini