sgmllib.py not good at handling <br/>
Chris Withers
chrisw at nipltd.com
Mon May 14 10:56:21 EDT 2001
Chris Withers wrote:
>
> Gilles Lenfant wrote:
> >
> > Hmmm.
> >
> > Aren't constructs like <tag/> an XML-specific feature for empty elements?
> > Your sample is XHTML (HTML from XML) rather than traditional HTML (from
> > SGML).
> > AFAIK, SGML empty elements don't need the trailing "/".
> > Try to use xmllib in place of sgmllib (your code will perhaps need some
> > rework).
>
> So is SGML a subset of XML?
>
> This code is for my HTML filtering module:
> http://www.zope.org/Members/chrisw/StripOGram
Damn keyboard ;-)
Anyway, my main concern is preventing people from smuggling dodgy tags
through, like so:
>
> html2safehtml ('Roses <b>are</B> red,<br/<blink>QUACK<//blink> violets '
> '<i>are</i> blue',
> valid_tags=['b','i','br'])
>
> successfully smuggling a <blink>...</blink> inside the result:
>
> 'Roses <b>are</b> red,<br><blink>QUACK</blink> violets <i>are</i> blue'
>
> (Notice that the closing '</i>' is now OK again, and that I had to use
> '<//blink>' in order to get '</blink>'.)
Would xmllib.py be the way to go for this? How fast is that compared to
sgmllib.py?
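
For anyone reading this on Python 3, where both sgmllib and xmllib have been
removed, here is a rough sketch of the whitelist idea using html.parser from
the standard library. It is not StripOGram's code: the names WhitelistFilter,
strip_to_whitelist and VOID_TAGS are made up for illustration, and the choice
to drop every attribute is an assumption. The point is that the output is
rebuilt only from whitelisted tag names and escaped text, so however the
parser interprets '<br/<blink>' or '<//blink>', nothing else can get through.

```python
from html import escape
from html.parser import HTMLParser

# Assumed set of elements that never take a closing tag (illustrative only).
VOID_TAGS = {"br", "hr", "img"}


class WhitelistFilter(HTMLParser):
    """Hypothetical filter: re-emits only whitelisted tags, without attributes."""

    def __init__(self, valid_tags):
        super().__init__(convert_charrefs=True)
        self.valid_tags = {t.lower() for t in valid_tags}
        self.out = []

    def handle_starttag(self, tag, attrs):
        # The tag is rebuilt from its name alone, so anything the parser
        # read as an attribute (including stray markup) is dropped.
        if tag in self.valid_tags:
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        # Void elements such as <br> get no closing tag, which also makes
        # '<br/>' come out as a plain '<br>'.
        if tag in self.valid_tags and tag not in VOID_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text is always escaped, so it can never turn back into a tag.
        self.out.append(escape(data, quote=False))


def strip_to_whitelist(text, valid_tags):
    parser = WhitelistFilter(valid_tags)
    parser.feed(text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


result = strip_to_whitelist(
    "Roses <b>are</B> red,<br/<blink>QUACK<//blink> violets <i>are</i> blue",
    valid_tags=["b", "i", "br"],
)
print(result)
```

Comments and declarations are simply ignored here because their handlers are
not overridden. A real filter would probably want to keep some attributes
(href on links, say), and those would need their own checking.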
cheers,
Chris