[XML-SIG] Re: [4suite] (no subject)
Sylvain Thénault
Tue, 27 Aug 2002 18:15:01 +0200
On Tuesday 27 August at 18:04, Alexandre wrote:
> On Tue, Aug 27, 2002 at 05:30:15PM +0200, Tommy Sundström wrote:
> > Newbie question.
>
> Hi, this is not the right mailing list. You should ask questions about
> pyxml on the xml-sig mailing list (cc'ed to this answer)
>
> > Running this code:
> >
> > ---
> > import xml.dom.ext.reader.Sax2
> > from xml.dom.ext import PrettyPrint
> >
> > str = '''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
> >
> > <h3><a href='page.htm'>Text</a></h3>'''
> >
> > doc = xml.dom.ext.reader.Sax2.FromXml(str)
> > PrettyPrint(doc.documentElement)
> > ---
> >
> > gives this result:
> > ---
> > <h3><a href='page.htm' shape='rect'>Text</a></h3>
> > ---
>
> This is perfectly normal
>
> > My question: where does the "shape='rect'" come from? (It's not added
> > unless the DOCTYPE element is there.)
>
> It comes from the DTD. Download it from the url in the doctype and see
> for yourself that the <a> element has a shape attribute with a default
> value of 'rect'.
>
> > Can it do any harm? Is there any way of suppressing it?
The parser is responsible for entity substitution, so you can
suppress it, but you have to pass a properly configured parser
instance to FromXml:
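from xml.sax import make_parser
from xml.sax.handler import feature_external_ges, feature_external_pes

# Turn off loading of external general and parameter entities
parser = make_parser()
parser.setFeature(feature_external_ges, 0)
parser.setFeature(feature_external_pes, 0)
doc = xml.dom.ext.reader.Sax2.FromXml(str, parser=parser)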
See http://www.python.org/doc/current/lib/module-xml.sax.handler.html
for a description of the different features.
Note that not all parsers recognize those features (pyexpat does,
but xmlproc does not).
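
Since not every parser accepts these features, it may be safer to set
them defensively. What follows is only a minimal sketch using the
standard xml.sax module; the helper name
make_parser_without_external_entities is made up for this example and
is not part of PyXML or 4Suite. It assumes that a driver which rejects
a feature raises SAXNotRecognizedException or SAXNotSupportedException.

```python
from xml.sax import (
    make_parser,
    SAXNotRecognizedException,
    SAXNotSupportedException,
)
from xml.sax.handler import feature_external_ges, feature_external_pes


def make_parser_without_external_entities():
    """Hypothetical helper: return (parser, skipped).

    parser  -- a SAX parser with external entity loading turned off
               wherever the underlying driver allows it
    skipped -- the feature names the driver refused to set
    """
    parser = make_parser()
    skipped = []
    for feature in (feature_external_ges, feature_external_pes):
        try:
            parser.setFeature(feature, 0)
        except (SAXNotRecognizedException, SAXNotSupportedException):
            # This driver does not know the feature, or cannot change it.
            skipped.append(feature)
    return parser, skipped


parser, skipped = make_parser_without_external_entities()
```

The resulting parser can then be handed to FromXml through its parser
argument as above. If skipped is not empty, the default attributes may
still show up in the output.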
--
Sylvain Thénault
LOGILAB http://www.logilab.org