[XML-SIG] Re: [4suite] (no subject)

Sylvain Thénault
Tue, 27 Aug 2002 18:15:01 +0200


On Tuesday 27 August at 18:04, Alexandre wrote:
> On Tue, Aug 27, 2002 at 05:30:15PM +0200, Tommy Sundström wrote:
> > Newbie question.
> 
> Hi, this is not the right mailing list. You should ask questions about
> pyxml on the xml-sig mailing list (cc'ed to this answer)
>  
> > Running this code:
> > 
> > ---
> > import xml.dom.ext.reader.Sax2
> > from xml.dom.ext import PrettyPrint
> > 
> > str = '''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> >     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
> > 
> > <h3><a href='page.htm'>Text</a></h3>'''
> > 
> > doc = xml.dom.ext.reader.Sax2.FromXml(str)
> > PrettyPrint(doc.documentElement)
> > ---
> > 
> > gives this result:
> > ---
> > <h3><a href='page.htm' shape='rect'>Text</a></h3>
> > ---
> 
> This is perfectly normal
> 
> > My question: where does the "shape='rect'" come from? (It's not added
> > unless the DOCTYPE element is there.)
> 
> It comes from the DTD. Download it from the url in the doctype and see
> for yourself that the <a> element has a shape attribute with a default
> value of 'rect'.
> 
> > Can it do any harm? Is there any way of suppressing it?

The parser is responsible for entity substitution, so you can 
suppress it, but you have to pass a properly configured parser 
instance to FromXml:

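from xml.sax import make_parser
from xml.sax.handler import feature_external_ges, feature_external_pes

# Don't load external general or parameter entities (including the DTD)
parser = make_parser()
parser.setFeature(feature_external_ges, 0)
parser.setFeature(feature_external_pes, 0)
doc = xml.dom.ext.reader.Sax2.FromXml(str, parser=parser)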

See http://www.python.org/doc/current/lib/module-xml.sax.handler.html
for a description of the different features.
Note that not all parsers recognize those features (pyexpat does,
but xmlproc does not).
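
For anyone who wants to check the effect without PyXML installed, here is a
rough sketch that uses only the standard library's xml.sax (with its default
expat-based parser) rather than xml.dom.ext.reader.Sax2.FromXml. The names
AttributeCollector and parse_without_external_entities are made up for this
illustration. It assumes that, with both features turned off, the parser never
fetches the external DTD, so no default attributes get added:

```python
import io
from xml.sax import make_parser
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)

XHTML = b'''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<h3><a href='page.htm'>Text</a></h3>'''


class AttributeCollector(ContentHandler):
    """Hypothetical helper: records the attribute names seen on each element."""

    def __init__(self):
        super().__init__()
        self.seen = {}

    def startElement(self, name, attrs):
        # attrs holds explicit attributes plus any defaults the parser added.
        self.seen[name] = sorted(attrs.getNames())


def parse_without_external_entities(data):
    """Parse bytes with external entity loading switched off."""
    parser = make_parser()
    # Ask the parser not to load external general or parameter entities,
    # which covers the external DTD subset named in the DOCTYPE.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    handler = AttributeCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.seen


attributes = parse_without_external_entities(XHTML)
print(attributes)
```

If the DTD is not read, the <a> element should report only its href
attribute. A parser that does not recognize these features may raise an
exception from setFeature instead.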

-- 
Sylvain Thénault

  LOGILAB           http://www.logilab.org