[XML-SIG] ignoring "Undeclared entity" errors?

Andrew Clover and-xml at doxdesk.com
Tue Dec 7 18:49:53 CET 2004


Steven Knight <knight at baldmt.com> wrote:

> When I try to process an individual fragment without the declarations,
> I of course get "Undeclared entity" fatal errors for these entities,
> which terminates parsing of the fragment.

Yep. I think this comes from expat itself, so it might not be avoidable 
without using/subclassing a different parser, e.g. xmlproc.
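
For reference, the failure can be reproduced directly against expat, with no SAX layer involved. This is a minimal sketch assuming a current Python 3 and its bundled pyexpat; the `parse_fragment` helper and the sample fragment are made up for illustration.

```python
from xml.parsers import expat
from xml.parsers.expat import errors


def parse_fragment(text):
    """Parse text with a fresh expat parser.

    Returns the ExpatError that stopped parsing, or None on success.
    """
    parser = expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True: this is the final chunk of input
    except expat.ExpatError as err:
        return err
    return None


# A fragment that uses an entity with no declaration anywhere in sight.
err = parse_fragment("<para>caf&eacute;</para>")

if err is not None:
    # err.code is expat's numeric error code; errors.messages maps it to text.
    print(err.code, errors.messages[err.code])
```

The error is raised inside expat before any handler sees the reference, which is why a handler on top of an expat-based SAX reader cannot simply ignore it.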

If you aren't tied to SAX, the pxdom parser will cope with undeclared 
entities. (If you use a DOM Level 3 ErrorHandler it'll receive a 
DOMError 'pxdom-unbound-entity' with severity WARNING.)

> Having to declare a DTD just to be able to perform some specific
> preprocessing on an otherwise well-formed fragment of XML feels way
> too heavyweight to me.

For me too - pxdom's lenient behaviour was originally intended to allow 
PXTL to pass entities through without having to define them in a 
separate doctype for 'target doctype plus PXTL'.

Other Python DOM tools don't really support the idea of keeping hold of 
unexpanded entity references, so they can't do anything but complain if 
they get an undeclared one.

However, note that in certain circumstances (roughly: when the parser 
can know that there are no unprocessed DTD declarations, for example 
when there is no DTD at all) undeclared entities are a well-formedness 
error instead of a validity error. So technically your document might 
not be well-formed, which would be a Bad Thing.
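
The distinction can be observed with expat itself. The sketch below assumes a current Python 3 with its bundled pyexpat and default settings (external subsets are not loaded); `skipped_entities` is a hypothetical helper and `para.dtd` is a made-up system identifier that is never fetched.

```python
from xml.parsers import expat


def skipped_entities(text):
    """Parse text with expat; return the names of entity references it skipped."""
    skipped = []
    parser = expat.ParserCreate()
    # Called for references expat cannot resolve but is not entitled to reject.
    parser.SkippedEntityHandler = lambda name, *rest: skipped.append(name)
    parser.Parse(text, True)
    return skipped


body = "<para>caf&eacute;</para>"

# No DTD at all: the parser knows nothing is left unread, so the
# undeclared reference is a fatal well-formedness error.
try:
    skipped_entities(body)
    no_dtd_result = "parsed"
except expat.ExpatError as err:
    no_dtd_result = "fatal: %s" % err

# An external subset is named but not read: the declaration might be in
# there, so the reference is reported as skipped rather than rejected.
with_doctype = skipped_entities('<!DOCTYPE para SYSTEM "para.dtd">' + body)

print(no_dtd_result)
print(with_doctype)
```

This only shows where the parser draws the line; it does not make a declaration-free fragment well-formed.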

In summary, entities suck, make everything harder, and should have been 
left out of XML completely. </controversy>

-- 
Andrew Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/


