[XML-SIG] Exceptions on undefined character entities

Thomas B. Passin tpassin@home.com
Fri, 1 Feb 2002 09:53:00 -0500


[Frank McIngvale]
>
> Hi, I stumbled across this while fetching my usual
> rdf/rss files yesterday, and am hoping someone can
> explain what is happening:
>
> newsforge.com gave me a file containing this line:
>    <title>University of Osnabr&uuml;ck, Germany</title>
>
> minidom doesn't like it:
> ...
> Dr. David Mertz pointed out that this works:
>
> >>> s = "<!DOCTYPE title [<!ENTITY uuml '[fakechar]'>]><title>University
> of Osnabr&uuml;ck, Germany</title>"
> >>> minidom.parseString(s)
> <xml.dom.minidom.Document instance at 0x81571c4>
> >>>
>
> So my question is, what is the correct way to handle this? Is
> minidom supposed to handle it, is the caller supposed to provide
> the entities, or is it a bug in the XML file?
>
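It's an XML thing, not a minidom one. If the entity isn't defined (by an
ENTITY declaration or by being one of the five predefined entities: &amp;,
&lt;, &gt;, &apos; and &quot;) then the parser has no way to know what it
stands for, and cannot process it. Minidom is acting correctly; a file
that uses &uuml; without declaring it is not well-formed XML.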
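
To make the difference concrete, here is a minimal sketch of both cases. It
assumes a current Python 3, where minidom sits on top of expat and reports
the problem as xml.parsers.expat.ExpatError. The helper name title_text is
made up for illustration, and the entity value '&#252;' (the character
reference for u-umlaut) is my own choice, not what the feed would have used.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# The line from the feed: &uuml; is used but never declared.
undeclared = "<title>University of Osnabr&uuml;ck, Germany</title>"

# Same element, but the internal DTD subset declares the entity first.
declared = (
    "<!DOCTYPE title [<!ENTITY uuml '&#252;'>]>"
    "<title>University of Osnabr&uuml;ck, Germany</title>"
)


def title_text(xml_string):
    """Parse the string and return the text content of the root element."""
    doc = minidom.parseString(xml_string)
    root = doc.documentElement
    # Join every text-node child, in case the parser splits the text.
    return "".join(
        node.data for node in root.childNodes
        if node.nodeType == node.TEXT_NODE
    )


try:
    title_text(undeclared)
    undeclared_failed = False
    error_message = ""
except ExpatError as exc:
    # The parser rejects the reference because nothing defines it.
    undeclared_failed = True
    error_message = str(exc)

# With the declaration in place the reference is expanded normally.
declared_text = title_text(declared)

print("undeclared entity rejected:", undeclared_failed)
print("parser said:", error_message)
print("declared entity expands to:", ascii(declared_text))
```

If you can't get the feed fixed, prepending a DOCTYPE with the declarations
you need, as above, is one workaround on the caller's side. Writing the
character directly, or as a numeric reference such as &#252;, needs no
declaration at all.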

Tom P