[XML-SIG] xml.parsers.expat not converting aliased CDATA elements

Fred L. Drake, Jr. fdrake@acm.org
Fri, 25 May 2001 16:50:59 -0400 (EDT)


Martin v. Loewis writes:
 > Fred already mentioned the default handler, but I'd like you to
 > reconsider your request: &amp; and & are really the same thing; one is
 > marked-up, the other is not.

  I only wish it were that easy!  In cases where you want to preserve
the input as much as possible, it can be important to distinguish
between an internal entity reference and the expansion:

<!DOCTYPE doc [
    <!ENTITY MyEmployer "Digital Creations">
]>
<doc>&MyEmployer;</doc>

  Now, if I want to load the document into a DOM, modify a few things,
and dump it back out for further human editing, I want the entity
references intact.  With Expat, the only way I've found to do this is
to use the DefaultHandler to capture this information.  Whether the
text is expanded directly or made a child of an entity reference node
should be determined by the application.  The DOM Level 3
Load/Save working draft has knobs to control this behavior.
  (If anyone knows a way to determine whether a document contains &lt;,
&#60;, &#x3c;, or &#x3C;, I'd love to hear about it!)
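
  For the archives, here is a rough sketch of the DefaultHandler
approach, written against the xml.parsers.expat module as it exists
in current Python 3.  The RefCollector class, its event tuples, and
the parse() helper are made up for illustration; only the handler
attribute names come from the module itself.  It relies on Expat's
documented behaviour that setting DefaultHandler (as opposed to
DefaultHandlerExpand) turns off expansion of internal general entities
and passes the raw reference to the default handler instead.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE doc [
    <!ENTITY MyEmployer "Digital Creations">
]>
<doc>&MyEmployer;</doc>"""


class RefCollector:
    """Hypothetical helper that records parse events as tuples."""

    def __init__(self):
        self.events = []
        self.depth = 0  # > 0 while inside the document element

    def start(self, name, attrs):
        self.depth += 1
        self.events.append(("start", name))

    def end(self, name):
        self.depth -= 1
        self.events.append(("end", name))

    def chars(self, data):
        self.events.append(("text", data))

    def default(self, data):
        # The default handler also sees the DOCTYPE, comments and so on,
        # so only keep "&name;" tokens that occur in element content.
        if self.depth > 0 and data.startswith("&") and data.endswith(";"):
            self.events.append(("entity_ref", data[1:-1]))


def parse(text, preserve_refs):
    collector = RefCollector()
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = collector.start
    parser.EndElementHandler = collector.end
    parser.CharacterDataHandler = collector.chars
    if preserve_refs:
        # Assumption: no SkippedEntityHandler is installed, so the
        # unexpanded reference is routed to the default handler.
        parser.DefaultHandler = collector.default
    parser.Parse(text, True)
    return collector.events


if __name__ == "__main__":
    print(parse(DOC, preserve_refs=True))
    print(parse(DOC, preserve_refs=False))
```

  With preserve_refs set, the reference should show up as its own
event rather than as text, and the application can then decide whether
to expand it or build an entity reference node.  Note that this does
nothing for the predefined entities or character references mentioned
above; those still reach the CharacterDataHandler already expanded.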


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations