[XML-SIG] CDATA sections still not handled

Ken MacLeod ken@bitsko.slc.ut.us
17 Jan 2001 17:32:09 -0600


Matt,

If I understand this thread correctly, it's the common "how do I pass
XML inside XML" question.

CDATA sections are not relevant to this question.  These two XML
fragments are equivalent for all practical purposes:

  <my-tag><![CDATA[Some <tags> &amp; &entities; inside XML]]></my-tag>

  <my-tag>Some &lt;tags> &amp;amp; &amp;entities; inside XML</my-tag>

In both cases your application will see:

  startElement()  with element name 'my-tag'
  characters()    with data "Some <tags> &amp; &entities; inside XML"
  endElement()    with element name 'my-tag'
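
A quick way to check the equivalence yourself, as a minimal sketch
using Python's standard xml.sax module.  The class and function names
(TextCollector, parse_text) are made up for illustration.  Note that a
parser may deliver the text in several characters() calls, so the
chunks are joined before comparing:

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Records element events and gathers all character data."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.chunks = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # May be called more than once for a single run of text.
        self.chunks.append(content)

    def endElement(self, name):
        self.events.append(("end", name))


def parse_text(fragment):
    """Parse a fragment and return (events, joined character data)."""
    handler = TextCollector()
    xml.sax.parseString(fragment.encode("utf-8"), handler)
    return handler.events, "".join(handler.chunks)


cdata_form = "<my-tag><![CDATA[Some <tags> &amp; &entities; inside XML]]></my-tag>"
escaped_form = "<my-tag>Some &lt;tags> &amp;amp; &amp;entities; inside XML</my-tag>"

events_cdata, text_cdata = parse_text(cdata_form)
events_escaped, text_escaped = parse_text(escaped_form)

print(text_cdata == text_escaped)
print(text_cdata)
```

Both forms should report the same events and the same text; the
handler never learns which escaping style the file used.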


That the data "is" XML is also not relevant to this question; it could
be any type of data that contains markup characters.

If you want to "do something with the XML" inside the XML, the easiest
way is to use another instance of a parser to parse the string as XML.
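
A sketch of that two-pass approach with Python's xml.sax, assuming the
embedded string is itself a well-formed document.  The envelope/order
element names and both handler classes are invented for the example:

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Outer pass: gather the character data of the document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


class NameCollector(xml.sax.ContentHandler):
    """Inner pass: record the element names that appear."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


outer_doc = (
    "<envelope><![CDATA[<order><item>tea</item>"
    "<item>milk</item></order>]]></envelope>"
)

# First parser instance: the embedded XML arrives as plain text.
outer = TextCollector()
xml.sax.parseString(outer_doc.encode("utf-8"), outer)
payload = "".join(outer.chunks)

# Second parser instance: parse that text as a document of its own.
inner = NameCollector()
xml.sax.parseString(payload.encode("utf-8"), inner)
inner_names = inner.names

print(payload)
print(inner_names)
```

The same code works unchanged if the outer document escapes the
payload with &lt; and &amp; instead of a CDATA section.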

If you are interested in preserving the fact that the original file
used a CDATA section, rather than entity references, to escape the
markup, I believe SAX2 does provide that information (through the
optional LexicalHandler's startCDATA/endCDATA events), but you need to
evaluate whether or not that really does what you want.  Besides
hiding CDATA sections, a SAX parser is going to normalize a lot of
other details of the original file (line endings and character
references, for example) before it passes the data to you, so you
really can't reproduce the original file exactly.
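
If you do want the CDATA boundaries, here is a hedged sketch for
Python's xml.sax.  It assumes the parser driver accepts the SAX2
lexical-handler property (the bundled expat driver does, as far as I
know; other drivers may not).  CdataAwareHandler and cdata_flags are
hypothetical names, and the no-op methods are there only because the
driver expects a complete lexical handler:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class CdataAwareHandler(ContentHandler):
    """Tags each chunk of text with whether it came from a CDATA section."""

    def __init__(self):
        super().__init__()
        self.in_cdata = False
        self.pieces = []  # list of (came_from_cdata, text)

    def characters(self, content):
        self.pieces.append((self.in_cdata, content))

    # Lexical events (SAX2 LexicalHandler interface).
    def startCDATA(self):
        self.in_cdata = True

    def endCDATA(self):
        self.in_cdata = False

    def comment(self, text):
        pass

    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass


def cdata_flags(fragment):
    """Return the set of CDATA flags seen on the fragment's text."""
    handler = CdataAwareHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    # Assumption: the driver supports this optional property.
    parser.setProperty(property_lexical_handler, handler)
    parser.parse(io.BytesIO(fragment.encode("utf-8")))
    return {flag for flag, _ in handler.pieces}


cdata_form = "<my-tag><![CDATA[Some <tags> &amp; &entities; inside XML]]></my-tag>"
escaped_form = "<my-tag>Some &lt;tags> &amp;amp; &amp;entities; inside XML</my-tag>"

print(cdata_flags(cdata_form))
print(cdata_flags(escaped_form))
```

This only recovers where CDATA sections began and ended; the other
normalizations still apply, so it is not a route to round-tripping the
file.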

Does that help?

  -- Ken