[XML-SIG] Thought it was a bug, maybe XML is weirder than I thought

Clarence Gardner clarence@netlojix.com
Sun, 1 Oct 2000 13:52:42 -0700


Context: I'm using PyXML-0.5.5.1, and interestingly, I never compiled
any of the C code; I just use the .py files and it seems to work fine.

So I'm storing some arbitrary textual data under an arbitrarily-named
element node.  My test code created the xml and dumped it to a file,
where it looked like this:
  <username><![CDATA[fred]]></username>
This document is read and updated, and I noticed that each time I
added a new username (i.e., read the xml source, inserted a new username
node via DOM, and wrote back to the file), the previous ones changed from
CDATA to TEXT.  This seemed like a bug to me.  I thought I would see what
would happen if I added a username of "<markup test>".  The first time,
it appeared in the file as
  <username><![CDATA[<markup test>]]></username>
as expected, but after one more addition it had become
  <username>&lt;markup test&gt;</username>
But now I see that, if I read that document, the username has not
one TEXT child, but three ('<', 'markup test', and '>').
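
For comparison, here is a minimal sketch of the two forms being read
back.  It assumes the standard library's xml.dom.minidom rather than
PyXML-0.5.5.1, so the way the text is split into nodes may well differ
from what is described above; text_of is a hypothetical helper, not
part of either library.  It only illustrates that the CDATA form and
the escaped form carry the same characters.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def text_of(element):
    """Join the data of every direct TEXT and CDATA child of element."""
    wanted = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    return "".join(c.data for c in element.childNodes if c.nodeType in wanted)


# The same username, once as a CDATA section and once as escaped text.
cdata_form = parseString("<username><![CDATA[<markup test>]]></username>")
escaped_form = parseString("<username>&lt;markup test&gt;</username>")

cdata_text = text_of(cdata_form.documentElement)
escaped_text = text_of(escaped_form.documentElement)

# Both spellings should yield the same character data.
print(cdata_text == escaped_text)
```

Collecting the children this way does not depend on how many nodes the
parser happened to produce.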

Does all this seem right to people?  That last point implies, of course,
that in order to get what I expect to be the text value of a node, I
actually have to get all of its text children and concatenate their
values.  That would seem to be a problem if (I haven't tried this) I had
originally stored two separate text children under the username node,
because the concatenation would merge them into one.
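
A rough sketch of that concatenation, again assuming the standard
library's xml.dom.minidom instead of PyXML (text_value is a
hypothetical helper).  It builds three adjacent text children by hand,
joins them, and then calls the DOM normalize() method, which merges
adjacent text nodes in place.  The serialized form records no boundary
between adjacent text nodes, so a document written out and read back
cannot keep them apart.

```python
from xml.dom import Node
from xml.dom.minidom import Document, parseString


def text_value(element):
    """Concatenate the data of all direct TEXT and CDATA children."""
    wanted = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    return "".join(c.data for c in element.childNodes if c.nodeType in wanted)


doc = Document()
user = doc.createElement("username")
doc.appendChild(user)

# Deliberately store the value as three adjacent text children.
for piece in ("<", "markup test", ">"):
    user.appendChild(doc.createTextNode(piece))

children_before = len(user.childNodes)
value = text_value(user)

# The node boundaries are not represented in the serialized markup.
serialized = user.toxml()

# normalize() merges adjacent text nodes into a single node.
user.normalize()
children_after = len(user.childNodes)

# Reading the serialized form back still gives the same text value.
reread_value = text_value(parseString(serialized).documentElement)
```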


-- 
Clarence Gardner
Software Engineer
NetLojix Communications
clarence@netlojix.com