[xml-sig] Inconsistency in handling of newline characters

bkline at rksystems.com bkline at rksystems.com
Fri Feb 25 17:16:54 CET 2011


Does it surprise anyone else that both minidom and lxml (possibly using the
same parser under the covers) don't treat "\r\n" and "&#13;&#10;" as
equivalent?  I would have expected, based on
http://www.w3.org/TR/REC-xml/#sec-line-ends, to get back "\n" as the value
of the text node in both cases.  That's not what happens, however.  If the
document the parser takes in contains the literal characters "\r\n", what
comes out is "\n" (as I expected), but if it contains the character
references "&#13;&#10;", what comes out is "\r\n"!  Seems like either a flaw
in the parser(s) or in the spec.  If the spec (which I don't claim to have
fully understood) really says these two representations of the same
two-character sequence should be treated differently, I haven't been able to
find any rationale for why the line-ending normalization wouldn't operate on
the characters represented by either serialization.  Any clues?
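
For anyone who wants to reproduce this, here is a minimal sketch using only
the standard-library minidom.  The element name "a" and the helper text_of
are made up for illustration, not part of any API, and I haven't included
the lxml side here.

```python
from xml.dom import minidom


def text_of(xml_source):
    """Parse a tiny document and return the text content of its root element.

    Hypothetical helper for this illustration only. Child nodes are joined
    in case the parser delivers the text in more than one node.
    """
    doc = minidom.parseString(xml_source)
    return "".join(child.data for child in doc.documentElement.childNodes)


# Case 1: the document contains a literal carriage return + line feed.
literal = text_of(b"<a>\r\n</a>")

# Case 2: the same two characters written as numeric character references.
escaped = text_of(b"<a>&#13;&#10;</a>")

print(repr(literal))  # '\n'
print(repr(escaped))  # '\r\n'
```

The two print calls show the difference described above: the literal form
comes back as a single "\n", while the character-reference form comes back
as "\r\n".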