xml.dom.minidom: how to preserve CRLF's inside CDATA?

sim.sim Maksim.Kasimov at gmail.com
Tue May 22 09:45:05 EDT 2007


Hi all.
I'm having trouble using minidom:

# I have an XML string with a CDATA section, and the section contains "\r\n":
iInStr = '<?xml version="1.0"?>\n<Data><![CDATA[BEGIN:VCALENDAR\r\nEND:VCALENDAR\r\n]]></Data>\n'


# After I create the DOM object, the value of "Data" comes back with each "\r\n" turned into "\n":

from xml.dom import minidom
iDoc = minidom.parseString(iInStr)
iDoc.childNodes[0].childNodes[0].data  # gives u'BEGIN:VCALENDAR\nEND:VCALENDAR\n'


According to http://www.w3.org/TR/REC-xml/#sec-line-ends this looks
normal, but another part of the specification says that "only the
CDEnd string is recognized as markup": http://www.w3.org/TR/REC-xml/#sec-cdata-sect

So the parser should (IMHO) return the value of the CDATA section "as is"
(the two parts of the specification do not contradict each other).
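
For reference, the line-ends section of the specification describes
normalization as happening on input, before parsing, so it applies before
the parser knows whether it is inside a CDATA section. A minimal sketch
(Python 3, not part of the original post; element_text is a hypothetical
helper name) that compares literal CRLFs inside CDATA with literal CRLFs
in ordinary character data:

```python
from xml.dom import minidom


def element_text(xml_string):
    """Return the concatenated character data of the document element."""
    doc = minidom.parseString(xml_string)
    return "".join(node.data for node in doc.documentElement.childNodes)


# Literal CRLF inside a CDATA section.
in_cdata = element_text('<Data><![CDATA[a\r\nb\r\n]]></Data>')

# Literal CRLF in ordinary character data.
in_text = element_text('<Data>a\r\nb\r\n</Data>')

# Both come back with "\r\n" collapsed to "\n".
print(repr(in_cdata))
print(repr(in_text))
```

Both values print as 'a\nb\n', which suggests that CDATA only switches off
markup recognition and does not exempt its content from line-end
normalization.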


How can I get the value of a CDATA section with all characters inside it
preserved? (Perhaps by using another parser? Which one?)
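
Two possible workarounds, sketched in Python 3 under stated assumptions
(neither comes from the original post, and restore_crlf and element_text
are hypothetical helper names). The first restores CRLF after parsing and
assumes every line break in the payload was originally CRLF, as the
iCalendar format specifies. The second applies only if you control the
producer of the XML: it writes each carriage return as the character
reference &#13; in ordinary character data, because character references
are not subject to line-end normalization. Note that character references
are not recognized inside CDATA, so the second option drops the CDATA
section.

```python
import re
from xml.dom import minidom
from xml.sax.saxutils import escape

payload = "BEGIN:VCALENDAR\r\nEND:VCALENDAR\r\n"


def element_text(xml_string):
    """Return the concatenated character data of the document element."""
    doc = minidom.parseString(xml_string)
    return "".join(node.data for node in doc.documentElement.childNodes)


def restore_crlf(text):
    """Turn every line break back into CRLF.

    Assumption: the original payload used CRLF for every line break.
    """
    return re.sub(r"\r?\n", "\r\n", text)


# Option 1 (consumer side): parse as usual, then restore CRLF.
received = "<Data><![CDATA[" + payload + "]]></Data>"
restored = restore_crlf(element_text(received))

# Option 2 (producer side): escape markup characters and write each "\r"
# as a character reference instead of wrapping the payload in CDATA.
escaped = escape(payload, {"\r": "&#13;"})
round_tripped = element_text("<Data>" + escaped + "</Data>")

print(repr(restored))
print(repr(round_tripped))
```

Option 1 is lossy if the payload can mix bare LF and CRLF line breaks;
option 2 keeps the exact characters but requires changing whatever
generates the document.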


Many thanks for any help.



