xml.dom.minidom.parse() splitting text nodes?

hawkeye.parker at autodesk.com
Thu Jan 16 18:23:45 EST 2003


i'm running into an odd issue parsing large xml files with xml.dom.minidom.parse().  it appears that minidom is arbitrarily splitting some TEXT_NODEs into pieces.  for example, the file in question contains a number of these elements:
 
<C:Footer>This space provided for legal clarification of contract issues as defined by the project participants prior to project initiation. The content herein is determined withing the General Tab of the Log Properties dialogue box</C:Footer>
 
the parser correctly parses the C:Footer tag into a dom element, but for some reason it *occasionally* splits the element's text content into *two* child text nodes instead of one.  i can find no rhyme or reason to the splitting, though it is consistent for a given file; i.e., it always splits the same nodes in the same place.
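
to make the symptom concrete, here is a minimal sketch of what the split looks like from the dom side and how it can be collapsed.  it does not reproduce the parser behaviour itself: the two adjacent text nodes are created by hand, and the element name (a plain Footer without the C: prefix), the shortened text, and the element_text() helper are all assumptions made for illustration.  the only library calls used are createTextNode(), appendChild() and Node.normalize(), the standard dom method for merging adjacent text nodes.

```python
from xml.dom import minidom


def element_text(element):
    """Join the data of every direct text/CDATA child of `element`.

    Hypothetical helper: it returns the same string whether the text is
    held in one child node or spread over several adjacent ones.
    """
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE)
    )


# Build a tiny document and simulate the split by appending two text
# nodes by hand (assumption: this stands in for what the parser produced).
doc = minidom.parseString("<Footer/>")
footer = doc.documentElement
footer.appendChild(doc.createTextNode("This space provided for legal "))
footer.appendChild(doc.createTextNode("clarification of contract issues"))

split_count = len(footer.childNodes)    # two adjacent text nodes
text_before = element_text(footer)      # the helper already sees one string

# normalize() merges adjacent text nodes in this subtree, in place.
footer.normalize()

merged_count = len(footer.childNodes)   # a single text node remains
text_after = footer.firstChild.data
```

if the split really is just adjacent text nodes, then either calling normalize() on the document after parsing or reading text through a helper like the one above should make code that expects a single child node behave the same regardless of where the split falls.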
 
has anyone else run across this issue?  can you explain it?