[issue9521] xml.etree.ElementTree strips XML declaration and processing instructions

Nikolaus Rath report at bugs.python.org
Sun Jan 19 05:52:22 CET 2014


Nikolaus Rath added the comment:

I can confirm this. The actual problem is that neither XML nor SGML PIs in the input make it into the etree, and no events are generated for them during incremental parsing.
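
A minimal sketch of the symptom, assuming the default parser settings; the input string and variable names are made up for illustration and are not taken from the attached testcase:

```python
import xml.etree.ElementTree as ET

# Illustrative input: one PI in the prolog, one inside the root element.
SOURCE = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet href="style.css" type="text/css"?>'
    '<root><?target data?><child/></root>'
)

root = ET.fromstring(SOURCE)

# Collect everything that ended up in the tree.
tags = [elem.tag for elem in root.iter()]

# PI nodes would carry ET.ProcessingInstruction as their tag.
has_pi = any(elem.tag is ET.ProcessingInstruction for elem in root.iter())

print(tags, has_pi)
```

With the default settings I would expect only the two real elements to show up, and no PI node anywhere in the tree.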

XML PIs that are added to the tree using Python functions are written out correctly. SGML PIs currently cannot be represented at all (there is no ElementTree.SGMLProcessingInstruction analogous to ElementTree.ProcessingInstruction).
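
For comparison, a short sketch of the case that does work, i.e. a PI created from Python and then serialized. The target and data values are placeholders:

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")

# ProcessingInstruction(target, text) creates a PI node that can be
# appended to a tree like any other element.
root.append(ET.ProcessingInstruction("target", "data"))
ET.SubElement(root, "child")

# Serialize to a str so the result is easy to inspect.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

The PI shows up in the serialized output as `<?target data?>`, so the writing side is fine; only the parsing side loses them.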

There is special-cased support for the DOCTYPE declaration in the TreeBuilder class that allows retrieving the doctype when not parsing incrementally, but it has to be retrieved manually and written out manually.
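
A rough sketch of what "manually" means here, assuming that the parser calls a doctype(name, pubid, system) method on its target when one is defined. The subclass name, the doctype_info attribute and the sample input are hypothetical, and the details of this hook may differ between Python versions:

```python
import xml.etree.ElementTree as ET


class DoctypeRecordingBuilder(ET.TreeBuilder):
    """Hypothetical TreeBuilder subclass that remembers the DOCTYPE."""

    def __init__(self):
        super().__init__()
        self.doctype_info = None

    def doctype(self, name, pubid, system):
        # Assumed to be called by the parser when it sees the declaration.
        self.doctype_info = (name, pubid, system)


SOURCE = '<!DOCTYPE root SYSTEM "root.dtd"><root><child/></root>'

builder = DoctypeRecordingBuilder()
parser = ET.XMLParser(target=builder)
parser.feed(SOURCE)
root = parser.close()

# The doctype is not part of the tree, so it has to be re-emitted by hand.
# This only handles the SYSTEM-identifier form used in SOURCE.
name, pubid, system = builder.doctype_info
doctype_line = '<!DOCTYPE %s SYSTEM "%s">' % (name, system)
output = doctype_line + ET.tostring(root, encoding="unicode")
print(output)
```

Nothing comparable exists for PIs in the input, which is why they are lost entirely.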


I have attached a test case for XML PIs. For proper SGML PI handling, ElementTree first needs to learn about them.

Recommended stage for this issue: needs patch

----------
keywords: +patch
nosy: +Nikratio
versions: +Python 3.3, Python 3.4 -Python 3.1
Added file: http://bugs.python.org/file33538/testcase.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9521>
_______________________________________