[New-bugs-announce] [issue34600] python3 regression ElementTree.iterparse() unable to capture comments

Martin Hosken report at bugs.python.org
Fri Sep 7 02:10:16 EDT 2018


New submission from Martin Hosken <martin_hosken at sil.org>:

This is a regression from Python 2, caused by Python 3 forcing the use of the C implementation (cElementTree).

I have code that uses iterparse to process an XML file. I also want to process comments, so I have a comment-handling function that the parser calls during iterparse. In Python 2 I do this as follows, but under Python 3 I can find no way to achieve the same thing:

```
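import xml.etree.ElementTree as et

# Python 2 only: hook comments on the underlying expat parser.
# myCommentHandler and fh (an open file) are defined elsewhere.
parser = et.XMLParser(target=et.TreeBuilder())
parser.parser.CommentHandler = myCommentHandler
for event, elem in et.iterparse(fh, parser=parser):
    ...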
```

This is somewhat ugly but works in Python 2. In Python 3 I can find no way to set a comment handler on the parser.


1. As far as I can tell, there is no way to get to the pure-Python xml.etree.ElementTree.XMLParser, since the C implementation completely masks the Python versions.
2. It is possible to create a subclass of TreeBuilder that adds a comment method, but the C version of XMLParser requires that its TreeBuilder not be a subclass when used in iterparse.
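
To illustrate point 2, here is a minimal sketch of a TreeBuilder subclass with a comment method, driven through XMLParser directly rather than through iterparse. The names CommentCollector and parse_with_comments are hypothetical and chosen only for this example. It assumes the parser looks up a comment method on its target, which is the behaviour point 2 relies on. It does not address the iterparse case described above, since the whole input is fed to the parser and no events are produced.

```python
import xml.etree.ElementTree as ET


class CommentCollector(ET.TreeBuilder):
    """Hypothetical TreeBuilder subclass that records comment text."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def comment(self, text):
        # Invoked by the parser for each <!-- ... --> it encounters.
        self.comments.append(text)


def parse_with_comments(xml_text):
    """Parse xml_text and return (root element, list of comment strings)."""
    target = CommentCollector()
    parser = ET.XMLParser(target=target)
    parser.feed(xml_text)
    root = parser.close()  # returns whatever the target's close() returns
    return root, target.comments


root, comments = parse_with_comments("<root><!-- note --><child/></root>")
print(root.tag, comments)
```

Because the comments are gathered in a side list, they are not tied to a position in the tree; code that needs to know where each comment appeared would have to track that in the start and end callbacks as well.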

The only solution I found was to copy the XMLParser code out of ElementTree into a private module and use that pure-Python implementation.

Suggested solutions:
1. Allow access to all the Python implementations in ElementTree, not just Element.
2. Allow a comment handler method to be passed to the XMLParser on creation.

Thank you.

----------
components: XML
messages: 324719
nosy: Martin Hosken
priority: normal
severity: normal
status: open
title: python3 regression ElementTree.iterparse() unable to capture comments
type: behavior
versions: Python 3.6

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue34600>
_______________________________________

