[issue17741] event-driven XML parser

Eli Bendersky report at bugs.python.org
Mon Aug 26 05:33:57 CEST 2013


Eli Bendersky added the comment:

I had a chat with Nick about this, and a variation of the proposal in http://bugs.python.org/msg196133 seems acceptable to us both. To summarize, here's what goes into 3.4.

* The class will be named EventParser.
Rationale: The Incremental in IncrementalParser is ambiguous, because it's not clear which side is incremental - input or output. In addition, XMLParser is already "incremental" in the sense that it has a feed() method. EventParser is (although not a perfect name) clearer w.r.t what it does - produces events from the parsing. Another (milder) reason is to avoid confusing duplication with xml.sax.xmlreader.IncrementalParser, which is somewhat different. Yet another is that if we eventually figure out how to implement this in a parser target, a great name for that target would be EventBuilder, which is a logical building block for EventParser.

* It will *not* accept a "parser" argument in the constructor.
Rationale: the parser argument of iterparse is broken anyway. This will make it much easier to modify the implementation of EventParser in the future when the C internals are fixed w.r.t problems mentioned in this issue.

* iterparse's "parser" argument will be deprecated, and the documentation will be more detailed w.r.t to the limitations on its current "parser" argument (the limitations are there in the code, but they're not fully documented).

* EventParser's input-side methods will be named feed and close, to be more consistent with existing XML APIs (both XMLParser and xml.sax.xmlreader.IncrementalParser).

* EventParser's output-side method will be called "read_events".
Rationale: "read_" helps convey that the consumption of events is destructive. A possible future addition of a complementary, non-destructive API - "peek_events" will be considered.

* Documentation will be modified to clarify exactly how EventParser differs from XMLParser and iterparse.

Also, new issue(s) will be created to address fixing specific implementation problems.

----

Nick, feel free to add/correct if I misunderstood or forgot something.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17741>
_______________________________________


More information about the Python-bugs-list mailing list