[Python-Dev] Fixing the XML batteries

Baptiste Carvello devel at baptiste-carvello.net
Fri Dec 16 17:40:02 CET 2011


On 16/12/2011 07:53, Stefan Behnel wrote:

> Additionally, the documentation on the xml.sax page would benefit from
> the following paragraph:
> 
> """
> [[Note: The xml.sax package provides an implementation of the SAX
> interface whose API is similar to that in other programming languages.
> Users who are unfamiliar with the SAX interface or who would like to
> write less code for efficient stream processing of XML files should
> consider using the iterparse() function in the xml.etree.ElementTree
> module instead.]]
> """
> 

A small caveat about iterparse(), which I otherwise like a lot: when
processing very big data (I encountered this with a region-wide
OpenStreetMap XML dump), you have to remove the processed nodes from the
root element. Otherwise the tree keeps every element parsed so far, and
its memory footprint grows with the size of the document.
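
For illustration, here is a minimal sketch of one common way to do that
cleanup. It is not the only approach, and the names iter_nodes, "osm" and
"node" as well as the sample data are assumptions made up for the example.
The idea is to also request "start" events so that the first event hands
us the root element, then clear the root each time a node has been fully
processed.

```python
import io
import xml.etree.ElementTree as ET


def iter_nodes(source, tag="node"):
    """Yield the attributes of each <tag> element, keeping memory bounded.

    `source` is a filename or a binary file object; `tag` is a
    hypothetical element name chosen for this example.
    """
    # Ask for "start" events too, so the very first event is the root.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)

    for event, elem in context:
        if event == "end" and elem.tag == tag:
            # Copy what we need before the element is discarded.
            yield dict(elem.attrib)
            # Drop everything attached to the root so far. Note that this
            # also resets the root's own attributes and text.
            root.clear()


# Tiny stand-in for a large dump (made-up data).
sample = io.BytesIO(
    b"<osm>"
    b"<node id='1' lat='48.1' lon='7.5'/>"
    b"<node id='2' lat='48.2' lon='7.6'><tag k='name' v='x'/></node>"
    b"<way id='3'/>"
    b"</osm>"
)

for attrs in iter_nodes(sample):
    print(attrs["id"])
```

Clearing the root rather than only the processed element matters here:
elem.clear() alone would empty each node but leave the emptied elements
attached to the root, so the list of children would still grow.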


