10GB XML Blows out Memory, Suggestions?
Fredrik Lundh
fredrik at pythonware.com
Tue Jun 6 14:37:32 EDT 2006
K.S.Sreeram wrote:
> There's just NO WAY that the 10gb xml file can be loaded into memory as
> a tree on any normal machine, irrespective of whether we use C or
> Python. So the *only* way is to perform some kind of 'stream' processing
> on the file. Perhaps using a SAX like API. So (c)ElementTree is ruled
> out for this.
both ElementTree and cElementTree support "sax-style" event generation
(through XMLTreeBuilder/XMLParser) and incremental parsing (through
iterparse). the cElementTree versions of these are even faster than
pyexpat.
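
as a rough illustration of the "sax-style" route, here is a minimal sketch that feeds a file to a parser in chunks and hands events to a target object, so no tree is ever built. it assumes the present-day standard library module xml.etree.ElementTree (not the separate cElementTree package discussed in this thread); TagCounter, count_tags and chunk_size are hypothetical names made up for the example.

```python
import io
from xml.etree.ElementTree import XMLParser


class TagCounter:
    """Hypothetical parser target: receives events, never builds a tree."""

    def __init__(self):
        self.counts = {}

    def start(self, tag, attrib):
        # Called for every opening tag.
        self.counts[tag] = self.counts.get(tag, 0) + 1

    def end(self, tag):
        # Called for every closing tag; nothing to do here.
        pass

    def data(self, text):
        # Called for character data; ignored in this sketch.
        pass

    def close(self):
        # Whatever this returns is what parser.close() returns.
        return self.counts


def count_tags(stream, chunk_size=64 * 1024):
    """Feed the stream to the parser chunk by chunk and return tag counts."""
    parser = XMLParser(target=TagCounter())
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
    return parser.close()


if __name__ == "__main__":
    sample = io.BytesIO(b"<root><item/><item/><other/></root>")
    print(count_tags(sample))
```

memory use here depends on the chunk size and on what the target keeps, not on the size of the file.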
the iterparse interface is described here:
http://effbot.org/zone/element-iterparse.htm
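
a minimal sketch of incremental parsing with iterparse, under the same assumption (the standard library xml.etree.ElementTree module). it also assumes that the records of interest are direct children of the root element; iter_records and the tag names in the usage part are hypothetical.

```python
import io
from xml.etree.ElementTree import iterparse


def iter_records(source, tag):
    """Yield each completed element named `tag`, then discard it.

    Assumes the matching elements are direct children of the root.
    """
    context = iterparse(source, events=("start", "end"))
    # The first event is the start of the root element; keep a handle on it.
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem
            # Runs once the caller asks for the next record: drop the
            # processed children so the partial tree does not keep growing.
            elem.clear()
            root.clear()


if __name__ == "__main__":
    sample = io.BytesIO(
        b"<log><entry id='1'>a</entry><entry id='2'>b</entry></log>"
    )
    for entry in iter_records(sample, "entry"):
        print(entry.get("id"), entry.text)
```

each element is complete when it is yielded, so the caller can use the normal element API on it. only the records not yet cleared stay in memory, which is what makes this workable for files much larger than RAM.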
</F>