10GB XML Blows out Memory, Suggestions?
Fredrik Lundh
fredrik at pythonware.com
Tue Jun 6 15:52:44 EDT 2006
gregarican wrote:
> 10 gigs? Wow, even using SAX I would imagine that you would be pushing
> the limits of reasonable performance.
depends on how you define "reasonable", of course. modern computers are
quite fast:
> dir data.xml
2006-06-06 21:35 1 002 000 015 data.xml
1 File(s) 1 002 000 015 bytes
> more test.py
from xml.etree import cElementTree as ET
import time
t0 = time.time()
for event, elem in ET.iterparse("data.xml"):
    if elem.tag == "item":
        elem.clear()
print time.time() - t0
gives me timings between 27.1 and 49.1 seconds over 5 runs.
(Intel Dual Core T2300, slow laptop disks, 1000000 XML "item" elements
averaging 1000 byte each, bundled cElementTree, peak memory usage 33 MB.
your mileage may vary.)
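
for reference, a self-contained sketch of the same iterparse-and-clear
pattern, runnable without a data file (shown with plain ElementTree and
Python 3 syntax; the <item> element name is just the one used above).
feeding iterparse a file-like object and clearing each element as its
end tag arrives is what keeps peak memory flat regardless of file size:

```python
import io
from xml.etree import ElementTree as ET

# small in-memory stand-in for the multi-gigabyte file discussed above
xml_data = b"<root>" + b"<item>payload</item>" * 1000 + b"</root>"

count = 0
for event, elem in ET.iterparse(io.BytesIO(xml_data)):
    if elem.tag == "item":
        count += 1
        elem.clear()  # discard the element's text/children to bound memory

print(count)  # 1000
```

without the clear() call, every parsed element stays attached to the
growing tree, and memory use scales with document size instead of
element size.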
</F>