xml.dom.minidom memory usage

Dan thermostat at gmail.com
Thu Feb 1 14:51:10 EST 2007


I'm using Python's xml.dom.minidom module to generate XML files, and
I'm running into memory problems. The XML files I'm trying to create
are relatively flat: one root node, which may have millions of
direct child nodes.

Here's an example script:
#!/usr/bin/env python


import xml.dom.minidom


def gen_xml(n):
    # Build the root element by hand and attach it to a new document.
    doc  = xml.dom.minidom.Document()
    root = xml.dom.minidom.Element("foo")
    root.ownerDocument = doc
    root.setAttribute("one", "1")
    doc.appendChild(root)
    # Append n flat children; every node stays referenced by the tree.
    for x in xrange(n):
        elem = xml.dom.minidom.Element("bar")
        elem.ownerDocument = doc
        elem.setAttribute("attr1", "12345678")
        elem.setAttribute("attr2", "87654321")
        root.appendChild(elem)
    return doc


If I run gen_xml(1000000), my Python process ends up using 90% of
my memory, and the system starts thrashing (Linux, P4, 1 GB RAM,
Python 2.4.3).

So, my questions are: (1) Am I doing something dumb in the script that
stops Python from collecting temporary garbage? (2) If not, is there
another reasonable module for generating XML (as opposed to parsing
it), or should I just implement my own XML generation solution?

Thanks,
-Dan
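
For question (2), one possibility is to skip the DOM tree and write each
element out as it is produced. The following is only a minimal sketch,
not a tested replacement for the script above: it assumes the standard
library's xml.sax.saxutils.XMLGenerator is acceptable, that the output
can go straight to a file-like object, and it is written for a current
Python 3 (range instead of xrange). The function name write_xml is
made up for illustration.

```python
import io
from xml.sax.saxutils import XMLGenerator


def write_xml(out, n):
    """Stream a <foo> root with n <bar> children to a text stream.

    Hypothetical helper: nothing is kept in memory beyond the element
    currently being written, so memory use should not grow with n.
    """
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("foo", {"one": "1"})
    for _ in range(n):
        # Each child is written immediately and then forgotten.
        gen.startElement("bar", {"attr1": "12345678", "attr2": "87654321"})
        gen.endElement("bar")
    gen.endElement("foo")
    gen.endDocument()


# Small demonstration into an in-memory buffer; for a real run, pass an
# open file instead so the output is not held in memory either.
buf = io.StringIO()
write_xml(buf, 3)
```

The trade-off is that there is no tree to inspect or modify afterwards,
which seems acceptable when the goal is only to generate a flat file.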



