What means exactly "Memory error"?

Fredrik Lundh fredrik at pythonware.com
Fri Apr 25 13:51:41 EDT 2003


Albert Hofkamp wrote:

> Do you need all data DOMmed at once?
> You may be able to have one DOM tree at a time, dropping and reloading
> every time you switch files.

an alternative is to use an incremental tree builder, and process the
subtrees as they arrive.

here's an example, using my elementtree module:

    import sys

    from elementtree import ElementTree

    class MyBuilder(ElementTree.TreeBuilder):

        def end(self, tag):
            # let the standard builder close the element first
            elem = ElementTree.TreeBuilder.end(self, tag)
            if elem.tag == "SCENE":
                # process(elem) in some way, and write it out
                # ElementTree.ElementTree(elem).write(sys.stdout)
                elem.clear() # nuke it
            return elem

    parser = ElementTree.XMLTreeBuilder()
    parser._target = MyBuilder() # plug in a custom builder!

    # filename is the path of the XML file to process
    tree = ElementTree.parse(filename, parser)
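
for readers on a current Python, where this module lives in the standard
library as xml.etree.ElementTree, roughly the same idea can be sketched with
iterparse instead of a custom builder. this is a minimal sketch, not the
code used for the timings in this post; process_scenes, handle, and the tiny
sample document are made up for illustration:

```python
import io
import xml.etree.ElementTree as ET

def process_scenes(source, handle):
    """Stream source, calling handle(elem) for each completed SCENE."""
    count = 0
    # "end" events fire once an element and all its children are parsed
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "SCENE":
            handle(elem)
            elem.clear()  # drop the subtree's children, text, and attributes
            count += 1
    return count

# hypothetical stand-in for the real file
sample = io.BytesIO(
    b"<PLAYS><SCENE><TITLE>one</TITLE></SCENE>"
    b"<SCENE><TITLE>two</TITLE></SCENE></PLAYS>"
)
titles = []
n = process_scenes(sample, lambda scene: titles.append(scene.findtext("TITLE")))
```

note that the cleared SCENE elements stay in the tree as empty shells, so
memory still grows a little with the number of scenes.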

I've tested this with a 10 megabyte XML file created by concatenating
Jon Bosak's Hamlet XML file over and over again, and wrapping it all in a
single document element.
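
a test file like that could be built along these lines. this is a rough
sketch under assumptions: wrap_copies and the PLAYS wrapper name are
hypothetical, and the prolog stripping simply cuts everything before the
first start tag, which is enough for a plain XML declaration plus DOCTYPE
but not for arbitrary documents:

```python
import re
import xml.etree.ElementTree as ET

def wrap_copies(document, copies, wrapper="PLAYS"):
    """Repeat one XML document's root element inside a single wrapper."""
    # skip the XML declaration and DOCTYPE: find the first real start tag
    match = re.search(r"<[A-Za-z_]", document)
    body = document[match.start():] if match else document
    repeated = (body.rstrip() + "\n") * copies
    return "<%s>\n%s</%s>\n" % (wrapper, repeated, wrapper)

# hypothetical miniature stand-in for the Hamlet file
sample = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE PLAY SYSTEM "play.dtd">\n'
    "<PLAY><SCENE/></PLAY>\n"
)
big = wrap_copies(sample, 3)
root = ET.fromstring(big)
```

the prolog has to go because an XML declaration or DOCTYPE is only allowed
at the very start of a document, not inside the wrapper element.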

the resulting file contains 720 scenes (about 15k each, on average).

the above script requires about 4.5 megabytes of memory to run to
completion, and about 2 minutes of processing time (on a really slow
machine).

if I comment out the elem.clear() call, the script requires about 75
megabytes and takes about 15 minutes (13 of which were spent swapping; I
ran the test on a machine with 96 megabytes RAM and really slow disks... ;-)

for more information on element trees, see:

    http://effbot.org/zone/element-index.htm
    http://www.xml.com/pub/a/2003/02/12/py-xml.html

</F>







