10GB XML Blows out Memory, Suggestions?

Mathias Waack M.Waack at gmx.de
Tue Jun 6 08:03:58 EDT 2006


axwack at gmail.com wrote:

> I wrote a program that takes an XML file into memory using Minidom. I
> found out that the XML document is 10gb.
> 
> I clearly need SAX or something else?

More memory ;)
Maybe you should have a look at pulldom (xml.dom.pulldom in the standard
library), a combination of SAX and DOM: it reads your document in a
SAX-like manner and expands only selected sub-trees into DOM nodes.
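
A minimal sketch of that approach, assuming the document is a long
sequence of repeated elements. The names "record", "log", and
iter_records are made up for illustration; substitute whatever your
file actually contains, and pass a file name instead of the in-memory
sample.

```python
import io
from xml.dom import pulldom

def iter_records(source, tag):
    """Yield each <tag> element as a small DOM sub-tree, one at a time.

    `source` may be a file name or a binary file-like object.
    `tag` is a hypothetical element name, not taken from the real file.
    """
    events = pulldom.parse(source)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            # Build DOM nodes for this element only, not the whole document.
            events.expandNode(node)
            yield node

# Tiny stand-in for the real 10 GB file.
sample = io.BytesIO(
    b"<log>"
    b"<record id='1'><msg>first</msg></record>"
    b"<record id='2'><msg>second</msg></record>"
    b"</log>"
)

ids = []
msg_counts = []
for rec in iter_records(sample, "record"):
    # Inside the loop the usual minidom-style calls work on the sub-tree.
    ids.append(rec.getAttribute("id"))
    msg_counts.append(len(rec.getElementsByTagName("msg")))
```

Since each expanded node is an ordinary DOM element, existing Minidom
code that works on one record at a time can probably be reused inside
the loop mostly unchanged.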

> Any suggestions on what that something else is? Is it hard to convert
> the code from DOM to SAX?

Assuming a good design, of course not. Especially if you only need some
selected parts of the document, SAX should be your choice.
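
For comparison, a rough sketch of the plain SAX version, which pulls the
text of one kind of element out of the stream. The TextCollector class
and the "msg" tag are hypothetical and only there to show the shape of a
handler; it also assumes the elements of interest are not nested inside
each other.

```python
import io
import xml.sax

class TextCollector(xml.sax.ContentHandler):
    """Collect the text content of every element named `tag`."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag          # hypothetical element name
        self.in_tag = False
        self.chunks = []
        self.texts = []

    def startElement(self, name, attrs):
        if name == self.tag:
            self.in_tag = True
            self.chunks = []

    def characters(self, content):
        # The parser may deliver one text run in several pieces.
        if self.in_tag:
            self.chunks.append(content)

    def endElement(self, name):
        if name == self.tag:
            self.texts.append("".join(self.chunks))
            self.in_tag = False

# Tiny stand-in for the real file; a file name works here as well.
sample = io.BytesIO(
    b"<log>"
    b"<record id='1'><msg>first</msg></record>"
    b"<record id='2'><msg>second</msg></record>"
    b"</log>"
)

handler = TextCollector("msg")
xml.sax.parse(sample, handler)
texts = handler.texts
```

For a really big file you would process each value in endElement instead
of appending it to a list, so that memory use stays flat.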

Mathias


