Writing big XML files where beginning depends on end.

Neil Benn benn at cenix-bioscience.com
Thu Nov 24 08:09:48 EST 2005


Magnus Lycka wrote:

><snip>
>
>In some cases, building up a DOM tree in memory takes up
>several GB of RAM, which is a real showstopper. The actual
>file is maybe a magnitude smaller than the DOM tree. The
>app is using libxml2. It's actually written in C++. Some
>library that used much less memory overhead could be
>sufficient.
>
><snip>
>  
>
Hello,

Regardless of the wisdom of having an XML file that big or structured
in that way (you stated that you are forced to use this), if you need
to run through such a large amount of data then use SAX rather than
DOM.  SAX is stream-based, so it never holds the whole document in
memory and can comfortably scale up to large amounts of data such as
yours.  You'll have to keep hold of any state between the start and the
end of the file manually (storing only the state that matters).  To be
honest, I'd take a step back and look at using a different data
representation.
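
As a rough sketch of what keeping the state manually could look like,
here is a small example using Python's standard xml.sax module.  The
element name "value", the RecordCounter class and the summarize()
helper are made up for illustration; your app is C++ on libxml2, which
has its own SAX interface, so treat this as the idea rather than the
code you'd actually ship.

```python
import io
import xml.sax


class RecordCounter(xml.sax.ContentHandler):
    """Hypothetical handler: keeps a running count and total only,
    never the whole tree."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.total = 0
        self._in_value = False
        self._chunks = []

    def startElement(self, name, attrs):
        # "value" is an assumed element name, not from the real file.
        if name == "value":
            self._in_value = True
            self._chunks = []

    def characters(self, content):
        # Text may arrive in several pieces, so collect them.
        if self._in_value:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "value":
            self.total += int("".join(self._chunks))
            self.count += 1
            self._in_value = False


def summarize(source):
    """Stream through `source` (a filename or file object)."""
    handler = RecordCounter()
    xml.sax.parse(source, handler)
    return handler.count, handler.total


sample = io.BytesIO(b"<data><value>3</value><value>4</value></data>")
print(summarize(sample))  # (2, 7)
```

Memory use here depends on the state you choose to keep (two integers
and one small buffer), not on the size of the file.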

Cheers,

Neil

-- 

Neil Benn
Senior Automation Engineer
Cenix BioScience
BioInnovations Zentrum
Tatzberg 46
D-01307
Dresden
Germany

Tel : +49 (0)351 4173 154
e-mail : benn at cenix-bioscience.com
Cenix Website : http://www.cenix-bioscience.com



