[XML-SIG] fast dump/restore of an XML document?

Anthony Baxter <anthony@interlink.com.au>
Tue, 16 May 2000 20:19:54 +1000


>>> Anthony Baxter wrote
> >>> Greg Stein wrote
> > xml.utils.qp_xml parses XML into a very lightweight structure. Those
> > should be easily pickle-able. With a little bit of work, I bet they could
> > be marshalled.
> > 
> > It definitely isn't a solution out of the box, but (IMO) it is a great
> > head start on a Pythonic structure that is easily marshalled/pickled.

Something like this works quite nicely:

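def qp_dump(obj, file):
    import cPickle, gzip
    # compresslevel 4: a bit less compression, a bit more speed
    f = gzip.open(file, 'wb', 4)
    cPickle.dump(obj, f)
    f.close()

def qp_load(file):
    import cPickle, gzip
    f = gzip.open(file, 'rb')
    obj = cPickle.load(f)
    f.close()
    return obj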

On my test SPARC, a 2.5M XML dump gets dumped out as a 990K gzipped pickle.

Using the pyexpat full DOM thing, it's 120s to build the DOM and 27s to
dump out the XML.

With qp_xml.Parser().parse(file) it reads it in 22s, and dumps a gzipped
pickle in 14s.  It dumps a normal pickle in 13s, but that file is an order
of magnitude larger.

Unfortunately the load time bites: 71s for the uncompressed pickle,
340s(!) for the gzipped one. Blah.

(3 runs each test, storing on local disk)
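
For anyone wanting to repeat this kind of comparison, here is a rough sketch
of a timing harness. It is written for a current Python, where the plain
pickle module stands in for cPickle. The helper names (dump_pickle,
load_pickle, time_roundtrip) are made up for illustration, and the binary
pickle protocol used here is an assumption: it is not the setting the numbers
above were measured with.

```python
import gzip
import os
import pickle
import tempfile
import time


def dump_pickle(obj, path, compress=True):
    """Pickle obj to path, optionally through gzip at compresslevel 4."""
    f = gzip.open(path, "wb", 4) if compress else open(path, "wb")
    with f:
        # Binary protocol (an assumption, see above); usually smaller and
        # quicker to load than the text format.
        pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL)


def load_pickle(path, compress=True):
    """Load and return the object pickled at path."""
    f = gzip.open(path, "rb") if compress else open(path, "rb")
    with f:
        return pickle.load(f)


def time_roundtrip(obj, compress=True):
    """Return (dump_seconds, load_seconds, size_bytes, restored_obj)."""
    fd, path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    try:
        t0 = time.perf_counter()
        dump_pickle(obj, path, compress)
        t1 = time.perf_counter()
        restored = load_pickle(path, compress)
        t2 = time.perf_counter()
        return t1 - t0, t2 - t1, os.path.getsize(path), restored
    finally:
        os.remove(path)
```

Calling time_roundtrip(tree, True) and time_roundtrip(tree, False) on the same
parsed tree gives the dump time, load time and file size side by side for the
gzipped and plain cases.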

(ps: as an aside: marshallable? how can you make an object marshallable?)
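
One possible answer, as a sketch only: marshal handles built-in types (None,
numbers, strings, tuples, lists, dicts) but not class instances, so the tree
has to be flattened into those types first and rebuilt after loading. The
Element class below is a hypothetical stand-in for a lightweight parsed
element, not the actual qp_xml class, and to_plain/from_plain are made-up
helper names.

```python
import marshal


class Element:
    """Hypothetical stand-in for a lightweight parsed-XML element."""

    def __init__(self, name, attrs=None, children=None, text=""):
        self.name = name
        self.attrs = attrs or {}
        self.children = children or []
        self.text = text


def to_plain(elem):
    """Flatten an Element tree into tuples, dicts, lists and strings."""
    return (elem.name, dict(elem.attrs), elem.text,
            [to_plain(child) for child in elem.children])


def from_plain(data):
    """Rebuild an Element tree from the structure made by to_plain()."""
    name, attrs, text, children = data
    return Element(name, attrs, [from_plain(c) for c in children], text)


def marshal_dumps(elem):
    """Serialize an Element tree to bytes via marshal."""
    return marshal.dumps(to_plain(elem))


def marshal_loads(blob):
    """Inverse of marshal_dumps()."""
    return from_plain(marshal.loads(blob))
```

The flatten/rebuild step costs time of its own, so whether this beats a pickle
would have to be measured on the real data.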

Anthony