pickle performance on larger objects

Jeremy Hylton jeremy at alum.mit.edu
Thu Jul 18 16:55:54 EDT 2002


Sam Penrose <spenrose at intersight.com> wrote in message news:<mailman.1026943747.15769.python-list at python.org>...
> memory usage increases by about 20%, FWIW. For my particular use case  
> cPickle is still several (many?) times slower than just recreating the 
> object by reading in a file. What implications this has for best 
> practices in persistence of larger objects I do not know, but I hope the 
> data point is of interest to others.

MWH's comment about marshal is worth keeping in mind.  cPickle is
doing a lot of work that marshal isn't doing.  Where marshal calls
fwrite() directly in most cases, cPickle wraps fwrite() in a C
function that brackets the call with Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS.  All those extra C function calls certainly
add up when you've got so many objects.
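
If you want to measure the difference yourself, a quick sketch along
these lines will do (Python 2 here, to match cPickle; the test data
is made up for illustration):

import time, marshal, cPickle

data = [{'key': i, 'value': str(i)} for i in range(10000)]

t = time.time()
marshal.dumps(data)
print 'marshal: %.3f seconds' % (time.time() - t)

t = time.time()
cPickle.dumps(data, 1)
print 'cPickle: %.3f seconds' % (time.time() - t)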

It's also checking for cycles in the containers, which means it has to
do a lot of extra bookkeeping for each dict it finds.  You can disable
that by setting the fast attribute on the pickler:

import cPickle
f = open('data.pkl', 'wb')   # Pickler needs a file-like object; the name is arbitrary
p = cPickle.Pickler(f)
p.fast = 1
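
One caveat: fast mode works by skipping the memo entirely, so it is
only safe when the data has no cycles (and no shared references you
need preserved).  Something self-referential will fail; depending on
the version you get an explicit error or runaway recursion.  A
minimal sketch:

import cPickle
from cStringIO import StringIO

p = cPickle.Pickler(StringIO())
p.fast = 1
l = []
l.append(l)   # self-referential list
p.dump(l)     # no memo in fast mode, so the cycle is never detected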

Jeremy


