pickle performance on larger objects
Jeremy Hylton
jeremy at alum.mit.edu
Thu Jul 18 16:55:54 EDT 2002
Sam Penrose <spenrose at intersight.com> wrote in message news:<mailman.1026943747.15769.python-list at python.org>...
> memory usage increases by about 20%, FWIW. For my particular use case
> cPickle is still several (many ?) times slower than just recreating the
> object by reading in a file. What implications this has for best
> practices in persistence of larger objects I do not know, but I hope the
> data point is of interest to others.
MWH's comment about marshal is worth keeping in mind. cPickle is
doing a lot of work that marshal isn't doing. Where marshal calls
fwrite() directly in most cases, cPickle wraps each fwrite() in a C
function call bracketed by Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS. All those extra C function calls certainly add
up when you've got so many objects.
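If you want to measure the gap on your own data, a minimal sketch like
the following will do (modern Python 3 shown, where the C pickler lives
inside the plain pickle module; the test payload is made up, and the
exact ratio will differ from 2002-era cPickle):

```python
import marshal
import pickle
import timeit

# A pile of small containers, the kind of payload where per-object
# overhead dominates. Marshal only handles simple built-in types.
data = [{"k%d" % i: list(range(10))} for i in range(1000)]

t_marshal = timeit.timeit(lambda: marshal.dumps(data), number=20)
t_pickle = timeit.timeit(lambda: pickle.dumps(data), number=20)

print("marshal: %.4fs  pickle: %.4fs" % (t_marshal, t_pickle))
```

Remember that marshal is not a general-purpose serializer: it only
covers code objects and simple built-ins, and its format is not
guaranteed stable across Python versions.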
It's also checking for cycles in the containers, which means it has to
do a lot of extra bookkeeping for each dict it finds. You can disable
that by setting the fast attribute on the pickler:
f = open('data.pkl', 'wb')
p = cPickle.Pickler(f)   # Pickler needs a file-like object
p.fast = 1
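The same knob still exists on pickle.Pickler in Python 3 (documented,
though deprecated). A minimal round-trip sketch; note that fast mode is
only safe for acyclic data, since it disables the memo that detects
shared and cyclic references:

```python
import io
import pickle

# An acyclic nested structure -- fast mode would fail on cycles.
data = {"a": [1, 2, 3], "b": {"c": (4, 5)}}

buf = io.BytesIO()
p = pickle.Pickler(buf)
p.fast = True  # skip memoization / cycle bookkeeping
p.dump(data)

buf.seek(0)
restored = pickle.load(buf)
assert restored == data
```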
Jeremy