[Python-Dev] For review: PEP 307 - Extensions to the pickle protocol

Tim Peters tim_one@email.msn.com
Sun, 9 Feb 2003 03:20:58 -0500


[M.-A. Lemburg]
> Which makes me think: wouldn't a complete redesign from scratch under
> a new name provide more room for optimizations ?

PLabs doesn't have time for that (or are you volunteering <wink>?), and it's
unclear that there's significant room for optimization anyway.  There's
really no fat in the current scheme for Python's builtin types or classic
classes, and protocol 2 was mostly about cutting the fat for new-style
classes.  Some of proto 2 is even 2nd- or 3rd-order optimization, such as
minimizing the temp space required when unpickling large lists and dicts,
and implementing linear-time pickling of Python longs.  The only obvious
waste now seems to be the production of many useless PUT opcodes (including
the proto 1 variants), but there's no way to stop that without turning
pickling into a multipass process.  That would cost more pickling time and
temp memory in order to reduce pickle size, and unpickling time and temp
memory, by some unknown amount.  I don't know of any pure wins to be had.
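
To make the PUT tradeoff concrete, here is a minimal sketch, assuming a
current Python 3 interpreter rather than the 2.3 code under discussion.  It
uses pickletools.optimize(), which was added to the standard library well
after this thread and which works as a second pass over a finished pickle,
dropping PUT opcodes that no GET ever refers to.  The helper count_puts()
and the sample object are made up for illustration.

```python
import pickle
import pickletools


def count_puts(data):
    """Count memo-store opcodes (PUT and its binary variants) in a pickle."""
    put_names = {"PUT", "BINPUT", "LONG_BINPUT"}
    return sum(1 for op, arg, pos in pickletools.genops(data)
               if op.name in put_names)


# Nothing in this object is shared, so no memo entry is ever read back:
# every PUT the pickler emits here is one of the "useless" ones.
obj = {"a": [1, 2, 3], "b": (4, 5)}

raw = pickle.dumps(obj, protocol=2)

# Second pass: rescan the finished pickle and drop the unused PUTs.
slim = pickletools.optimize(raw)

print("PUTs before:", count_puts(raw), "size:", len(raw))
print("PUTs after: ", count_puts(slim), "size:", len(slim))
```

The smaller pickle is paid for with an extra scan and a second copy of the
pickle in memory, which is the cost side of the tradeoff described above.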

> (after all, you are doing this for Zope's ZODB, right ?)

That's probably Zope Corp's primary interest.  I started whining about it
for the datetime module's objects, which proved *almost* impossible to
pickle efficiently in proto 1.  Lots of stuff was added to proto 2 to
address that.  In the end, though, Guido found a clean way to pickle
datetime module objects under proto 1, at the cost of giving the objects
slightly bizarre __new__ methods.
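
The effect of that trick can still be seen from the outside.  The sketch
below assumes a current CPython 3, where the packed state is a bytes object
(in 2.3 it would have been a str), and it relies on an implementation detail
rather than a documented API: __reduce__() hands back the class plus a single
packed argument, and the constructor accepts that packed form directly.

```python
import datetime
import pickle

d = datetime.date(2003, 2, 9)

# __reduce__() tells pickle how to rebuild the object: call `cls(*args)`.
cls, args = d.__reduce__()[:2]
packed = args[0]          # the whole date packed into one short byte string

# The "slightly bizarre" part: the constructor accepts the packed state
# in place of the usual (year, month, day) arguments.
rebuilt = cls(*args)

print(cls.__name__, len(packed), rebuilt == d)

# The same path is what a proto 1 round trip goes through.
roundtrip = pickle.loads(pickle.dumps(d, protocol=1))
```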

> ...
> I'm not using that. What I am using are the .save_*()
> and .load_*() method signatures and the .file attribute;
> plus the .dispatch table, of course, which I have to update
> after overriding the methods.

Those all still exist, but there's not enough detail here to say much more
than that.  If, for example, you're overriding save_dict(), then of course
you're not going to get the "batching" optimizations implemented in 2.3
pickle.py's save_dict() (these reduce temp space needed when unpickling
large dicts, and produce proto 1 opcodes, so they're of some value even if
you're not using proto 2).
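
The batching is visible in the opcode stream.  This is a rough sketch,
assuming a current Python 3 interpreter and the stock pickler (no overridden
save_dict()); the dict size of 2500 is an arbitrary choice, picked only to be
larger than one batch.

```python
import pickle
import pickletools

big = {i: i for i in range(2500)}

# Proto 1 is enough: the batched form uses only proto 1 opcodes.
data = pickle.dumps(big, protocol=1)

names = [op.name for op, arg, pos in pickletools.genops(data)]

# Each SETITEMS flushes one batch of key/value pairs into the dict, so the
# unpickler's stack never has to hold the whole dict's contents at once.
setitems_batches = names.count("SETITEMS")
single_setitem = names.count("SETITEM")

print("SETITEMS batches:", setitems_batches)
print("single SETITEM opcodes:", single_setitem)
```

A save_dict() override that writes one MARK ... SETITEMS pair around the
whole dict would still produce a loadable pickle, but the unpickler would
have to stack every key and value before storing any of them.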