[Python-Dev] pickling of large arrays

Tim Peters tim.one@comcast.net
Thu, 20 Feb 2003 12:02:19 -0500


[Ralf W. Grosse-Kunstleve]
> ...
> My little prototype below works with Python 2.3a2!
>
> This is almost perfect. In C++ we can have two overloads.
> Highly simplified:
>
> template <typename T>
> class array<T> {
>   void append(T const& value); // the regular append
>   void append(std::string const& value); // for unpickling
> };
>
> This will work for all T (e.g. int, float, etc.)
> ... except T == std::string.
>
> This leads me to find it unfortunate that append() is re-used for
> unpickling.

append() has always been used for unpickling (well, since pickle came into
existence; "always" is an overstatement <wink>).

> How about:
>
>   If the object has a (say) __unpickle_append__ method this is used by
>   the unpickler instead of append or extend.

This is the implementation of the APPEND opcode (from pickle.py):

    def load_append(self):
        stack = self.stack
        value = stack.pop()
        list = stack[-1]
        list.append(value)

It's called once per list element, and clogging it up with
hasattr()/getattr() calls would greatly increase its cost (as is, it does
very little, least of all in cPickle, where all those now-usually-trivial
operations go at C speed).
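
To make the cost concrete, here is a minimal sketch of what the proposed dispatch might look like next to the existing behaviour. The hook name __unpickle_append__ comes from the proposal quoted above; the free functions taking a bare stack and the IntArray class are hypothetical stand-ins for illustration, not code from pickle.py.

```python
def load_append(stack):
    """Existing behaviour: pop the value and append it to the object below."""
    value = stack.pop()
    stack[-1].append(value)


def load_append_with_hook(stack):
    """Hypothetical variant: prefer __unpickle_append__ when the target has it."""
    value = stack.pop()
    target = stack[-1]
    # This extra attribute lookup would run once per element being unpickled.
    hook = getattr(target, "__unpickle_append__", None)
    if hook is not None:
        hook(value)
    else:
        target.append(value)


class IntArray:
    """Hypothetical container with a separate entry point for unpickling."""

    def __init__(self):
        self.elems = []
        self.hook_calls = 0

    def append(self, value):
        self.elems.append(value)

    def __unpickle_append__(self, value):
        self.hook_calls += 1
        self.elems.append(int(value))


plain_stack = [[], 1]
load_append_with_hook(plain_stack)    # no hook on list, falls back to append()

hooked_stack = [IntArray(), "7"]
load_append_with_hook(hooked_stack)   # routed through the hook
```

Plain lists still take the append() path in this sketch, but only after a failed attribute lookup, which is the per-element overhead being objected to.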

So it's unlikely you're going to get a change in what the proto 0 APPEND
(which calls append()) and proto 1 APPENDS (which calls extend()) opcodes
do.  Adding brand new opcodes is still possible, but I doubt it's possible
for Guido or me to do the work.
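
One way to see which of the two opcodes a given pickle uses is to walk it with pickletools. This is a small sketch assuming a current Python where pickle.dumps() accepts a protocol keyword; opcode_names is a helper name made up for the example.

```python
import pickle
import pickletools


def opcode_names(data, protocol):
    """Return the names of the opcodes in the pickle of data."""
    payload = pickle.dumps(data, protocol=protocol)
    return [opcode.name for opcode, arg, pos in pickletools.genops(payload)]


# Proto 0 emits one APPEND per element of the list.
proto0_ops = opcode_names([1, 2, 3], 0)

# Proto 1 batches the elements after a MARK and emits a single APPENDS.
proto1_ops = opcode_names([1, 2, 3], 1)
```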

> ...
> f = int_array(range(11))
> print "f.elems:", f.elems
> s = pickle.dumps(f)

Note that this is creating a text-mode ("protocol 0") pickle, which is less
efficient than proto 1, which in turn is less efficient than proto 2.
That's your choice; I just want to be sure you're aware you're making a
choice.  For backward compatibility, proto 0 has to remain the default.
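
A rough way to compare the protocols on your own data is to pickle the same object under each one and look at the sizes. The sketch below uses a plain list of small ints as a stand-in for the array contents (an assumption, not the int_array class from the prototype) and passes the protocol explicitly, since the default differs between Python versions.

```python
import pickle

# Stand-in payload; substitute the object you actually intend to pickle.
data = list(range(1000))

# Size in bytes of the pickle produced under each protocol.
sizes = {proto: len(pickle.dumps(data, protocol=proto)) for proto in (0, 1, 2)}

# Whichever protocol is chosen, loads() detects it from the data itself.
restored = pickle.loads(pickle.dumps(data, protocol=2))
```

How much proto 2 gains over proto 1 depends on what is being pickled, so it is worth measuring with representative data rather than relying on this toy list.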