pickle's backward compatibility

exarkun at twistedmatrix.com
Tue Oct 13 11:06:52 EDT 2009


On 02:48 pm, mal at egenix.com wrote:
>exarkun at twistedmatrix.com wrote:
>>On 03:17 pm, pengyu.ut at gmail.com wrote:
>>>Hi,
>>>
>>>If I define my own class and use pickle to serialize objects of
>>>this class, will the serialized objects be read successfully in a
>>>later version of Python?
>>>
>>>What if I serialize (using pickle) an object of a class defined in
>>>the Python library?  Will it be read successfully in a later version
>>>of Python?
>>
>>Sometimes.  Sometimes not.  Python doesn't really offer any guarantees
>>regarding this.
>
>I think this needs to be corrected: the pickle protocol versions are
>compatible between Python releases, however, there are two things to
>consider:
>
>* The default pickle version sometimes changes between minor
>   releases.
>
>   This is easy to handle, though, since you can provide the pickle
>   protocol version as parameter.
>
>* The pickle protocol has changed a bit between 2.x and 3.x.
>
>   This is mostly due to the fact that Python's native string
>   format changed to Unicode in 3.x.
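
To illustrate the first point in the quoted text, here is a minimal sketch of passing the protocol version explicitly instead of relying on the default. It is written for Python 3, and `record` is just an arbitrary example value, not anything from the original question:

```python
import pickle

# Arbitrary example data made only of built-in types.
record = {"name": "example", "values": [1, 2, 3]}

# Pin the protocol explicitly so the output format does not depend on
# whatever default the running Python release happens to use.
data = pickle.dumps(record, protocol=2)

# A protocol 2 pickle starts with the PROTO opcode (0x80) followed by
# the protocol number.
header = data[:2]

restored = pickle.loads(data)
```

Pinning the protocol only fixes the byte format of the pickle. It says nothing about whether the classes referenced by the pickle still look the same when it is loaded.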

The pickle protocol isn't the only thing that determines whether an 
existing pickle can be loaded.  Consider this very simple example of a 
class which might exist in Python 2.x:

    class Foo:
        def __init__(self):
            self._bar = None

        def bar(self):
            return self._bar

Nothing particularly fancy or interesting going on there.  Say you write 
a pickle that includes an instance of this class.

Now consider this modified version of Foo from Python 2.(x+1):

    class Foo(object):
        # The class is new-style now, because someone felt like making
        # it new-style.

        def __init__(self, baz):
            # The class now requires an argument to __init__ to specify
            # some new piece of info.

            # _bar was renamed barValue because someone thought it would
            # make sense to expose the info publicly.
            self.barValue = None
            self._baz = baz

        def bar(self):
            # Method was updated to use the new name of the attribute.
            return self.barValue

Three fairly straightforward changes.  Arguably, making Foo new-style and 
adding a required __init__ argument are not backwards-compatible changes 
to the Foo class itself, but these are changes that often happen between 
Python releases.  I think that most people would not bother to argue 
that renaming "_bar" to "barValue" is an incompatibility, though.

But what happens when you try to load your Python 2.x pickle in Python 
2.(x+1)?

First, you get an exception like this:

  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib/python2.5/pickle.py", line 1374, in loads
      return Unpickler(file).load()
    File "/usr/lib/python2.5/pickle.py", line 858, in load
      dispatch[key](self)
    File "/usr/lib/python2.5/pickle.py", line 1070, in load_inst
      self._instantiate(klass, self.marker())
    File "/usr/lib/python2.5/pickle.py", line 1060, in _instantiate
      value = klass(*args)
  TypeError: in constructor for Foo: __init__() takes exactly 2 arguments (1 given)

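But let's say the class didn't get changed to new-style after all... 
For instances of classic classes the unpickler skips __init__ (unless 
the class defines __getinitargs__), so the new required argument doesn't 
get in the way.  Then you can load the pickle, but what happens when you 
try to call the bar method?  You get this exception: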

  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "<stdin>", line 6, in bar
  AttributeError: Foo instance has no attribute 'barValue'

So these are the kinds of things I am talking about when I say that 
there aren't really any guarantees.
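
The renamed-attribute failure can be reproduced in a single script. What follows is only a rough sketch under a few assumptions: it targets Python 3, where every class is new-style and default unpickling does not call __init__, so only the second failure (the AttributeError) shows up. The module name `foolib` and the helper `install` are hypothetical stand-ins for "the library that provides Foo" and "upgrading that library".

```python
import pickle
import sys
import types

OLD_SOURCE = """
class Foo:
    def __init__(self):
        self._bar = None

    def bar(self):
        return self._bar
"""

NEW_SOURCE = """
class Foo(object):
    def __init__(self, baz):
        self.barValue = None
        self._baz = baz

    def bar(self):
        return self.barValue
"""


def install(source):
    # Hypothetical helper: (re)define the stand-in module "foolib" from
    # source text, simulating an upgrade of the library that owns Foo.
    module = types.ModuleType("foolib")
    exec(source, module.__dict__)
    sys.modules["foolib"] = module
    return module


# Write the pickle while the old definition of Foo is installed.
old = install(OLD_SOURCE)
data = pickle.dumps(old.Foo(), protocol=2)

# "Upgrade" the library, then load the old pickle against the new class.
install(NEW_SOURCE)
obj = pickle.loads(data)      # succeeds: __init__ is not called on load
state = dict(obj.__dict__)    # only the old attribute name was restored

try:
    obj.bar()
    error = None
except AttributeError as exc:
    error = exc               # the new bar() looks for barValue
```

The pickle stores only the class's module and name plus the instance's attribute dictionary, so the loaded object is an instance of the new Foo carrying the old state.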

Jean-Paul


