Moving class used in pickle

Jean-Paul Calderone exarkun at divmod.com
Tue May 15 16:32:53 EDT 2007


On Tue, 15 May 2007 13:23:29 -0600, Jeffrey Barish <jeff_barish at earthlink.net> wrote:
>I have a class derived from string that is used in a pickle.  In the new
>version of my program, I moved the module containing the definition of the
>class.  Now the unpickle fails because it doesn't find the module.  I was
>thinking that I could make the unpickle work by putting a copy of the
>module in the original location and then redefine the class by sticking a
>__setstate__ in the class thusly:
>
>def __setstate__(self, state):
>    self.__dict__.update(state)
>    self.__class__ = NewClassName
>
>My plan was to specify the new location of the module in NewClassName.
>However, when I do this I get the message "'class' object layout differs
>from 'class'".  I don't think that they do as the new module is a copy of
>the old one.  I suspect that I am not allowed to make the class assignment
>because my class is derived from string.  What is the best way to update
>the pickle?  The only thought I have is to read all the data with the old
>class module, store the data in some nonpickle format, and then, with
>another program, read the nonpickle-format file and rewrite the pickle with
>the class module in the new location.

This is one of the reasons pickle isn't very suitable for persistence of
data over a long period of time: no schema (and so no schema upgrades), and
few tools for otherwise updating old data.

For your simple case, you can just make sure the module is still available
at the old name.  This will allow pickle to find it when loading the objects
and automatically update the data to point at the new name when the object
is re-pickled.  This doesn't give you any straightforward way to know when
the "upgrade" has finished (but maybe you can keep track of that in your
application code), and you need to keep the alias until all objects have been
updated.
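
A minimal sketch of that load-then-re-pickle upgrade, assuming nothing beyond
the standard library.  The module names demo_old_location and
demo_new_location and the class Label are hypothetical stand-ins, and the
modules are built in memory rather than on disk.  The alias here is made by
registering the new module under the old name in sys.modules, which is one
way (not necessarily the only one) of keeping the old name importable.

```python
import pickle
import sys
import types

OLD, NEW = "demo_old_location", "demo_new_location"  # hypothetical names

# The str-derived class; type() is used so it pickles by the plain name "Label".
Label = type("Label", (str,), {"__module__": OLD})

# Step 1: the old program version writes a pickle while the class lives in OLD.
old_mod = types.ModuleType(OLD)
old_mod.Label = Label
sys.modules[OLD] = old_mod
old_bytes = pickle.dumps(Label("hello"))

# Step 2: the module is moved: OLD disappears, the class now belongs to NEW.
del sys.modules[OLD]
new_mod = types.ModuleType(NEW)
new_mod.Label = Label
Label.__module__ = NEW
sys.modules[NEW] = new_mod

try:
    pickle.loads(old_bytes)
    load_failed = False
except ImportError:  # the old module name can no longer be imported
    load_failed = True

# Step 3: alias the old name to the new module, then load and re-pickle.
sys.modules[OLD] = new_mod
obj = pickle.loads(old_bytes)
new_bytes = pickle.dumps(obj)

print(load_failed, type(obj).__module__)
print(NEW.encode() in new_bytes, OLD.encode() in new_bytes)
```

The re-pickled bytes name only the new module, so once every stored object
has been rewritten this way the alias can be dropped.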

For example, if you had a/b.py and you renamed it to a/c.py, then you can do
one of two things.  In a/__init__.py, add 'from a import c as b'.  Now a.b
is the same object as a.c, and a class defined in a/c.py but accessed via
a.b will still "prefer" a.c (i.e., its __module__ will be 'a.c', not 'a.b').
Since pickle imports the old module by its full dotted name, you may also
need to register the alias with 'sys.modules["a.b"] = c'.  Alternatively, you
can keep a b.py that imports the relevant class(es) from c.py:
'from a.c import ClassA'.
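
The second option above (a b.py left behind that re-imports the class) can be
sketched as follows.  The package is written to a temporary directory so the
example is self-contained.  The hand-written pickle bytes are an assumption
standing in for data saved by the old program; they simply refer to the class
as a.b.ClassA.

```python
import importlib
import pathlib
import pickle
import sys
import tempfile

# Build a throwaway package on disk: a/c.py is the new home of ClassA,
# a/b.py is a shim left behind at the old module name.
root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "a"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "c.py").write_text("class ClassA(str):\n    pass\n")
(pkg / "b.py").write_text("from a.c import ClassA\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# A protocol-0 pickle that refers to the class by its old location,
# a.b.ClassA (GLOBAL opcode 'c', then STOP '.').
old_reference = b"ca.b\nClassA\n."

cls = pickle.loads(old_reference)   # resolved through the shim a/b.py
print(cls.__module__)               # module recorded on the class itself

obj = cls("hello")
resaved = pickle.dumps(obj)         # new pickles use cls.__module__
print(b"a.c" in resaved)
```

Because pickle records the class's own __module__ when saving, anything
re-saved after loading through the shim refers to a.c rather than a.b.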

You could also build a much more complex system in the hope of being able to
do real "schema" upgrades of pickled data, but ultimately this is likely a
doomed endeavour (see <http://divmod.org:81/websvn/wsvn/Quotient/trunk/atop/versioning.py?op=file&rev=0&sc=0>).

Hope this helps,

Jean-Paul


