[pickle] Different version of a class

Tim Peters tim_one at email.msn.com
Sun Jun 6 13:50:01 EDT 1999


[mlauer at asmoday-atm.rz.uni-frankfurt.de]
> my application uses pickle to store and load data.
>
> During the development cycle, many variables are added
> to the classes which are pickled. Whenever I load
> a file pickled with a previous version, some
> variables do not exist (naturally, because
> __init__ is not called). Is there an easy way
> to initialize the "new" variables, or must I
> call __init__, which would overwrite some of
> my saved variables?

As the 1.5.2 pickle docs hint,

    If you plan to have long-lived objects that will see many
    versions of a class, it may be worthwhile to put a version
    number in the objects so that suitable conversions can be
    made by the class's __setstate__() method.

For example, say you start with this class:

class Point:
    version = 1

    def __init__(self, x, y):
        print "in init w/ x, y", x, y
        self.x, self.y = x, y

    def __getstate__(self):
        return (Point.version, self.__dict__)

    def __setstate__(self, (version, dict)):
        if version == 1:
            self.__dict__ = dict
        else:
            raise ValueError("I'm only version %d of Point -- "
                             "can't set state from version %d" %
                             (Point.version, version))

When an instance is pickled, __getstate__ tells it to save away a 2-tuple,
consisting of the class version number and the instance dict.  On
unpickling, this 2-tuple is passed to __setstate__, which can do whatever it
wants.  In this case it's trivial because there is only one version of the
class, but at least version 1 of the class can detect it doesn't know how to
restore data from later versions!
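
For anyone trying this on a current interpreter, here is a rough Python 3
rendering of the version 1 class (my assumption; the code above is
1.5.2-era, and Python 3 no longer accepts tuple parameters, so the state
is unpacked in the method body).  The restore() helper is hypothetical:
it only mimics what unpickling does, namely creating the instance without
calling __init__ and then handing the saved state to __setstate__.

```python
class Point:
    version = 1

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __getstate__(self):
        # Save a 2-tuple: the class version plus a copy of the instance dict.
        return (Point.version, dict(self.__dict__))

    def __setstate__(self, state):
        # Tuple parameters are gone in Python 3, so unpack here instead.
        version, attrs = state
        if version == 1:
            self.__dict__ = attrs
        else:
            raise ValueError("I'm only version %d of Point; "
                             "can't set state from version %d" %
                             (Point.version, version))


def restore(cls, state):
    """Hypothetical helper: rebuild an instance the way unpickling does,
    i.e. without calling __init__."""
    obj = cls.__new__(cls)
    obj.__setstate__(state)
    return obj


# Round trip of a version 1 state.
p = restore(Point, Point(3, 4).__getstate__())
print(p.x, p.y)  # 3 4

# A state tagged with a later version is rejected instead of half-loaded.
try:
    restore(Point, (2, {"x": 3, "y": 4, "dist": 5.0}))
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```

The point of the else branch is that an old class fails loudly on a newer
state rather than quietly building a broken object.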

Version 2 of the class may be:

class Point:
    version = 2

    def __init__(self, x, y):
        print "in init w/ x, y", x, y
        self.x, self.y = x, y
        import math
        self.dist = math.sqrt(x**2 + y**2)

    def __getstate__(self):
        return (Point.version, self.__dict__)

    def __setstate__(self, (version, dict)):
        if version == 1:
            # Old pickle: restore x and y, then compute the attr added in v2.
            self.__dict__ = dict
            import math
            self.dist = math.sqrt(self.x**2 + self.y**2)
        elif version == 2:
            # Current format: the saved dict already contains dist.
            self.__dict__ = dict
        else:
            raise ValueError("I'm only version %d of Point -- "
                             "can't set state from version %d" %
                             (Point.version, version))

That is, __getstate__ doesn't change, but Point.version gets bumped, and
__setstate__ looks at the version number in effect at the time the pickle
was made.  Unpickling a version 1 instance now sprouts a dist attr by magic.
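
To see the upgrade happen, here is a sketch in Python 3 (an assumption on
my part; the classes above use 1.5.2 syntax).  The restore() helper is
hypothetical and just imitates the unpickler: make the instance without
running __init__, then pass the saved state to __setstate__.  The "old"
state is written out by hand as a version 1 class would have saved it.

```python
import math


class Point:
    version = 2

    def __init__(self, x, y):
        self.x, self.y = x, y
        self.dist = math.sqrt(x**2 + y**2)

    def __getstate__(self):
        return (Point.version, dict(self.__dict__))

    def __setstate__(self, state):
        version, attrs = state
        if version == 1:
            # Version 1 states lack dist, so derive it from x and y.
            self.__dict__ = attrs
            self.dist = math.sqrt(self.x**2 + self.y**2)
        elif version == 2:
            self.__dict__ = attrs
        else:
            raise ValueError("I'm only version %d of Point; "
                             "can't set state from version %d" %
                             (Point.version, version))


def restore(cls, state):
    """Hypothetical helper: rebuild an instance without calling __init__."""
    obj = cls.__new__(cls)
    obj.__setstate__(state)
    return obj


# State as a version 1 class would have saved it: no dist entry.
old_state = (1, {"x": 3, "y": 4})
upgraded = restore(Point, old_state)
print(upgraded.dist)  # 5.0

# Once upgraded, the object saves itself in the current format.
print(upgraded.__getstate__()[0])  # 2
```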

Note:  despite the simple-minded <wink> appeal of version numbers, in Python
I find it's almost always better to use hasattr or try/except; that is, to
probe the actual *behavior* rather than rely on always-poorly-documented
"version numbers".  So I'd be more likely to write these toy classes as:

class Point:
    version = 1

    def __init__(self, x, y):
        print "in init w/ x, y", x, y
        self.x, self.y = x, y

and later:

class Point:
    version = 2

    def __init__(self, x, y):
        print "in init w/ x, y", x, y
        self.x, self.y = x, y
        import math
        self.dist = math.sqrt(x**2 + y**2)

    def __setstate__(self, dict):
        self.__dict__ = dict
        if not hasattr(self, 'dist'):
            import math
            self.dist = math.sqrt(self.x**2 + self.y**2)
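
The same probing approach, sketched for Python 3 (my assumption, not part
of the original classes).  With no __getstate__ defined, the state of a
plain instance like this one is just its attribute dict, so the old and
new states below are written as bare dicts.  restore() is again a
hypothetical stand-in for the unpickler, which skips __init__.

```python
import math


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.dist = math.sqrt(x**2 + y**2)

    def __setstate__(self, state):
        self.__dict__ = state
        # Probe for the attribute instead of trusting a version number.
        if not hasattr(self, "dist"):
            self.dist = math.sqrt(self.x**2 + self.y**2)


def restore(cls, state):
    """Hypothetical helper: rebuild an instance without calling __init__."""
    obj = cls.__new__(cls)
    obj.__setstate__(state)
    return obj


# State saved before dist existed: the attribute gets filled in.
old = restore(Point, {"x": 3, "y": 4})
print(old.dist)  # 5.0

# State that already carries dist: it is taken as saved, not recomputed.
new = restore(Point, {"x": 3, "y": 4, "dist": 5.0})
print(new.dist)  # 5.0
```

One thing to keep in mind with this style is that each added attribute
needs its own probe, so the checks pile up as the class grows.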

convolute-to-taste-ly y'rs  - tim





