Providing backwards compatibility for serialized objects

Sat Jul 3 15:40:20 EDT 2004

	I've done something quite similar, and I had two ways of doing it:

	1- use default values

	After deserializing, call a method on the object which will inspect it  
and set all undefined fields to a default value. This method can have some  
intelligence and use the right default value according to the state of the  
object.

	Good things:
		- Simple
		- Fast
	Problems:
		- Does not handle every case correctly (a default is only a guess at
		  what the old object should have contained)
		- Less robust than the next solution
		- Not very "clean"
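
	A minimal sketch of this first approach, assuming Python's pickle (which restores an instance by filling in its members without calling __init__). The member names and the fill_defaults method are made up for illustration:

```python
class Foo:
    def __init__(self, name, size=0):
        self.name = name
        self.size = size  # hypothetical member added in a later release

    def fill_defaults(self):
        """Give every member missing from an old saved object a value."""
        if not hasattr(self, "size"):
            # A slightly "intelligent" default, derived from existing state.
            self.size = len(self.name)


# Stand-in for an object saved before `size` existed: build the instance
# without __init__ and restore only the old members by hand.
old = Foo.__new__(Foo)
old.__dict__.update({"name": "spam"})

old.fill_defaults()
print(old.size)
```

	Objects that already have the member are left untouched, so the method is safe to call after every load.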

	2- serialize version information

	When serializing, save a "format version" identifier along with the
object, which says which version of the object layout was written.
	When loading the object, read this version and act accordingly,
setting default values or converting members as needed to bring the object
up to the latest version. If you save it again, it'll be written in the
latest format version.

	Good things:
		- Foolproof (encoded version number eliminates guessing)
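
	Here is a rough sketch of the versioned approach. It assumes pickle, which calls __getstate__ when saving and __setstate__ when loading; the "format_version" key, the FORMAT_VERSION constant and the member names are my own invention. The hooks are called by hand here so the example does not depend on a real old pickle file:

```python
class Foo:
    FORMAT_VERSION = 2  # bump whenever the saved layout changes

    def __init__(self, name, size=0):
        self.name = name
        self.size = size  # hypothetical member added in format 2

    def __getstate__(self):
        # Save the members plus a version tag.
        state = self.__dict__.copy()
        state["format_version"] = self.FORMAT_VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        # Data saved before the tag existed is treated as format 1.
        version = state.pop("format_version", 1)
        if version < 2:
            state["size"] = 0  # upgrade step from format 1 to format 2
        self.__dict__.update(state)


# Hand-written state standing in for an object saved in format 1.
old = Foo.__new__(Foo)
old.__setstate__({"name": "spam"})
print(old.size)

# Saving again writes the latest format.
print(old.__getstate__()["format_version"])
```

	Each future layout change would add one more "if version < N" step, so an object of any age gets upgraded one version at a time.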


	Problem with your scenario:
	Serializing does not save the class's methods and associated code, only
the object's contents.
	So if you update a method in class Foo, both "old" and "new" objects
will be deserialized into instances of the current Foo, with its current
methods. If a method expects to find a member that exists in the new Foo
but not in the old one, it will fail on an old object. But if a method
merely has different parameters, as in your example, you'll always get the
new method, which is the one in your current class definition.

	Thus, if you choose your default values well and code your methods  
accordingly, everything should be okay when you change versions.
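
	To illustrate, here is a small sketch in which one current Foo serves both old and new data. The "scale" member and the default for c are assumptions of mine, picked only to show the idea; the old object is again simulated by restoring members without calling __init__:

```python
class Foo:
    def bar(self, a, b, c=None):
        # `c` is the new argument; giving it a default keeps callers that
        # still pass two arguments working.
        extra = 0 if c is None else c
        # `scale` is a hypothetical member that old saved objects lack,
        # so fall back to a neutral value instead of failing.
        scale = getattr(self, "scale", 1)
        return (a + b + extra) * scale


# Stand-in for an old saved object: no `scale` member.
old = Foo.__new__(Foo)

# Stand-in for a new saved object.
new = Foo.__new__(Foo)
new.__dict__.update({"scale": 2})

# Both are instances of the current class, so both get the current bar().
print(old.bar(1, 2))
print(new.bar(1, 2, 3))
```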

	However, if you want the two versions of Foo to behave differently, you
must make them two different classes, say NewFoo and OldFoo, both
subclasses of BaseFoo. You'll then have to maintain twice the code, which
quickly becomes hell.
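
	Roughly, that layout could look like the sketch below. The class bodies and the call_bar helper are hypothetical; the point is only that the caller can check the class instead of catching TypeError:

```python
class BaseFoo:
    def __init__(self, name):
        self.name = name  # state shared by both versions


class OldFoo(BaseFoo):
    def bar(self, a, b):
        return a + b


class NewFoo(BaseFoo):
    def bar(self, a, b, c):
        return a + b + c


def call_bar(foo, a, b, c):
    """Call bar() with whichever signature this object's class supports."""
    if isinstance(foo, NewFoo):
        return foo.bar(a, b, c)
    return foo.bar(a, b)


print(call_bar(OldFoo("old"), 1, 2, 3))
print(call_bar(NewFoo("new"), 1, 2, 3))
```

	Note that every behavioural fix now has to be made in both OldFoo and NewFoo unless it can live in BaseFoo.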


> Greetings!
>
> Say that it's desirable to provide backwards compatibility for methods
> of an object. Consider the case where...
>
> class Foo:
>      def bar (self, a, b):
>          pass
>
> ...is a defined class that can be serialized and later be deserialized.  
> This object is later changed so that it's defined as...
>
> class Foo:
>      def bar (self, a, b, c): # note the different argument spec
>          pass
>
> ...old versions of Foo can still be deserialized but the new code relies
> on the fact that the new version has one more argument in the spec. In the
> case where you wanted to provide backwards compatibility as a general
> solution you could do...
>
> try:
>      fooObject.bar(a, b, c)
>
> except TypeError:
>
>      import sys
>
>      # sys.exc_info()[2] replaces the deprecated sys.exc_traceback
>      if sys.exc_info()[2].tb_next is not None:
>
>          # Don't capture the exception if the traceback object
>          # has more than one level, this *should* handle the case
>          # where there is a TypeError inside of the .bar method
>
>          raise
>
>      # else call it with the old signature
>      fooObject.bar(a, b)
>
> ...what are the drawbacks of using this approach? Are there any cases
> where this would break?
>
> The better approach is just to break backwards compatibility or provide  
> a new method with a different name or another version of the class --  
> but consider the case where you don't have the luxury of proper design
> decisions up front.
>
> TIA,
> Jason