Strategies for backwards compatibility when using pickle?

Heiko Wundram heikowu at ceosg.de
Mon May 3 20:48:40 EDT 2004


On Tuesday, 4 May 2004 00:41, duncan wrote:
> Does anyone have any advice on how to balance the conflicting interests
> of the developers who want to keep evolving the object model and users
> who need stable project persistence?

I've written a serialization library which may be of use here. It lets you 
store class instances (and whole inheritance trees) through user-defined 
__store__ and __load__ methods, which exchange the instance data with the 
stream as a dict (the dict itself is serialized further). Unlike pickle, 
which only calls __getstate__ and __setstate__ as found on the most derived 
class, the library lets each base of the serialized class store its own part.

A top-level class can even refuse to be serialized, in which case the stored 
data is that of its immediate base. If that base refuses as well, its 
immediate base is serialized instead, and so on.

Another aspect is that the serialized data doesn't contain the name of the 
class. Instead it contains a reference to a registered name (such as 
MyLocalHostClass), which is looked up on unserialization. This mechanism is 
completely transparent, but allows you to pickle data for another program 
which may use different class names but implements the same interfaces.
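
To make the idea concrete, here is a rough sketch of per-base storing combined 
with a name registry. It is not the library's actual API: register, 
dump_instance, load_instance and the JSON encoding are all made up for this 
example, and the "refuse to be serialized" feature is left out.

```python
import json

# Hypothetical registry: registered name -> class. Not the library's API.
_REGISTRY = {}

def register(name):
    """Class decorator that binds a class to a stable registered name."""
    def deco(cls):
        _REGISTRY[name] = cls
        cls._registered_name = name
        return cls
    return deco

def dump_instance(obj):
    # Every registered class in the MRO that defines its own __store__
    # contributes its own dict, keyed by registered name (not class name).
    parts = {}
    for klass in type(obj).__mro__:
        store = klass.__dict__.get("__store__")
        name = klass.__dict__.get("_registered_name")
        if store is not None and name is not None:
            parts[name] = store(obj)
    top = type(obj).__dict__["_registered_name"]
    return json.dumps({"name": top, "parts": parts})

def load_instance(data):
    doc = json.loads(data)
    cls = _REGISTRY[doc["name"]]      # unregistered names raise KeyError
    obj = cls.__new__(cls)            # bypass __init__, like pickle does
    for klass in cls.__mro__:
        name = klass.__dict__.get("_registered_name")
        if name in doc["parts"]:
            klass.__dict__["__load__"](obj, doc["parts"][name])
    return obj

@register("Shape")
class Shape:
    def __init__(self, label):
        self.label = label
    def __store__(self):
        return {"label": self.label}
    def __load__(self, state):
        self.label = state["label"]

@register("Circle")
class Circle(Shape):
    def __init__(self, label, radius):
        Shape.__init__(self, label)
        self.radius = radius
    def __store__(self):
        return {"radius": self.radius}
    def __load__(self, state):
        # An older stream may lack a newer field, so fall back to a default.
        self.radius = state.get("radius", 1.0)

copy = load_instance(dump_instance(Circle("wheel", 2.0)))
```

Because each class only sees its own dict, a base class can gain or lose 
fields without the derived classes having to know about it, and renaming a 
class only means registering the new class under the old name.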

The library has several other features (it was actually designed as a 
secure pickle which can be used over a network without fear of attacks that 
rely on unserializing classes which have not been explicitly declared safe to 
unserialize), such as signing portions of the data with a key (requires 
PyCrypto) and automatic type-checking of the dictionary of data which is 
passed to __load__.
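
As an illustration of signing and type-checking, the following sketch uses 
the standard library's hmac module in place of PyCrypto. The function names 
sign, verify and check_types and the schema format are assumptions for this 
example and do not reflect how the library actually does it.

```python
import hashlib
import hmac

MAC_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256

def sign(payload: bytes, key: bytes) -> bytes:
    # Prefix the payload with an HMAC-SHA256 tag computed with the shared key.
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload

def verify(blob: bytes, key: bytes) -> bytes:
    # Recompute the tag and compare in constant time before trusting the data.
    tag, payload = blob[:MAC_SIZE], blob[MAC_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch")
    return payload

def check_types(state: dict, schema: dict) -> None:
    # Hypothetical schema: field name -> expected type. A missing field
    # yields None from .get() and therefore fails the isinstance check.
    for field, expected in schema.items():
        if not isinstance(state.get(field), expected):
            raise TypeError("field %r is not of type %s"
                            % (field, expected.__name__))
```

A receiver would call verify before unserializing anything, and check_types 
on each dict before handing it to __load__.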

For storing basic types and recursive structures, it can do everything pickle 
can, except for the following construction (which serializes fine, but is 
unserialized incorrectly without raising an error):

x = {}
y = (x,)
x["hello"] = y

serialize(y)
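
For reference, this is the property a faithful round trip of that structure 
has to preserve, checked here with the standard pickle module rather than 
with the library: the tuple reachable through the dict must be the very same 
object as the outer tuple.

```python
import pickle

x = {}
y = (x,)
x["hello"] = y

restored = pickle.loads(pickle.dumps(y))

# The cycle survives: the tuple stored in the dict is the outer tuple itself,
# not a second copy of it.
assert isinstance(restored, tuple)
assert restored[0]["hello"] is restored
```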

If you're interested, I can send you the library, along with several example 
files showing how to implement user classes which correctly operate with this 
library.

Heiko.



