[Python-Dev] Unpickling py2 str as py3 bytes (and vice versa) - implementation (issue #6784)

Guido van Rossum guido at python.org
Tue Mar 13 23:08:35 CET 2012


On Tue, Mar 13, 2012 at 2:50 PM, Merlijn van Deen <valhallasw at arctus.nl> wrote:
> On 13 March 2012 22:13, Guido van Rossum <guido at python.org> wrote:
>> Well, since trying to migrate data between versions using pickle is
>> the "wrong" thing anyway, I think the status quo is just fine.
>> Developers doing the "right" thing don't use pickle for this purpose.
>
> I'm confused by this. "The pickle serialization format is guaranteed
> to be backwards compatible across Python releases" [1], which - at
> least to me - suggests it's fine to use pickle for long-term storage,
> and that reading this data in new Python versions is not a "bad"
> thing to do. Am I missing something here?
>
> [1] http://docs.python.org/library/pickle.html#the-pickle-protocol

That was probably written before Python 3. Python 3 also dropped
long-term backwards compatibility for the language and stdlib. I am
certainly fine with adding a warning to the docs that this guarantee
does not apply to the Python 2/3 boundary. But I don't think we should
map 8-bit str instances from Python 2 to bytes in Python 3.
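
For context, a minimal sketch of what this mapping question looks like
from the Python 3 side. The byte string below is hand-written to match
the shape of a Python 2 protocol 2 pickle of an 8-bit str; the variable
names are illustrative. It assumes a Python 3 release whose
pickle.loads() accepts encoding="bytes", which is not something the
message above relies on.

```python
import pickle

# Hand-written bytes in the shape Python 2 emits for
# pickle.dumps('caf\xe9', 2): an 8-bit str holding Latin-1 text.
PY2_STR_PICKLE = b"\x80\x02U\x04caf\xe9q\x00."

# The default encoding is ASCII, so a non-ASCII byte is rejected.
try:
    pickle.loads(PY2_STR_PICKLE)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False

# The caller has to say whether the payload is text or binary data.
as_text = pickle.loads(PY2_STR_PICKLE, encoding="latin1")
as_bytes = pickle.loads(PY2_STR_PICKLE, encoding="bytes")
```

The pickle itself does not record whether the Python 2 str was meant
as text or as binary data, so any choice here is a guess made by the
reader of the data.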

My snipe was mostly in reference to the many other things that can go
wrong with pickled data as your environment evolves. If you're not
careful, pickled data can contain references (by name) to modules,
functions, or classes that won't resolve in a later (or earlier!)
version of your app. Or objects may be unpickled in an incomplete
state that causes later use of them to break: for example, if a newer
version of __init__() sets some extra instance variables, those
variables won't be set on objects pickled by the old version, because
unpickling doesn't generally call __init__. And so on.
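
A minimal sketch of the __init__ pitfall described above. The class
Account and its attributes are hypothetical; the second class
definition stands in for a newer release of the same application.

```python
import pickle


class Account:
    """Hypothetical 'old' version of an application class."""

    def __init__(self, owner):
        self.owner = owner


# Pickle an instance created by the old class definition.
blob = pickle.dumps(Account("alice"))


class Account:  # hypothetical 'new' version, same module and name
    def __init__(self, owner):
        self.owner = owner
        self.balance = 0  # attribute added in the newer version


# Unpickling restores the saved instance state and does not call
# __init__, so the new attribute is missing on the restored object.
restored = pickle.loads(blob)
has_owner = hasattr(restored, "owner")
has_balance = hasattr(restored, "balance")
```

Here has_owner is true and has_balance is false, so any later code
that reads restored.balance raises AttributeError.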

If you can solve your problem with a suitably hacked Unpickler
subclass that's fine with me, but I would personally use this
opportunity to change the app to some other serialization format that
is perhaps less general but more robust than pickle. I've been bitten
by too many pickle-related problems to recommend pickle to anyone...
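
One possible shape for such an Unpickler subclass, as a sketch only: it
overrides find_class() to redirect a stale by-name reference to the
current class. The names Point, oldapp.geometry.Pt, RENAMED,
RenamingUnpickler and load_legacy are all hypothetical, and the legacy
pickle is hand-written for the illustration.

```python
import io
import pickle


class Point:
    """Hypothetical class whose old pickles refer to it by another name."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Hypothetical rename table: (old module, old name) -> current class.
RENAMED = {("oldapp.geometry", "Pt"): Point}


class RenamingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect stale references; otherwise use the normal lookup.
        target = RENAMED.get((module, name))
        if target is not None:
            return target
        return super().find_class(module, name)


def load_legacy(data):
    """Unpickle bytes, applying the rename table above."""
    return RenamingUnpickler(io.BytesIO(data)).load()


# Hand-written protocol 2 pickle of an object with x=1, y=2 whose
# class is recorded under the stale name oldapp.geometry.Pt.
LEGACY = (
    b"\x80\x02coldapp.geometry\nPt\n)\x81}"
    b"(X\x01\x00\x00\x00xK\x01X\x01\x00\x00\x00yK\x02ub."
)
point = load_legacy(LEGACY)
```

This only addresses name resolution. It does not repair objects whose
saved state no longer matches what the current class expects.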

-- 
--Guido van Rossum (python.org/~guido)

