pickle, cPickle, zlib, and the future

Scott Gilbert xscottgjunk at yahoo.com
Fri Mar 1 16:29:45 EST 2002


I'm serializing Python objects to one of three formats: ASCII, binary,
or compressed (a, b, or c).  I would like to be able to load them
using just one function that correctly guesses how they were pickled.
This is similar to what I'm currently doing:

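  import cPickle, zlib

  # a: ASCII pickle (the default protocol)
  def dump_a(o): return cPickle.dumps(o)

  # b: binary pickle
  def dump_b(o): return cPickle.dumps(o, 1)

  # c: binary pickle, then zlib-compressed
  def dump_c(o): return zlib.compress(cPickle.dumps(o, 1))

  def load(s):
    # Guess the format from the first character: 'x' means zlib.
    if s[0] == 'x':
      return cPickle.loads(zlib.decompress(s))
    else:
      return cPickle.loads(s)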

This only works if future enhancements to pickle/cPickle never produce
a pickle whose first character is 'x', and only if zlib always returns
a string whose first character is 'x'.

Looking in cPickle.c and pickle.py, I see that capital 'X' is used for
type BINUNICODE, but there is no mention of lowercase 'x'.  I can't find
anything about it in the zlib docs, but empirically it seems to be
consistent.
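
The empirical check can be sketched as follows.  This is a Python 3
sketch (where cPickle has been folded into pickle), not the Python 2
code above, and the helper names are made up for illustration.  It
leans on two things: the zlib stream format (RFC 1950) puts 0x78, which
is ASCII 'x', in the first byte when the default 32K window is used,
and pickletools.opcodes lists every opcode the running pickle module
knows about.

```python
import pickle
import pickletools
import zlib

def zlib_first_bytes(payload):
    """Collect the first byte zlib.compress emits at every level (0-9)."""
    return {zlib.compress(payload, level)[:1] for level in range(10)}

def pickle_opcode_chars():
    """Return the one-character codes of all opcodes known to pickle."""
    return {op.code for op in pickletools.opcodes}

sample = pickle.dumps({"spam": [1, 2, 3]}, 1)

# If the assumption holds, the first set contains only b'x' and
# 'x' does not appear among the pickle opcodes.
print(zlib_first_bytes(sample))
print("x" in pickle_opcode_chars())
```

This only reports what the installed versions do; it says nothing
about what a future pickle protocol might reserve.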

I figure this was either planned or lucky.  If it was planned, a quick
comment in the source would relax my fears that my code will break in
the next version of Python.  If it was lucky, it might be nice to jump
on the opportunity and reserve 'x' now.

So can I rely on the first character 'x' being reserved for zlib, and
cPickle/pickle strings never starting with 'x'?
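
A slightly stricter guess than looking at one character is to check
the whole two-byte zlib header.  Per RFC 1950, the first two bytes,
read as a big-endian 16-bit number, are always a multiple of 31.  The
sketch below is Python 3 (pickle instead of cPickle, and indexing
bytes yields integers), the name looks_like_zlib is hypothetical, and
it assumes the dumps were made with zlib.compress and its default
window size, so the first byte is 0x78 ('x').

```python
import pickle
import zlib

def looks_like_zlib(s):
    """Return True if s starts with a default-window zlib header."""
    if len(s) < 2:
        return False
    cmf, flg = s[0], s[1]
    # 0x78 is 'x': deflate with a 32K window.  The second byte is
    # chosen by zlib so that the pair is divisible by 31.
    return cmf == 0x78 and (cmf * 256 + flg) % 31 == 0

def load(s):
    """Unpickle s, decompressing first if it looks like zlib output."""
    if looks_like_zlib(s):
        return pickle.loads(zlib.decompress(s))
    return pickle.loads(s)
```

This still depends on pickles never starting with 'x'; the extra
header test only lowers the odds of a wrong guess if that ever
changes.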

Cheers,
    -Scott Gilbert



(BTW: In case somebody asks why I don't just prefix my dump strings
with a tag saying where they came from: The dumps can be rather large
(maybe tens of megabytes), and I really don't want to have to create a
multi-megabyte substring from the original.)
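
For comparison, here is what the prefix approach might look like in
Python 3, where slicing a memoryview gives a view of the original
buffer rather than a copy.  This is only a sketch under that
assumption: the one-byte tags and the names dump_tagged and
load_tagged are invented for illustration, and it does not apply to
the Python 2 code above.

```python
import pickle
import zlib

def dump_tagged(o, compress=False):
    """Pickle o and prepend a one-byte tag: b'c' compressed, b'b' binary."""
    body = pickle.dumps(o, 1)
    if compress:
        return b"c" + zlib.compress(body)
    return b"b" + body

def load_tagged(s):
    """Read the tag, then unpickle the rest without copying it first."""
    view = memoryview(s)
    tag = bytes(view[:1])
    body = view[1:]          # a view into s, not a new substring
    if tag == b"c":
        return pickle.loads(zlib.decompress(body))
    return pickle.loads(body)
```

The concatenation in dump_tagged does still copy the payload once;
writing the tag and the body to a file as two separate writes would
avoid that as well.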


