Mixing protocols in pickles?

Erik Max Francis max at alcyone.com
Thu Jan 22 20:44:16 EST 2004


Is there any prohibition against mixing different protocols within the
same pickle?  I don't see anything about this in the Python Library
Reference and, after all, the pickle.dump function takes a protocol
argument each time it's called.  (This is in Python 2.3.3.)

I have a pickle containing two objects: a tag string and a (large)
object containing many children.  The identifying string is there so
that you can unpickle it and decide whether you really want to unpickle
the whole data object.
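
To make the idea concrete, here is a minimal sketch of the tag-first layout, written for a current Python 3 rather than 2.3.3 and using an in-memory buffer instead of a real file.  The names read_tag, buffer, and the sample tag and payload are made up for illustration; they are not from the actual program.

```python
import io
import pickle


def read_tag(stream):
    """Unpickle only the first object (the tag); the rest stays unread."""
    return pickle.load(stream)


# Hypothetical data: a short tag followed by a larger payload.
buffer = io.BytesIO()
pickle.dump("example-tag", buffer)
pickle.dump({"children": list(range(1000))}, buffer)
buffer.seek(0)

# Read the tag first, then decide whether the payload is worth loading.
tag = read_tag(buffer)
wanted = (tag == "example-tag")
payload = pickle.load(buffer) if wanted else None
```

Each call to pickle.load consumes exactly one pickled object, so the second object is only decoded if the tag looks right.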

So I thought it would be clever to write the tag string with protocol 0
so it would show up in a file viewer as plain text and then write the
rest of the data with protocol 1 (or 2; it doesn't use new-style
classes, though).  I open the file in binary mode and then dump the tag
string in protocol 0, then the (big) instance data in protocol 1.  When
loading time comes around, they're just both loaded in the same order:

def load(filename=DEFAULT_FILENAME):
    try:
        inputFile = gzip.GzipFile(filename + COMPRESSED_EXTENSION, 'rb')
    except IOError:
        inputFile = file(filename, 'rb')
    tag = pickle.load(inputFile)
    if DEBUG:
        print >> sys.stderr, "Tag: %s" % tag
    system = pickle.load(inputFile)
    inputFile.close()
    return system

def save(system, tag, filename=DEFAULT_FILENAME, protocol=1,
         compressed=False):
    if compressed:
        outputFile = gzip.GzipFile(filename + COMPRESSED_EXTENSION,
                                   'wb')
    else:
        outputFile = file(filename, 'wb')
    pickle.dump(tag, outputFile, 0) # write the tag in text
    pickle.dump(system, outputFile, protocol)
    outputFile.close()
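
Stripped of the gzip and file handling, the scheme above reduces to something like the following sketch.  It assumes a current Python 3 and an in-memory io.BytesIO buffer standing in for the binary-mode file; the sample tag and data are invented for the example.

```python
import io
import pickle

buffer = io.BytesIO()
pickle.dump("my tag", buffer, 0)            # protocol 0: text-based
pickle.dump({"big": [1, 2, 3]}, buffer, 1)  # protocol 1: binary

# The tag appears as plain bytes in the stream.
raw = buffer.getvalue()
tag_is_visible = b"my tag" in raw

# Load both objects back in the order they were written.
buffer.seek(0)
first = pickle.load(buffer)
second = pickle.load(buffer)
```

In this setting the two dumps load back cleanly, since each pickle is self-contained and the loader does not need to be told which protocol was used.  That says nothing by itself about what happens once the bytes have passed through a real file on another platform.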

This works fine on Unix, but on Windows it generates the (utterly
puzzling) error:

C:\My Documents\botec-0.1x1>python -i ./botex.py
Tag: default.botec:20040110:11175:ebec37a7632cc7176ff359a3754750ec:0.1x1
Traceback (most recent call last):
  File "./botex.py", line 70, in ?
    SYSTEM = init()
  File "./botex.py", line 15, in init
    return load()
  File "C:\My Documents\botec-0.1x1\botec.py", line 1666, in load
    system = pickle.load(inputFile)
ImportError: No Module named copy_reg
>>>

... made extra puzzling because you can type `import copy_reg' on that
prompt and it will import fine.  Googling for this error comes up with a
few scattered complaints but nothing coherent about the cause.  When I
modify the dumping routine to use the same protocol (1) for both dumps,
the problem goes away and everything works fine.

So I guess there are a few questions:  Why is the error being generated
so obscure and seemingly incorrect?  Is it really the case that mixing
multiple protocols within the same pickle isn't allowed, or is this
truly a bug (or, say, a Windows-specific problem because protocol 0
requires that the pickle file be in text mode)?
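
One way to probe the text-mode guess is to simulate newline translation directly.  The sketch below, again assuming a current Python 3, pickles a reference to fractions.Fraction (chosen arbitrarily) with protocol 0, rewrites every "\n" as "\r\n" the way a text-mode write or transfer on Windows might, and tries to load the result.  Whether anything like this happened to the file in question is an assumption, not something established here.

```python
import fractions
import pickle

# Protocol 0 stores the module and class names as newline-terminated text.
original = pickle.dumps(fractions.Fraction, 0)

# Simulate a newline translation somewhere between writing and reading.
translated = original.replace(b"\n", b"\r\n")

failed_name = None
try:
    pickle.loads(translated)
except ImportError as exc:
    # The unpickler strips only the "\n", so the name keeps a trailing "\r".
    failed_name = exc.name
```

Here the import fails for a module name ending in a carriage return.  If an older interpreter printed such a name raw, the message could look like a perfectly ordinary module name, which would be consistent with the puzzling traceback, though that remains a guess.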

-- 
 __ Erik Max Francis && max at alcyone.com && http://www.alcyone.com/max/
/  \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ Love is the most subtle form of self-interest.
    -- Holbrook Jackson


