cPickle.dumps differs from pickle.dumps; looks like a bug.

Chris Mellon arkanes at gmail.com
Wed May 16 18:39:08 EDT 2007


On 5/16/07, Daniel Nogradi <nogradi at gmail.com> wrote:
> > > I've found the following strange behavior of cPickle. Do you think
> > > it's a bug, or is it by design?
> > >
> > > Best regards,
> > > Victor.
> > >
> > > from pickle import dumps
> > > from cPickle import dumps as cdumps
> > >
> > > print dumps('1001799')==dumps(str(1001799))
> > > print cdumps('1001799')==cdumps(str(1001799))
> > >
> > > outputs
> > >
> > > True
> > > False
> > >
> > > vicbook:~ victor$ python
> > > Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
> > > [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
> > > Type "help", "copyright", "credits" or "license" for more information.
> > > >>> quit()
> > >
> > > vicbook:~ victor$ uname -a
> > > Darwin vicbook 8.9.1 Darwin Kernel Version 8.9.1: Thu Feb 22 20:55:00
> > > PST 2007; root:xnu-792.18.15~1/RELEASE_I386 i386 i386
> >
> > If you unpickle though will the results be the same? I suspect they
> > will be. That should matter most of all (unless you plan to compare
> > objects' identity based on their pickled version.)
>
> The OP was not comparing identity but equality. So it looks like a
> real bug, I think the following should be True for any function f:
>
> if a == b: f(a) == f(b)
>
> or not?
>

Obviously not, in the general case. random.randrange(x) is the most
obvious example, but there are any number of functions which don't return
the same value for equal inputs. Take file() or open(): since you get
a new file object with new state, it obviously will not be equal even
if it's the same file path.
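
A minimal sketch of that point. Handle is a hypothetical stand-in for a
file object (it is not part of any library): like file objects, it defines
no __eq__, so two instances compare equal only if they are the same object.

```python
class Handle(object):
    """Hypothetical stand-in for a file object; no __eq__ is defined,
    so == falls back to comparing object identity."""
    def __init__(self, path):
        self.path = path

a = "data.txt"
b = "".join(["data", ".txt"])  # equal to a, but built separately

h1 = Handle(a)
h2 = Handle(b)

print(a == b)              # the inputs are equal
print(h1.path == h2.path)  # both handles wrap the same path
print(h1 == h2)            # the results are distinct objects
```

So a == b holds while f(a) == f(b) does not, with f being Handle.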

For certain inputs, cPickle doesn't emit the memo information that is
used to support recursive and shared data structures. I'm not sure how
it tells the difference; perhaps it has something to do with
refcounts. In any case, it's an optimization of the pickle output, not
a bug.
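
A rough way to see the same kind of effect with the standard library
alone, assuming a newer Python than the 2.5 in the original report, one
where pickletools.optimize is available. That function strips memo "put"
opcodes that are never read back, so its output can differ from what
dumps produced even though both load to the same value. This is only a
sketch of the idea, not a reproduction of what cPickle does internally.

```python
import pickle
import pickletools

s = '1001799'

raw = pickle.dumps(s, 0)          # protocol 0, the text-based format
slim = pickletools.optimize(raw)  # drops memo entries that nothing refers to

# The two byte strings may differ in the memo bookkeeping only.
print(repr(raw))
print(repr(slim))

# What matters is that both unpickle to equal values.
print(pickle.loads(raw) == pickle.loads(slim))
```

Comparing the unpickled values, rather than the pickled strings, is the
check that stays valid whichever pickler produced the data.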


