cPickle.dumps differs from pickle.dumps; looks like a bug.

Daniel Nogradi nogradi at gmail.com
Wed May 16 18:51:42 EDT 2007


> > > > I've found the following strange behavior of cPickle. Do you think
> > > > it's a bug, or is it by design?
> > > >
> > > > Best regards,
> > > > Victor.
> > > >
> > > > from pickle import dumps
> > > > from cPickle import dumps as cdumps
> > > >
> > > > print dumps('1001799')==dumps(str(1001799))
> > > > print cdumps('1001799')==cdumps(str(1001799))
> > > >
> > > > outputs
> > > >
> > > > True
> > > > False
> > > >
> > > > vicbook:~ victor$ python
> > > > Python 2.5 (r25:51918, Sep 19 2006, 08:49:13)
> > > > [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
> > > > Type "help", "copyright", "credits" or "license" for more information.
> > > > >>> quit()
> > > >
> > > > vicbook:~ victor$ uname -a
> > > > Darwin vicbook 8.9.1 Darwin Kernel Version 8.9.1: Thu Feb 22 20:55:00
> > > > PST 2007; root:xnu-792.18.15~1/RELEASE_I386 i386 i386
> > >
> > > If you unpickle though will the results be the same? I suspect they
> > > will be. That should matter most of all (unless you plan to compare
> > > objects' identity based on their pickled version.)
> >
> > The OP was not comparing identity but equality. So it looks like a
> > real bug, I think the following should be True for any function f:
> >
> > if a == b: f(a) == f(b)
> >
> > or not?
> >
> 
> Obviously not, in the general case. random.random() is the most
> obvious example, but there's any number of functions which don't return
> the same value for equal inputs. Take file() or open() - since you get
> a new file object with new state, it obviously will not be equal even
> if it's the same file path.
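
For the record, the claim also fails for a deterministic function. A
minimal sketch, with repr() standing in for f (my choice of example,
not something from the posts above):

```python
# Two values that compare equal but are of different types.
a = 1
b = 1.0

# The inputs are equal ...
print(a == b)

# ... but a deterministic function of them need not be:
# repr() gives '1' for the int and '1.0' for the float.
print(repr(a) == repr(b))
```

So equality of the inputs says nothing by itself about equality of
f(a) and f(b), even without randomness or hidden state.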

Right, sorry about that, posted too quickly :)
I was thinking for a while about a deterministic

> For certain inputs, cPickle doesn't print the memo information that is
> used to support recursive and shared data structures. I'm not sure how
> it tells the difference, perhaps it has something to do with
> refcounts. In any case, it's an optimization of the pickle output, not
> a bug.

Caching?

>>> from cPickle import dumps
>>> dumps('0') == dumps(str(0))
True
>>> dumps('1') == dumps(str(1))
True
>>> dumps('2') == dumps(str(2))
True
........
........
>>> dumps('9') == dumps(str(9))
True
>>> dumps('10') == dumps(str(10))
False
>>> dumps('11') == dumps(str(11))
False
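
Whatever the trigger is, the memo explanation itself is easy to check:
a pickle with and without the memo entry differs as a byte string but
loads to the same value. A rough sketch, assuming a Python where
pickletools.optimize exists (2.6 and later; written here for Python 3,
where cPickle is gone and its C code sits behind plain pickle).
Protocol 0 is used because it was the default in the 2.5 session above:

```python
import pickle
import pickletools

# Pickle a string with the text protocol; the output contains a PUT
# opcode that records the string in the memo.
with_memo = pickle.dumps('1001799', protocol=0)

# pickletools.optimize removes PUT opcodes that no GET opcode refers to,
# which is the same kind of output the quoted explanation describes.
without_memo = pickletools.optimize(with_memo)

# The byte strings differ ...
print(with_memo == without_memo)

# ... but both unpickle to the same value.
print(pickle.loads(with_memo) == pickle.loads(without_memo))

# Show where the two pickles differ, opcode by opcode.
pickletools.dis(with_memo)
pickletools.dis(without_memo)
```

The first comparison should come out False and the second True. This
does not show how cPickle decides when to skip the memo entry, only
that the entry is optional for a string that is not shared.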


Daniel


