enhancement request: make py3 read/write py2 pickle format

Devin Jeanpierre jeanpierreda at gmail.com
Wed Jun 10 19:48:07 EDT 2015


On Wed, Jun 10, 2015 at 4:39 PM, Devin Jeanpierre
<jeanpierreda at gmail.com> wrote:
> On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy <tjreedy at udel.edu> wrote:
>> On 6/10/2015 6:10 PM, Devin Jeanpierre wrote:
>>
>>> The problem is that there are two different ways repr might write out
>>> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle
>>
>>
>> Not if one compares objects rather than string representations of objects.
>> I am strongly of the view that code and tests should be written to directly
>> compare objects as much as possible.
>
> For serialization formats that always output the same string for the
> same data (like text format protos), there is no practical difference
> between the two, except that if you're comparing text, you can easily
> supply a diff to update one to match the other.

Ugh, there's also the fiddly difference between what you write out and
what you read back. A serialized data structure might contain lots of
data that the deserializer ignores (as in protobuf), or it might
contain data that the deserializer can't load, or that loads with
weird / incorrect results. Being able to inspect and test the
serialized data separately from the deserialized data is useful in
that regard, so that you know where the failure lies, but it's sort of
fuzzy.

Some examples of where this crops up: pickles after you've moved a
class, JSON encoders that try to be clever and output invalid JSON,
protocol buffers with unexpected fields.
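
To make the "clever encoder" case concrete, here is a minimal sketch
using Python's standard json module (my choice of example, not
something from earlier in the thread). By default json.dumps writes a
NaN float as the bare token NaN, which the JSON grammar doesn't allow,
and json.loads reads it back without complaint. A round-trip test on
objects passes; only a check on the serialized side shows the problem.

```python
import json
import math

data = {"ratio": float("nan")}

# By default the encoder emits the bare token NaN, which is not valid JSON.
text = json.dumps(data)

# The matching decoder accepts it, so a round-trip test sees nothing wrong.
loaded = json.loads(text)
round_trip_ok = math.isnan(loaded["ratio"])

# A strict check on the serialized side exposes the problem.
try:
    json.dumps(data, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

print(text)
print(round_trip_ok, strict_ok)
```

A stricter parser in another language would likely reject that text,
which is exactly the kind of failure an object-only comparison hides.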

Overall, though, the diff thing is probably the bigger reason everyone
wants to compare serialized text in tests. If you do it
right and are principled about it, I don't see a problem with it.
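
A rough sketch of the diff workflow, assuming JSON with sorted keys as
the deterministic format (the helper name canonical is made up for
illustration; text format protos would play the same role):

```python
import difflib
import json


def canonical(obj):
    # sort_keys plus a fixed indent gives the same text for equal data,
    # regardless of dict insertion order.
    return json.dumps(obj, sort_keys=True, indent=2)


expected = canonical({"a": 1, "b": 2})
actual = canonical({"b": 3, "a": 1})

# A unified diff shows exactly which serialized lines changed.
diff = list(difflib.unified_diff(
    expected.splitlines(),
    actual.splitlines(),
    fromfile="expected",
    tofile="actual",
    lineterm="",
))
print("\n".join(diff))
```

Because equal data always serializes to the same text here, a failing
test can print the diff and the expected output can be updated from it.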

-- Devin


