Object Diffs

Terry Reedy tjreedy at udel.edu
Mon Aug 8 20:23:42 EDT 2011


On 8/8/2011 3:50 PM, Croepha wrote:
> Hello Python list:

Hi

> I am doing research into network-based propagation of Python
> objects. In order to maintain network efficiency, I want to just
> send the differences of Python objects. I was wondering if there
> was/is any other research or development in this area? I was thinking
> that I could look at how pickle works and implement a diff system
> there, or I could actually create a pickle of the object and then use
> difflib to compare the flattened text...

The problem cannot be tackled efficiently 'in general'. The important 
issue is to choose a representation that localizes differences in a 
particular application so that the diff is significantly smaller than 
just sending the changed version.

Source code systems regard files as sequences of lines. That is OK for
formatted source code, but not so good for prose in paragraphs, where a
small change in one line may cause re-wrapping that changes 20 lines.
Compressed representations that spread the effect of localized changes
are also bad for local diffs. A representation that arbitrarily orders
items (and re-orders items at will) is also bad for local diffs.
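The re-wrapping effect is easy to reproduce. The following is only a
rough sketch using difflib and textwrap from the standard library; the
word list, the inserted word, the 40-column width, and the helper name
changed_line_count are all made up for illustration. It counts how many
lines a line-based diff marks as changed when the same one-word edit is
applied to two representations of the same text.

```python
import difflib
import textwrap


def changed_line_count(old_lines, new_lines):
    """Count the lines a line-based diff marks as removed or added."""
    diff = difflib.ndiff(old_lines, new_lines)
    return sum(1 for line in diff if line.startswith(("- ", "+ ")))


# Hypothetical "document": 60 distinct words, plus a copy with one
# extra word inserted near the start.
words = ["word%02d" % i for i in range(60)]
edited = list(words)
edited.insert(1, "extraordinarily")

# Representation 1: one word per line, so the edit stays local.
per_word_changes = changed_line_count(words, edited)

# Representation 2: a paragraph wrapped at 40 columns, so the inserted
# word pushes every later word onto a different line.
old_wrapped = textwrap.wrap(" ".join(words), width=40)
new_wrapped = textwrap.wrap(" ".join(edited), width=40)
wrapped_changes = changed_line_count(old_wrapped, new_wrapped)

print("one word per line:", per_word_changes, "changed line(s)")
print("wrapped paragraph:", wrapped_changes, "changed line(s)")
```

The edit is identical in both cases; only the choice of representation
decides whether the diff stays small.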

I have no idea how stable and local pickles are, but I know they were
not designed for diff-ing. JSON or YAML representations might do better
if applicable.
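As a minimal sketch of the JSON idea, assuming the objects are
JSON-serializable: dumping with sort_keys=True and an indent puts each
item on its own line in a fixed order, so difflib sees a local change
as a local diff. The names stable_lines and object_diff and the sample
data are hypothetical, and applying the diff on the receiving side is
not shown.

```python
import difflib
import json


def stable_lines(obj):
    """Serialize obj as JSON with sorted keys and one item per line."""
    return json.dumps(obj, sort_keys=True, indent=1).splitlines()


def object_diff(old, new):
    """Return unified-diff lines between two JSON-serializable objects."""
    return list(
        difflib.unified_diff(
            stable_lines(old), stable_lines(new), lineterm="", n=0
        )
    )


# Made-up sample objects: same keys in a different order, and one
# changed list element.
old = {"name": "sensor-7", "readings": [1, 2, 3, 4, 5], "active": True}
new = {"active": True, "readings": [1, 2, 3, 4, 6], "name": "sensor-7"}

delta = object_diff(old, new)

# Keep only the content lines, dropping the "---"/"+++" file headers
# and the "@@" hunk markers.
changed = [
    line
    for line in delta
    if line.startswith(("+", "-"))
    and not line.startswith(("+++", "---"))
]

for line in delta:
    print(line)
```

Sorting the keys is what keeps a mere re-ordering of the dict from
showing up as a difference. Whether the resulting diff is actually
smaller than the full object still depends on the application.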

-- 
Terry Jan Reedy



