What's the cleanest way to compare 2 dictionaries?

John Henry john106henry at hotmail.com
Thu Aug 10 16:12:06 EDT 2006


John,

Yes, there are several scenarios.

a) Comparing keys only.

That's been answered (although I haven't gotten it to work under 2.3
yet)

b) Comparing records.

Now it gets more fun - as you pointed out.  I was assuming that there
is no shortcut here.  If a key exists in both sets, and I wish to
know whether the records are the same, I would have to do a
record-by-record comparison.  However, since there are only a handful
of records per key, this wouldn't be so bad.  Maybe I'll just overload
the comparison operator or something.
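
For what it's worth, here is a minimal sketch of cases (a) and (b)
together, assuming the values are lists that can be compared with a plain
== and that the keys can be sorted.  The name compare_dicts and the extra
keys 7 and 99 are made up for illustration.  It uses sorted() and
generator expressions, so it will not run as-is under 2.3.

```python
def compare_dicts(a, b):
    """Split the keys of two dicts into four sorted lists.

    Returns (only_a, only_b, same, changed):
      only_a  - keys present in a but not in b
      only_b  - keys present in b but not in a
      same    - keys in both whose values compare equal with ==
      changed - keys in both whose values differ under ==
    """
    only_a = sorted(k for k in a if k not in b)
    only_b = sorted(k for k in b if k not in a)
    same = sorted(k for k in a if k in b and a[k] == b[k])
    changed = sorted(k for k in a if k in b and a[k] != b[k])
    return only_a, only_b, same, changed


# Hypothetical sample data; key 42 holds the same records in a different order.
a = {1: ['rec1a', 'rec1b'], 42: ['rec42a', 'rec42b'], 7: ['rec7a']}
b = {1: ['rec1a', 'rec1b'], 42: ['rec42b', 'rec42a'], 99: ['rec99a']}

only_a, only_b, same, changed = compare_dicts(a, b)
print(only_a, only_b, same, changed)   # [7] [99] [1] [42]
```

Each membership test is a hash lookup, so 20000 keys with a dozen records
each should not be a problem.  Note that key 42 lands in "changed" here
because list equality is order-sensitive.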

John Machin wrote:
> John Henry wrote:
> > Hi list,
> >
> > I am sure there are many ways of doing the comparison, but I'd like to
> > see what you would do if you have 2 dictionaries (containing lots of
> > data - like 20000 keys and each key contains a dozen or so records)
> > and you want to build a list of differences between these two sets.
> >
> > I'd like to end up with 3 lists: what's in A and not in B, what's in B
> > and not in A, and of course, what's in both A and B.
> >
> > What do you think is the cleanest way to do it?  (I am sure you will
> > come up with ways that astonish me  :=) )
> >
>
> Paddy has already pointed out a necessary addition to your requirement
> definition: common keys with different values.
>
> Here's another possible addition: you say that "each key contains a
> dozen or so of records". I presume that you mean like this:
>
> a = {1: ['rec1a', 'rec1b'], 42: ['rec42a', 'rec42b']}
> # "dozen" -> 2 to save typing :-)
>
> Now what happens if the other dictionary contains:
>
> b = {1: ['rec1a', 'rec1b'], 42: ['rec42b', 'rec42a']}
>
> Key 42 would be marked as different by Paddy's classification, but the
> values are the same, just not in the same order. How do you want to
> treat that? avalue == bvalue? sorted(avalue) == sorted(bvalue)? Oh, and
> are you sure the buckets don't contain duplicates? Maybe you need
> set(avalue) == set(bvalue). What about 'rec1a' vs 'Rec1a' vs 'REC1A'?
>
> All comparisons are equal, but some comparisons are more equal than
> others :-)
> 
> Cheers,
> John
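
The choices raised in the quoted message can be written down as separate
comparison functions, which makes it easier to see where they disagree.
This is only a sketch, assuming each value is a list of strings; the
function names are made up for illustration.

```python
def same_exact(x, y):
    """Equal only if the same records appear in the same order."""
    return x == y


def same_any_order(x, y):
    """Equal if the same records appear, duplicates counted, in any order."""
    return sorted(x) == sorted(y)


def same_as_sets(x, y):
    """Equal if the same distinct records appear; duplicates are ignored."""
    return set(x) == set(y)


def same_ignoring_case(x, y):
    """Like same_any_order, but 'rec1a', 'Rec1a' and 'REC1A' all match."""
    return sorted(s.lower() for s in x) == sorted(s.lower() for s in y)


avalue = ['rec42a', 'rec42b']
bvalue = ['rec42b', 'rec42a']

print(same_exact(avalue, bvalue))                    # False
print(same_any_order(avalue, bvalue))                # True
print(same_any_order(['rec1a', 'rec1a'], ['rec1a'])) # False
print(same_as_sets(['rec1a', 'rec1a'], ['rec1a']))   # True
print(same_ignoring_case(['rec1a'], ['REC1A']))      # True
```

Whichever of these matches the data is the one to use when testing the
values of keys that exist in both dictionaries.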



