[Patches] [ python-Patches-421922 ] Implement rich comparison for dicts

Mon, 07 May 2001 14:44:53 -0700

Patches item #421922, was updated on 2001-05-06 20:10
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421922&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Tim Peters (tim_one)
>Assigned to: Tim Peters (tim_one)
Summary: Implement rich comparison for dicts

Initial Comment:
Dict comparison currently requires that the keys and 
values enjoy a total ordering, but dicts only require 
Py_EQ in other contexts.  The patch changes dict 
compares to rely on just Py_EQ too, and replaces the 
cmp()-like calls with comparison of raw object 
addresses.  Besides yielding a huge speedup on an 
artificial test case <wink>, it stops surprises like 
these:

>>> {1: 1j} == {1: 2j}
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, 
>, >=
>>> {1j: 0} == {2j: 0}
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, 
>, >=
>>>

After the patch:

>>> {1: 1j} == {1: 2j}
0
>>> {1j: 0} == {2j: 0}
0
>>> cmp({1: 1j}, {1: 1j})
0
>>> cmp({1j: 0}, {1j: 0})
0
>>>

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-05-07 14:44

Message:
Logged In: YES 
user_id=6380

Looks cool to me.  Go for it.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-07 13:42

Message:
Logged In: YES 
user_id=31435

Back to Guido.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-07 13:40

Message:
Logged In: YES 
user_id=31435

New patch attached.  As noted in the comment two back, this 
gives up on trying to embed Py_EQ in a total ordering.  
Instead the focii are on making == and != comparison work 
for dicts regardless of whether the keys and values are 
totally ordered, and on speeding == and != dict comparisons 
in any case.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-07 11:27

Message:
Logged In: YES 
user_id=31435

Assigned back to me.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-07 11:26

Message:
Logged In: YES 
user_id=31435

Ya, this patch is dead.

WRT A<B and C>D while A==C and B==D, yes, I'm afraid that's 
possible, and that sucks bigtime.  The underlying problem 
is that I see no practical way to extend an equality 
relation (given by Py_EQ) to a total ordering.  For 
example, if "==" is determined by Py_EQ, and "<" by address 
comparison, then (forget dicts here for the moment) we can 
construct A, B1 and B2 s.t.

A < B1 == B2 < A

Then using those as keys and/or values in dicts can lead to 
inconsistent dict comparisons too (with this patch applied).

I had three use cases in mind:

1. Speed (in)equality testing.  A note Greg Wilson sent 
earlier this year reminded me that this is an important 
operation for sets (which he's implementing on top of 
dicts).

2. Ensure that (in)equality testing of dicts doesn't blow 
up just because the keys and/or values support only a 
partial ordering, or only (in)equality testing (complex 
numbers are just an extreme case of this).

3. Support sorting a list of dicts without blowing up, 
again if the keys and/or values support only a partial 
ordering.

I've given up on #3.  #1 and #2 could be accomplished by 
implementing the tp_richcompare slot for dicts instead:  if 
I know up-front that "equal: yes or no?" is the only 
outcome of interest, then a different, simpler and faster 
algorithm using only Py_EQ on keys and values would suffice.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-05-07 06:17

Message:
Logged In: YES 
user_id=6380

I have one question about this.  Will the comparison between
dicts using cmp() or <, <=, >, >= also depend on the raw
object address?  That would mean that (even in one program)
there could be two pairs of dictionaries (A,B) and (C,D)
where A<B and C>D while A==C and B==D.  I'm not sure I like
that.

What's the real use case where this matters?  (Either the
speed of == comparisons or the ability to ==-compare dicts
containing complex numbers.)

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421922&group_id=5470