[Python-ideas] Implement comparison operators for range objects

Thu Oct 13 19:30:19 CEST 2011

On Wed, Oct 12, 2011 at 9:53 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Thu, Oct 13, 2011 at 1:18 PM, Guido van Rossum <guido at python.org> wrote:
>> FWIW, I don't think the argument from numeric comparisons carries
>> directly. The reason numeric comparisons (across int, float and
>> Decimal) ignore certain "state" of the value (like precision or type)
>> is that that's how we want our numbers to work.
>>
>> The open question so far is: How do we want our ranges to work? My
>> intuition is weak, but says: range(0) != range(1, 1) != range(1, 1, 2)
>> and range(0, 10, 2) != range(0, 11, 2); all because the arguments
>> (after filling in the defaults) are different, and those arguments can
>> come out using the start, stop, step attributes (once we implement
>> them :-).
>
> Between this and Raymond's point about slicing permitting easy and
> cheap normalisation of endpoints, I'm convinced that, if we add direct
> comparison of ranges at all, then start/stop/step comparison is the
> way to go.

Thanks. Maybe I can nudge you a little more in the direction of my
proposal by speaking about equivalence classes. A proper == function
partitions the space of all objects into equivalence classes, which
are non-overlapping sets such that all objects within one equivalence
class are equal to each other, while no two objects in different
classes are equal. (Let's leave NaN out of it for now; it does not
have a "proper" == function.) There's a nice picture on this Wikipedia
page: http://en.wikipedia.org/wiki/Equivalence_relation

A trivial collection of equivalence classes is one where each object
is in its own equivalence class. That's comparison-by-identity. It
isn't very useful because we already have another operator that does
the same partitioning. A more useful partitioning is the one which
puts all range objects with the same start/stop/step triple into the
same equivalence class. This is the one I (still) like best.
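A rough sketch of the two partitionings just described, not a proposal for the implementation. The `partition` helper is made up for illustration, and the sketch assumes range objects expose `start`, `stop` and `step` attributes (true in Python 3.3 and later).

```python
from collections import defaultdict

def partition(items, key):
    """Group items into equivalence classes: same key means same class."""
    classes = defaultdict(list)
    for item in items:
        classes[key(item)].append(item)
    return list(classes.values())

ranges = [range(0), range(1, 1), range(1, 1, 2),
          range(5), range(0, 5), range(0, 5, 1)]

# Comparison-by-identity: every object sits alone in its own class.
by_identity = partition(ranges, key=id)

# Comparison by start/stop/step (after defaults are filled in): the last
# three ranges share the triple (0, 5, 1) and land in one class.
by_triple = partition(ranges, key=lambda r: (r.start, r.stop, r.step))
```

Here `by_identity` has six classes and `by_triple` has four.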

Interestingly, the one that got the most votes so far (comparing
ranges by the sequence of values they produce) is a proper
"extension" of this one, in that equivalence according to equal
start/stop/step triples implies equivalence according to this weaker
definition. That's nice, because it means that there will probably be
many use cases where either definition suffices (such as all use cases
that only care about non-empty ranges with step==1).
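The implication can be checked by brute force over a small sample. This sketch assumes the weaker definition means "the two ranges produce the same values"; the names `eq_by_triple` and `eq_by_contents` are invented here, and `start`/`stop`/`step` attributes are assumed to exist (Python 3.3 and later).

```python
from itertools import product

def eq_by_triple(a, b):
    """Stronger definition: identical start/stop/step."""
    return (a.start, a.stop, a.step) == (b.start, b.stop, b.step)

def eq_by_contents(a, b):
    """Weaker definition (assumed): same sequence of values."""
    return list(a) == list(b)

samples = [range(start, stop, step)
           for start, stop, step in product(range(-3, 4), range(-3, 4),
                                            (-2, -1, 1, 2, 3))]
pairs = list(product(samples, repeat=2))

# Equal triples always imply equal contents ...
triple_implies_contents = all(eq_by_contents(a, b)
                              for a, b in pairs if eq_by_triple(a, b))

# ... but not the other way around: these have equal contents, unequal triples.
merged_only_by_contents = [(range(0), range(1, 1)),
                           (range(0, 9, 2), range(0, 10, 2))]

# For non-empty ranges with step == 1 the two definitions agree.
simple = [r for r in samples if r.step == 1 and len(r) > 0]
agree_on_simple = all(eq_by_triple(a, b) == eq_by_contents(a, b)
                      for a, b in product(simple, repeat=2))
```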

(Note: __hash__ needs to create equivalence classes that are proper
extensions of those created by __eq__. In terms of the Wikipedia
picture, an extension is allowed to merge some equivalence classes but
not to split them.)
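To illustrate the `__hash__` constraint, here is a hypothetical wrapper class (`TripleRange` is not a real or proposed type) whose `__eq__` uses the start/stop/step triple and whose `__hash__` is deliberately coarser, using only the length. It assumes the `start`/`stop`/`step` attributes are available.

```python
class TripleRange:
    """Hypothetical wrapper: equality by start/stop/step triple."""

    def __init__(self, *args):
        self.r = range(*args)

    def _triple(self):
        return (self.r.start, self.r.stop, self.r.step)

    def __eq__(self, other):
        if not isinstance(other, TripleRange):
            return NotImplemented
        return self._triple() == other._triple()

    def __hash__(self):
        # Coarser than __eq__: objects that compare equal always hash
        # equal, and some unequal objects collide. Merging is allowed,
        # splitting is not.
        return hash(len(self.r))

a = TripleRange(0, 10, 2)
b = TripleRange(0, 10, 2)   # same triple as a
c = TripleRange(0, 9, 2)    # same length, different triple

seen = {a}
b_found = b in seen   # True: equal objects hash alike, so lookup works
c_found = c in seen   # False: a hash collision alone does not make c equal to a
```

A hash that split an `__eq__` class (say, one that mixed in `id(self)`) would make `b in seen` fail even though `a == b`.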

BTW, I like Raymond's observation, and I agree that we should add
slicing to range(), given that it already supports indexing; and
slicing is a nice way to normalize the range. I just don't think that
the status quo is better than either of the two proposed definitions
for __eq__.
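A small sketch of slicing as normalization, assuming a Python version where range supports slicing and exposes `start`/`stop`/`step` (3.3 and later); the `triple` helper is made up for illustration. As far as I can tell this only normalizes the stop of non-empty ranges; empty ranges keep their distinct start values.

```python
def triple(r):
    """Return the (start, stop, step) arguments of a range."""
    return (r.start, r.stop, r.step)

a = range(0, 9, 2)    # 0, 2, 4, 6, 8
b = range(0, 10, 2)   # 0, 2, 4, 6, 8

# Different triples before slicing, same triple after a full slice.
same_before = triple(a) == triple(b)
same_after = triple(a[:]) == triple(b[:])

# Empty ranges are not collapsed onto one triple by slicing.
empties_same_after = triple(range(0)[:]) == triple(range(1, 1)[:])
```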

Finally, I'm still waiting for actual use cases.

>> PS. An (unrelated) oddity with range and Decimal:
>>
>> >>> range(Decimal(10))
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> TypeError: 'Decimal' object cannot be interpreted as an integer
>> >>> range(int(Decimal(10)))
>> range(0, 10)
>> >>>
>>
>> So int() knows something that range() doesn't. :-)
>
> Yeah, range() wants to keep floats far away, so it only checks
> __index__, not __int__. So Decimal gets handled the same way float
> does (i.e. not allowed directly, but permitted after explicit coercion
> to an integer).

Sorry, it all makes sense now. Please move on. Nothing to see here. :-)
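For the record, a sketch of the `__index__` versus `__int__` distinction Nick describes. The `Idx` class and the `accepted_by_range` helper are invented for illustration.

```python
import operator
from decimal import Decimal

class Idx:
    """Hypothetical integer-like type: defines __index__, so range() takes it."""
    def __init__(self, n):
        self.n = n
    def __index__(self):
        return self.n

def accepted_by_range(value):
    """Report whether range() accepts value directly as its argument."""
    try:
        range(value)
    except TypeError:
        return False
    return True

results = {
    "Decimal(10)": accepted_by_range(Decimal(10)),            # has __int__ only
    "10.0": accepted_by_range(10.0),                          # has __int__ only
    "Idx(10)": accepted_by_range(Idx(10)),                    # has __index__
    "int(Decimal(10))": accepted_by_range(int(Decimal(10))),  # explicit coercion
}

# operator.index() performs the same lossless-integer check explicitly.
as_index = operator.index(Idx(10))
```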

-- 
--Guido van Rossum (python.org/~guido)