Boilerplate in rich comparison methods

Steven D'Aprano steve at REMOVE.THIS.cybersource.com.au
Sat Jan 13 14:08:03 EST 2007


On Sat, 13 Jan 2007 10:04:17 -0600, Paul McGuire wrote:

> Just a side note on writing these comparison operators.  I remember when 
> learning Java that this was really the first time I spent so much time 
> reading about testing-for-identity vs. testing-for-equality.  The Java 
> conventional practice at the time was to begin each test-for-equality method 
> by testing to see if an object were being compared against itself, and if 
> so, cut to the chase and return True (and the converse for an inequality 
> comparison).  The idea behind this was that there were ostensibly many times 
> in code where an object was being compared against itself (not so much in an 
> explicit "if x==x" but in implicit tests such as list searching and 
> filtering), and this upfront test-for-identity, being very fast, could 
> short-circuit an otherwise needless comparison.
> 
> In Python, this would look like:
> 
> class Parrot:
>     def __eq__(self, other):
>         return self is other or self.plumage() == other.plumage()

[snip]

Surely this is only worth doing if the comparison is expensive?
Testing beats intuition, so let's find out...

class Compare:
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return self.x == other.x

class CompareWithIdentity:
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return self is other or self.x == other.x

Here are the timing results without the identity test:

>>> import timeit
>>> x = Compare(1); y = Compare(1)
>>> timeit.Timer("x = x", "from __main__ import x,y").repeat()
[0.20771503448486328, 0.16396403312683105, 0.16507196426391602]
>>> timeit.Timer("x = y", "from __main__ import x,y").repeat()
[0.20918107032775879, 0.16187810897827148, 0.16351795196533203]

And with the identity test:

>>> x = CompareWithIdentity(1); y = CompareWithIdentity(1)
>>> timeit.Timer("x = x", "from __main__ import x,y").repeat()
[0.20761799812316895, 0.16907095909118652, 0.16420602798461914]
>>> timeit.Timer("x = y", "from __main__ import x,y").repeat()
[0.2090909481048584, 0.1968839168548584, 0.16479206085205078]

Anyone want to argue that this is a worthwhile optimization? :)
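
One caveat on the session above: the timed statements are "x = x" and "x = y", which are assignments, so __eq__ is never actually called and the four sets of numbers are bound to look alike. A minimal sketch of the same experiment using "==" follows. It assumes Python 3.5 or later (for the globals argument of timeit.repeat); the helper name time_eq and the iteration count are illustrative choices, not part of the original post, and no timings are shown because they depend on the machine.

```python
import timeit

class Compare:
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return self.x == other.x

class CompareWithIdentity:
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return self is other or self.x == other.x

def time_eq(cls, number=100000):
    """Return (same-object, distinct-object) best-of-3 timings in seconds.

    time_eq is a hypothetical helper written for this sketch.
    """
    ns = {"x": cls(1), "y": cls(1)}
    # "==" (not "=") so that __eq__ is what gets measured.
    same = min(timeit.repeat("x == x", globals=ns, number=number, repeat=3))
    diff = min(timeit.repeat("x == y", globals=ns, number=number, repeat=3))
    return same, diff

for cls in (Compare, CompareWithIdentity):
    same, diff = time_eq(cls)
    print(cls.__name__, "x == x:", same, "x == y:", diff)
```

With a comparison this cheap, one would expect the identity test to help slightly for "x == x" and cost slightly for "x == y", but that is something to measure rather than assume.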

> On the other hand, I haven't seen this idiom in any Python code that I've 
> read, and I wonder if this was just a coding fad of the time.
>
> Still, in cases such as Steven's Aardark class, it might be worth
> bypassing something that calls lots_of_work if you tested first to see
> if self is not other.

The comparison itself would have to be quite expensive to make it worth
the extra code.
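
To illustrate the expensive case, here is a rough sketch that counts how often the costly step runs instead of timing it. The class Expensive and its call counter are hypothetical, and lots_of_work is only a stand-in (a sort) for whatever the real class would do.

```python
class Expensive:
    calls = 0  # how many times the costly step has run

    def __init__(self, data):
        self.data = data

    def lots_of_work(self):
        # Hypothetical stand-in for a costly computation.
        Expensive.calls += 1
        return sorted(self.data)

    def __eq__(self, other):
        # Identity test first: skips lots_of_work entirely for a == a.
        if self is other:
            return True
        return self.lots_of_work() == other.lots_of_work()

a = Expensive(range(1000, 0, -1))
b = Expensive(range(1000, 0, -1))

same = (a == a)
print(same, Expensive.calls)   # True 0
equal = (a == b)
print(equal, Expensive.calls)  # True 2
```

The saving only shows up when an object really is compared against itself. Note also that list operations such as "in" and index() already test identity before falling back to __eq__, so for searching a list the explicit check buys less than it might seem.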


-- 
Steven.



