Interesting list Validity (True/False)

Steven D'Aprano steve at REMOVE.THIS.cybersource.com.au
Sun May 13 09:24:45 EDT 2007


On Sat, 12 May 2007 21:50:12 -0700, mensanator at aol.com wrote:

>> > Actually, it's this statement that's non-sensical.
>>
>> > <quote>
>> > "if arg==True" tests whether the object known as arg is equal to the
>> > object known as True.
>> > </quote>
>>
>> Not at all, it makes perfect sense. X == Y always tests whether the
>> argument X is equal to the object Y regardless of what X and Y are.
> 
> Except for the exceptions, that's why the statement is wrong.

But there are no exceptions. X == Y tests for equality. If it returns
True, then the objects are equal by definition. That's what equal means in
Python.

One can abuse the technology to give nonsensical results:

class EqualToEverything(object):
    def __eq__(self, other):
        return True

>>> x = EqualToEverything()
>>> x == 1.0
True
>>> x == [2.9, "hello world"]
True

but that's no different from any language that allows you to override
operators.



>> > None of these four examples are "equal" to any other.
>>
>> That's actually wrong, as you show further down.
> 
> No, it's not, as I show further down.

But you show no such thing.

Or, to put it another way:

Did! Did not! Did! Did not! Did! Did not! ...


>> >>>> a = 1
>> >>>> b = (1,)
>> >>>> c = [1]
>> >>>> d = gmpy.mpz(1)

[snip]

>> >>>> a==d
>> > True
>>
>> See, a and d are equal.
> 
> No, they are not "equal". 

Of course they are. It says so right there: "a equals d" is true.


> Ints and mpzs should NEVER
> be used together in loops, even though it's legal.

Why ever not? If you need an mpz value in order to do something, and no
other data type will do, what would you suggest? Just give up and say
"Don't do this, because it is Bad, m'kay?"

> The ints
> ALWAYS have to be coerced to mpzs to perform arithmetic
> and this takes time...LOTS of it.

Really? Just how much time?

import timeit
timeit.Timer("x == y", "import gmpy; x = 1; y = gmpy.mpz(1)").repeat()
timeit.Timer("x == y", "x = 1; y = 1").repeat()

I don't have gmpy installed here, so I can't time it, but I look forward
to seeing the results, if you would be so kind.
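
In the meantime, here is a rough stand-in that needs only the standard
library. It is a sketch, not a gmpy benchmark: fractions.Fraction plays the
role of the "other" numeric type, on the assumption that a mixed int/Fraction
comparison costs extra work in roughly the way a mixed int/mpz comparison
would. The helper name best_time is made up for this example.

```python
import timeit

def best_time(stmt, setup, number=100000):
    """Return the fastest of three timing runs, in seconds (hypothetical helper)."""
    return min(timeit.Timer(stmt, setup).repeat(repeat=3, number=number))

# Same-type comparison: int against int.
same = best_time("x == y", "x = 1; y = 1")

# Mixed-type comparison: int against Fraction, standing in for int against mpz.
mixed = best_time("x == y", "from fractions import Fraction; x = 1; y = Fraction(1)")

print("int == int:      %.6f s" % same)
print("int == Fraction: %.6f s" % mixed)
```

Whatever the two timings turn out to be on a given machine, both comparisons
still return True.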

Even if it is terribly slow, that's just an implementation detail. What
happens when Python 2.7 comes out (or Python 3.0 or Python 99.78) and
coercion from int to mpz is lightning fast? Would you then say "Well,
int(1) and mpz(1) used to be unequal, but now they are equal"?

Me, I'd say they always were equal, but previously it used to be slow to
coerce one to the other.


> The absolute stupidest
> thing you can do (assuming n is an mpz) is:
> 
> while n >1:
>     if n % 2 == 0:
>         n = n/2
>     else:
>         n = 3*n + 1

Oh, I can think of much stupider things to do.

while len([math.sin(random.random()) for i in range(n)[:]][:]) > 1:
    if len( "+" * \
    int(len([math.cos(time.time()) for i in \
    range(1000, n+1000)[:]][:])/2.0)) == 0:
        n = len([math.pi**100/i for i in range(n) if i % 2 == 1][:])
    else:
        s = '+'
        for i in range(n - 1):
            s += '+'
        s += s[:] + ''.join(reversed(s[:]))
        s += s[:].replace('+', '-')[0:1]
        n = s[:].count('+') + s[:].count('-')



> You should ALWAYS do:
> 
> ZED = gmpy.mpz(0)
> ONE = gmpy.mpz(1)
> TWO = gmpy.mpz(2)
> TWE = gmpy.mpz(3)
> 
> while n >ONE:
>     if n % TWO == ZED:
>         n = n/TWO
>     else:
>         n = TWE*n + ONE
> 
> This way, no coercion is performed.

I know that algorithm, but I don't remember what it is called...

In any case, what you describe is a local optimization. It's probably a
good optimization, but in no way, shape or form does it imply that mpz(1)
is not equal to 1.


>> > And yet a==d returns True. So why doesn't b==c
>> > also return True, they both have a 1 at index position 0?
>>
>> Why should they return true just because the contents are the same?
> 
> Why should the int 1 return True when compared to mpz(1)?

Because they both represent the same mathematical number, whereas a list
containing 1 and a tuple containing 1 are different containers. Even if
the contents are the same, lists aren't equal to tuples.
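
A quick check of that rule, using only built-in and standard-library types.
This is an illustrative sketch run against a current Python; Fraction and
Decimal are additions for the sake of the example and do not appear in the
earlier session.

```python
from decimal import Decimal
from fractions import Fraction

# Different numeric types, same mathematical number: all compare equal.
assert 1 == 1.0
assert 1 == 1 + 0j
assert 1 == Fraction(1, 1)
assert 1 == Decimal(1)

# Different container types, same contents: not equal.
assert [1] != (1,)

# Same container type, same contents: equal.
assert [1] == [1]
assert (1,) == (1,)
```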


> a = [1]
> b = [1]
> 
> returns True for a==b?

That's because both are the same kind of container, and they both have the
same contents.


> After all, it returns false if b is [2],
> so it looks at the content in this case. So for numerics,
> it's the value that matters, not the type. And this creates
> a false sense of "equality" when a==d returns True.

There's nothing false about it. Ask any mathematician whether 1 equals 1.0,
and they will say "of course".


>> A bag
>> of shoes is not the same as a box of shoes, even if they are the same
>> shoes.
> 
> Exactly. For the very reason I show above. The fact that the int
> has the same shoes as the mpz doesn't mean the int should be
> used, it has to be coerced.

Ints are not containers. An int doesn't contain values, an int is the
value.

Numeric values are automatically coerced because that's more practical.
That's a design decision, and it works well.

As for gmpy.mpz, since equality tests are completely under the control of
the class author, the gmpy authors obviously wanted mpz values to compare
equal with ints. 
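
To make that concrete, here is a minimal sketch of how a numeric class can
opt in to comparing equal with ints. BigNum is a hypothetical toy class
invented for this example; it is not how gmpy is implemented, and it assumes
a current Python.

```python
class BigNum(object):
    """Toy numeric wrapper (hypothetical; not gmpy's implementation)."""

    def __init__(self, value):
        self.value = int(value)

    def __eq__(self, other):
        # Compare by numeric value, whether other is a BigNum or a plain number.
        if isinstance(other, BigNum):
            return self.value == other.value
        if isinstance(other, (int, float, complex)):
            return self.value == other
        # Let Python try the other operand's __eq__ or fall back to identity.
        return NotImplemented

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal objects must hash equal, so hash like the underlying int.
        return hash(self.value)


assert BigNum(1) == 1
assert 1 == BigNum(1)      # int.__eq__ gives up, so Python tries BigNum.__eq__
assert BigNum(1) == 1.0
assert BigNum(1) != 2
assert BigNum(1) != [1]    # not a number, so not equal
```

The class author decides which other types count as "the same kind of
thing"; here numbers do and lists do not.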



>> Since both lists and tuples are containers, neither are strings or
>> numeric types, so the earlier rule applies: they are different types, so
>> they can't be equal.
> 
> But you can't trust a==d returning True to mean a and d are
> "equal". 

What does it mean then?


> To say the comparison means the two objects are
> equal is misleading, in other words, wrong. It only takes one
> turd to spoil the whole punchbowl.
> 
>>
>> gmpy.mpz(1) on the other hand, is both a numeric type and a custom class.
>> It is free to define equal any way that makes sense, and it treats itself
>> as a numeric type and therefore says that it is equal to 1, just like 1.0
>> and 1+0j are equal to 1.
> 
> They are equal in the mathematical sense, but not otherwise.

Since they are mathematical values, what other sense is meaningful?

> And to think that makes no difference is to be naive.

I never said that there were no efficiency differences. Comparing X with Y
might take 0.02ms or it could take 2ms depending on how much work needs
to be done. I just don't understand why you think that has a bearing on
whether they are equal or not.


-- 
Steven.



