question about True values

Steven D'Aprano steve at REMOVE.THIS.cybersource.com.au
Sun Oct 29 07:22:59 EST 2006


On Sun, 29 Oct 2006 00:31:23 -0700, Carl Banks wrote:

> Steven D'Aprano wrote:
>> Carl Banks:
>> > Overall, your objections don't really apply, since you're arguing what
>> > ought to be whereas my argument is pragmatic.  Practically speaking, in
>> > realistic situations, "if len(a)>0" will work for a wider range of types
>> > than "if a:".
>>
>> Well, that's a quantitative claim you're making there. Have you
>> actually gone through, say, the built in types and checked how many
>> have a length versus how many work in a truth-context?
> 
> No, and it's irrelevant to my argument.

And yet you not only made the claim in the first place, but you also spent
a good half or three quarters of your response justifying it. You
obviously didn't think it was irrelevant when you first made the claim,
and you clearly don't think it is irrelevant now that you've spent
time checking an arbitrary set of types. Either way, you're still wrong.

"if a:" will work with EVERY TYPE except those few (the only one?) like
numpy arrays which deliberately raise an exception because they don't make
sense in a Boolean context. (And even then, arguably numpy is doing the
wrong thing: the Pythonic behaviour would be for numpy arrays to all
be True.)

"if len(a)>0:" can only work with types that have lengths. It is a logical
necessity that the number of types that have lengths must be no bigger
than the number of types in total, numpy arrays notwithstanding.
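
To make that concrete (plain Python, nothing beyond the builtins):

for a in [[1, 2], "hello", {"x": 1}, 42, 3.5, None]:
    if a:
        print("%r is true" % (a,))
    else:
        print("%r is false" % (a,))

try:
    if len(42) > 0:
        print("never reached")
except TypeError:
    print("len() of an int raises TypeError")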



> For some reason, people seem to think it's absolutely wonderful that
> you can write "if X:" somewhere, and that this "works" whether X is a
> list or an int or any other object, as if "working" for both ints and
> lists was actually useful.
> 
> Well, it's not.
> 
> You see, presumably you have to do something with X.  And in realistic
> code, there's not a lot of stuff you can do with X that works for both
> container types like lists and dicts, and atomic types like ints and
> floats.  

Who cares whether or not your function accepts mixed data types like lists
and floats? Why do you think it matters whether one and the same function
has to operate on both sequences and atomic types?

What *does* matter is that, regardless of whether you are operating on
sequences, mappings, or numeric types, the same syntax works.

That's why (for example) Python allows + to operate on lists, or strings,
or floats -- but that doesn't imply that you can add a list to a float,
or that the same code will operate happily on both lists and floats. You,
the developer, don't need to care what data types you are adding, you just
use the same syntax: a+b. Each object knows what addition means for
itself. In practice, repeated string addition can be inefficient enough
that you might wish to avoid it, but as a general principle, Python
idioms and syntax are as type-independent as possible.
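
For instance (a minimal sketch, using nothing but builtins):

print([1, 2] + [3, 4])     # list concatenation: [1, 2, 3, 4]
print("ab" + "cd")         # string concatenation: abcd
print(1.5 + 2.5)           # float addition: 4.0

try:
    [1, 2] + 1.5
except TypeError:
    print("adding a list to a float raises TypeError")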

That's why obj[index] works whether you are getting or setting or deleting
an item, whether it is a sequence or a mapping or a custom class. As much
as possible, you don't need to know what the object is to know what syntax
to use. The idiom for item-access should always be obj[index], not
obj[index] for some types, obj.get(index) for some others, and
getitem(obj, index) for the rest.
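
A toy example (the class name and behaviour are invented purely for
illustration): implement the item-access hooks and the standard
obj[index] syntax just works:

class Counter(object):
    # Invented mapping-like class: missing keys default to 0.
    def __init__(self):
        self._data = {}
    def __getitem__(self, key):
        return self._data.get(key, 0)
    def __setitem__(self, key, value):
        self._data[key] = value
    def __delitem__(self, key):
        del self._data[key]

c = Counter()
c["spam"] = 3        # calls __setitem__
print(c["spam"])     # calls __getitem__ -> 3
del c["spam"]        # calls __delitem__
print(c["spam"])     # back to the default -> 0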

(Custom classes are, of course, free to break these guidelines, just as
the numpy developers were free to have __nonzero__ raise an exception. One
hopes they have good reasons for doing so.)

And that's why Python not only allows but prefers "if any_object_at_all:"
over type-dependent tricks. The *object itself* is supposed to know
whether it is equivalent to true or false (or, in rare cases like numpy
arrays, raise an exception if truth-testing doesn't mean anything for the
type). You, the developer, are not supposed to spend your time wondering
how to recognise whether the object is equivalent to false; you just ask
the object.
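
A sketch of what I mean (Inventory is an invented name; __nonzero__ is
the Python 2 truth hook, spelled __bool__ in Python 3):

class Inventory(object):
    # Invented example: the object itself decides that it is true
    # if and only if it holds any stock at all.
    def __init__(self, stock):
        self.stock = stock
    def __nonzero__(self):
        return sum(self.stock.values()) > 0
    __bool__ = __nonzero__   # same hook under its Python 3 name

box = Inventory({"spam": 0, "eggs": 2})
if box:                      # just ask the object
    print("we have stock")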

If some well-meaning person argued that the correct way to test for a
numeric value of zero/non-zero was like this:

if x*2 != x:
    # x must be non-zero

you'd fall down laughing (at least, I hope you'd fall down laughing). Yes,
such a test works, but you'd be crazy to do so much unnecessary work if all
you want to know is whether x is nonzero. (And beware of floating point
rounding: what if x is tiny?)

And then someone passes a class that implements infinity or the alephs to
this function (which could be as simple as an INF float on a platform that
supports the IEEE 754 standard) and the code wrongly decides that INF is
zero. Oops.
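
You don't even need a custom class to see the failure: an ordinary IEEE
infinity already defeats the test (assuming your platform accepts
float('inf')):

inf = float("inf")
print(inf * 2 != inf)   # False: doubling infinity gives infinity back,
                        # so the x*2 != x test wrongly calls INF zero
print(inf == 0)         # False: it certainly isn't zero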

"if x == 0:" is better, but still not good enough, because you're
still making assumptions about the data type. What if it is a numeric type
that works with interval arithmetic? You don't know what the right way to
test for a False interval is -- but the interval class itself will know.
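
Something like this hypothetical sketch (the Interval class and its
truth rule are invented for illustration; a real interval type might
well choose different semantics):

class Interval(object):
    # Invented type: one plausible rule is that an interval is
    # false only if it is exactly [0, 0].
    def __init__(self, low, high):
        self.low, self.high = low, high
    def __nonzero__(self):               # Python 2 truth hook
        return self.low != 0 or self.high != 0
    __bool__ = __nonzero__               # the Python 3 spelling

x = Interval(-0.1, 0.1)
if x:                      # the interval decides what truth means
    print("not exactly the zero interval")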

And that's the point -- you're making unnecessary assumptions about the
data, *and* doing unnecessary calculations based on those assumptions. At
best, you're wasting your time. At worst, you're introducing bugs.

It isn't often that I make an appeal to authority, but this is one of
those times. No offense, but when it comes to language design it's a brave
or foolish programmer who bucks the language idioms that Guido chose.



-- 
Steven.



