Bools and explicitness [was Re: PyWart: The problem with "print"]

Tue Jun 4 11:44:11 EDT 2013

On Tuesday, June 4, 2013 12:39:59 AM UTC-5, Steven D'Aprano wrote:
> On Mon, 03 Jun 2013 18:37:24 -0700, Rick Johnson wrote:

> Consider a simple thought experiment. Suppose we start with a sequence of 
> if statements that begin simple and get more complicated:
> if a == 1: ...
> if a == 1 and b > 2*c: ...
> if a == 1 and b > 2*c or d%4 == 1: ...
> if a == 1 and b > 2*c or d%4 == 1 and not (d**3//7)%3 == 0: ...
> I don't believe that any of these tests are improved by adding an 
> extraneous "== True" at the end:
> if (a == 1) == True: ...
> if (a == 1 and b > 2*c) == True: ...
> if (a == 1 and b > 2*c or d%4 == 1) == True: ...
> if (a == 1 and b > 2*c or d%4 == 1 and not (d**3//7)%3 == 0) == True: ...

And i agree!

You are misunderstanding my very valid point. Post-fixing a
"== True" when truth testing a *real* Boolean (psst: that's
a True or False object) is superfluous. I'm referring to
truth testing non-Boolean values. So with that in mind, the
following is acceptably "explicit enough" for me:

    a = True
    if a:
        do_something()

However, since Python allows implicit conversion to Boolean
for ALL types, unless we know for sure, beyond any
reasonable doubt, that the variable we are truth testing is
pointing to a True or False object, we are taking too many
chances and will eventually create subtle bugs.

    a = " "
    if a:
        do_something()

When i write code that "truth tests", i expect that the
value i'm testing is a True or False object, not an empty
list that *magically* converts to False when i place an "if"
in front of it, or a list with more members that magically
converts to True when i place an "if" in front of it.
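To make the implicit conversion concrete, here is a small
sketch that asks bool() what each value would become under
an "if". The sample values are arbitrary illustrations, not
taken from any real program.

```python
# None of these values is a Boolean, yet "if value:" accepts all of them.
samples = [" ", "", [], [0], 0, 0.0, None, {}, {"k": None}]

# Record what "if value:" would silently convert each sample to.
results = [(repr(value), bool(value)) for value in samples]

for text, truth in results:
    print(f"{text:>12} -> {truth}")
```

Note that a string holding a single space and a list holding
a single zero both count as true, which is exactly the kind
of result that is easy to overlook.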

This implicit conversion seems like a good idea at first,
and i was caught up in the hype myself for some time: "Hey,
i can save a few keystrokes, AWESOME!". However, i can tell
you with certainty that this implicit conversion is folly.
It is my firm belief that truth testing a value that is not
a Boolean should raise an exception. If you want to convert
a type to Boolean then pass it to the bool function:

    lst = [1,2,3]
    if bool(lst):
        do_something()

This would be "explicit enough".
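Python itself will not raise here, so the following is only
a sketch of the behaviour being proposed. The helper name
strict_truth is hypothetical, invented for this example.

```python
def strict_truth(value):
    """Return value unchanged if it is a real bool; otherwise raise.

    Hypothetical helper: Python has no such built-in.
    """
    if isinstance(value, bool):
        return value
    raise TypeError(
        f"expected True or False, got {type(value).__name__}: {value!r}"
    )


lst = [1, 2, 3]

# Explicit conversion passes, because bool() hands back a real Boolean.
if strict_truth(bool(lst)):
    print("list has members")

# Truth testing the list directly is rejected.
try:
    strict_truth(lst)
except TypeError as exc:
    print("rejected:", exc)
```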

> If you are unfamiliar with Python, then you have to learn what the 
> semantics of "if lst" means. Just as you would have to learn what 
> "if len(lst) > 0" means.

Again, i understand the folly of "implicit Boolean
conversion" just fine.

> > I prefer to be explicit at the cost of a few keystrokes:
> >   if len(lst) > 0:
> This line of code is problematic, for various reasons:
> - you're making assumptions about the object which are unnecessary;
> - which breaks duck-typing;
> - and risks doing too much work, or failing altogether.
> You're looking up the length of the lst object, but you don't really care 
> about the length. 

Yes i do care about the length or i would not have asked.
I'm asking Python to tell me if the iterable has members,
and if it does, i want to execute a block of code; if it
does not, i want to do nothing. But i'm also informing the
reader of my source code that the symbol i am truth testing
is expected to be an iterable with a __len__ method.

"if lst" does not give me the same answer (or imply the same
meaning to a reader). It merely tells me that the implicit
conversion has resulted in a True value. But what if the lst
symbol is pointing to a string? Then i will falsely believe
i have a list with members when i actually have a string
with length greater than zero.
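A sketch of that scenario follows. The function add_member
is hypothetical and exists only to show where the failure
surfaces.

```python
def add_member(lst, item):
    # The truth test passes for ANY non-empty object, not just lists.
    if lst:
        lst.append(item)  # only list-like objects support this
    return lst


print(add_member([1, 2], 3))  # a real list works as intended

try:
    add_member("12", 3)  # a non-empty string sails through "if lst"
except AttributeError as exc:
    print("failed after the truth test:", exc)
```

The string is only caught at the append call, after the
truth test has already waved it through.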

> You only care about whether there is something there or 
> not, whether lst is empty or not. It makes no difference whether lst 
> contains one item or one hundred million items, and yet you're asking to 
> count them all. Only to throw that count away immediately!

I agree. Counting the list members just to guarantee that the
iterable has members is foolish; however, Python gives me no
other choice IF i want to be "explicit enough". In a
properly designed language, the base iterable object would
supply a "hasLength" or "hasMembers" method that would
perform a much faster check, along the lines of:

    def hasMembers(iterable):
        try:
            iterable[0]
        except IndexError:
            return False
        else:
            return True

That check would guarantee the iterable contained at least
one member without counting them all.
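The indexing check above only works for sequences; a dict
or a set would raise KeyError or TypeError instead of
IndexError. As a sketch of a more general version (an
assumption for illustration, not an existing Python method),
the hypothetical has_any_member below asks the iterator for
a single item instead of indexing.

```python
_MISSING = object()  # sentinel meaning "the iterator produced nothing"


def has_any_member(iterable):
    """Return True if iterable yields at least one item.

    Hypothetical helper. Only the first item is requested, so nothing
    is counted. Caution: on a one-shot iterator (such as a generator)
    this consumes the first item.
    """
    return next(iter(iterable), _MISSING) is not _MISSING


print(has_any_member([1, 2, 3]))
print(has_any_member([]))
print(has_any_member({"a": 1}))
print(has_any_member(""))
```

An object that is not iterable at all, such as an integer,
makes iter() raise TypeError, so a wrong type still fails
loudly.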

> Looking at the length of a built-in list is cheap, but why assume it is a 
> built-in list? Perhaps it is a linked list where counting the items 
> requires a slow O(N) traversal of the entire list. Or some kind of lazy 
> sequence that has no way of counting the items remaining, but knows 
> whether it is exhausted or not.

Yes, but the problem is not "my approach", rather the lack
of proper language design (my apologies to the "anointed
one" ;-)

> The Python way is to duck-type, and to let the lst object decide for 
> itself whether it's empty or not:
> if lst: ...
> not to make assumptions about the specific type and performance of the 
> object.

Well Steven, in the real world sometimes you have no other
choice. I don't have time to read and comprehend thousands
of lines of code just to use a simple interface. We are all
aware that:

  "Look Before You Leap"

is always a slower method than: 

  "It's Easier to Ask Forgiveness Than Permission"

When i am writing code i prefer to be "explicit enough" so
that IF my assumptions about the exact type of an object are
incorrect, the code will fail quickly enough that i can
easily find and correct the problem. In this manner i can
develop code much faster because i do not need to understand
the minutiae of an API in order to wield it. On the contrary,
Implicit Conversion to Boolean is a bug-producing nightmare
that requires too much attention to minutiae.
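A small sketch of what failing quickly looks like. The
value 42 stands in for a symbol that was accidentally
rebound to the wrong type, and both function names are made
up for the example.

```python
def implicit_check(value):
    # Accepts anything: an int is simply "truthy", so the mistake is hidden.
    return "has members" if value else "empty"


def explicit_check(value):
    # len() raises TypeError at once for objects that have no length.
    return "has members" if len(value) > 0 else "empty"


wrong_type = 42  # supposed to be a list

print(implicit_check(wrong_type))  # silently accepted

try:
    explicit_check(wrong_type)
except TypeError as exc:
    print("caught early:", exc)
```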

> > Consider the following:
> >  What if the symbol `value` is expected to be a list, however, somehow
> >  it accidentally got reassigned to another type. If i choose to be
> >  implicit and use: "if value", the code could silently work for a type i
> >  did not intend, therefore the program could go on for quite some time
> >  before failing suddenly on attribute error, or whatever.
> `if len(lst) > 0` also works for types you don't intend. Any type that 
> defines a __len__ method which returns an integer will do it.
> Tuples, sets and dicts are just the most obvious examples of things that 
> support len() but do not necessarily support all the things you might 
> wish to do to a list.

Agreed. 

The "if len(var) > 0" test will succeed for ANY object that
defines a __len__ method. This test is fine if you want to
test iterables generically; however, if you want to be
specific about testing lists you cannot rely on that code,
because strings and all other iterables would return the
same "truthy" or "falsey" value. But how do we solve this
issue? I don't want to see this:

    if isinstance(var, list) and len(var) > 0:
        do_something()

But we are really ignoring the elephant in the room. Implicit
conversion to Boolean is just a drop in the bucket compared
to the constant "shell game" we are subjected to when
reading source code. We so naively believe that a symbol
named "lst" is a list object or a symbol "age" is an integer,
when we could be totally wrong! This is the source of many
subtle bugs!!!

There must be some method by which we can truth test an
iterable object and verify it has members, but do so in a
manner that is valid for all types AND exposes the "expected
type" in the method name. hmm...

Adding a method like "is_valid" to every object can seem
logical, however, this can fail just as miserably as
Python's current implicit bool. And, more disastrously, an
"is_valid" method is not going to raise an error (where it
should) because it works for all types.

What we need is a method by which we can validate a symbol
and simultaneously do the validation in a manner that will
cast light on the type that is expected. In order for this
to work, you would need validators with unique "type names":
    if var.is_validList(): ...
    elif var.is_validString(): ...
    elif var.is_validTuple(): ...
    elif var.is_validInteger(): ...
    elif var.is_validFloat(): ...
    elif var.is_validDict(): ...
    # etc...

In this manner, we can roll three common tests into one
method:

    * boolean conversion
    * member truthiness for iterables
    * type checking

But most importantly, we destroy implicitness and can be
"explicit enough", but not so explicit that our fingers
hurt.
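Built-in types cannot be given new methods, so this sketch
uses free functions instead of var.is_validList(). The names
is_valid_list and is_valid_string are hypothetical, and
treating "valid" as "correct type and non-empty" is an
assumption made for the example.

```python
def is_valid_list(var):
    # Type check and member check rolled into one real Boolean result.
    return isinstance(var, list) and len(var) > 0


def is_valid_string(var):
    # Same idea for strings: right type AND at least one character.
    return isinstance(var, str) and len(var) > 0


def describe(var):
    # The validator name tells the reader which type is expected.
    if is_valid_list(var):
        return "non-empty list"
    elif is_valid_string(var):
        return "non-empty string"
    return "something else, or empty"


print(describe([1, 2, 3]))
print(describe("abc"))
print(describe([]))
print(describe(42))
```

The isinstance-plus-len test is still there, but it lives in
one place behind a name that states the expected type.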

*school-bell-rings*

PS: Damn i'm good! I believe the BDFL owes me a Thank You
email for this gold i just dropped on the Python community.
Flattery is welcome. Pucker up!