Coding style

Tue Jul 18 13:06:09 EDT 2006

Bruno Desthuilliers <onurb at xiludom.gro> wrote:
> Carl Banks wrote:
>> Bruno Desthuilliers wrote:
>> 
>> I'm well aware of Python's semantics, and it's irrelevant to my
>> argument.
[...]
>> If the language
>> were designed differently, then the rules would be different.
>
> Totally true - and totally irrelevant IMHO.

I strongly advise against treating each other's thoughts as irrelevant.
Assuming the opposite is the basis of every public discussion forum.

I think there is a flaw in Python here. To explain this, I'd like to
make Bruno's point clearer. As usual, code says more than a
thousand words (and vice versa :-)).

Suppose you have two functions which somehow depend on the emptiness
of a sequence. This is a stupid example, but it demonstrates at
least the two proposed programming styles:

------------------------------------------------------
>>> def test1(x): 
...     if x:
...             print "Non-Empty"
...     else:
...             print "Empty"
... 
>>> def test2(x):
...     if len(x) > 0:
...             print "Non-Empty"
...     else:
...             print "Empty"
------------------------------------------------------

Bruno pointed out a subtle difference in the behaviour of those
functions:

------------------------------------------------------
>>> a = []     
>>> test1(a)
Empty
>>> test1(iter(a))
Non-Empty
>>> test2(a)
Empty
>>> test2(iter(a))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in test2
TypeError: len() of unsized object
------------------------------------------------------

While test1() silently gives a wrong answer when called with an
iterator (it reports "Non-Empty" although the iterator yields
nothing), test2() fails loudly when called incorrectly.

So if you accidentally call test1() with an iterator, the program
will do something unintended, and the source of that bug will be
hard to find. So Bruno is IMHO right in calling that the source
of a subtle bug.
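
A minimal sketch of how such a bug can hide. The function describe()
is made up for illustration; its truth test is meant to say "has
elements", which only holds for real sequences:

```python
def describe(items):
    # Hypothetical helper: "if items" is intended to mean "not empty".
    if items:
        # An iterator is always true, so an empty one ends up here too.
        return "processing %d item(s)" % len(list(items))
    return "nothing to do"


print(describe([]))        # an empty list takes the "nothing to do" branch
print(describe(iter([])))  # an empty iterator takes the "processing" branch
```

No exception is raised in either call; the second one just quietly
takes the wrong branch.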

However, if you call test2() with an iterator, the program will
cleanly break early enough with an exception. That is generally
wanted in Python. You can see this all over the language, e.g.
with dictionaries:

------------------------------------------------------
>>> d = { 'one': 1 }
>>> print d['one']
1
>>> print d['two']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'two'
------------------------------------------------------

Python could have been designed to return None when d['two'] is
evaluated, as some other (bad) programming languages would. The
problem would then surface later in the program, making it easy
to produce a subtle bug. It would take some effort to figure out the
real cause, i.e. that d had no entry for 'two'.

Luckily, Python raises an exception (KeyError) right at the
place where the initial mistake occurred. If you *want* to get None in
case of a missing key, you'll have to say so explicitly:

------------------------------------------------------
>>> print d.get('two', None)
None
------------------------------------------------------

So maybe "bool()" should also break with an exception if an object
has neither a __nonzero__ nor a __len__ method, instead of defaulting
to True. Or a stricter variant of bool() called nonempty() should
exist.
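
A rough sketch of what such a nonempty() could look like. This is
only my assumption of the idea, not an existing Python function. It
looks for the truth hook under both its names (__nonzero__, and
__bool__ as newer Python versions call it) and for __len__, and it
only considers the object's type:

```python
def nonempty(obj):
    """Strict truth test (hypothetical helper, not part of Python).

    Raises TypeError instead of defaulting to True when the type of
    obj defines neither a truth hook nor __len__.
    """
    cls = type(obj)
    # __nonzero__ is the older name of the hook, __bool__ the newer one.
    for hook in ("__bool__", "__nonzero__", "__len__"):
        if getattr(cls, hook, None) is not None:
            return bool(obj)
    raise TypeError("%s object has no notion of emptiness" % cls.__name__)


print(nonempty([]))       # lists define __len__, so this is a normal test
try:
    nonempty(iter([]))    # list iterators define none of the hooks
except TypeError as exc:
    print("TypeError: %s" % exc)
```

With such a function, test1() could be written with nonempty(x) and
would break on an iterator just like test2() does.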

Iterators don't have a meaningful Boolean representation, because
phrases like "is zero" or "is empty" don't make sense for them. So
instead of always answering "true", an iterator should raise an
exception when being asked whether it is empty.

If a function expects an object to have a certain protocol (e.g.
sequence), and the given object doesn't support that protocol,
an exception should be raised. This usually happens automatically
when the function calls a nonexistent method, and it plays very
well with duck typing.

test2() behaves that way, but test1() doesn't. The reason is a
sloppiness in Python. Python should handle that problem as strictly
as it handles a missing key in a dictionary. Unfortunately, it
doesn't.

I don't agree with Bruno that it's more natural to write
    if len(a) > 0:
    ...
instead of
    if a:
    ...

But I think that this is a necessary kludge if you want to write
clean code. Otherwise you risk creating subtle bugs. This advice,
however, only applies when your function wants a sequence, because
only in that case can you expect "len(a)" to work.

I also agree with Carl that "if len(a) > 0" is less universal than
"if a", because the latter also works with container-like objects
that have a concept of emptiness, but not of length.
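
To illustrate that case, here is a made-up class (Mailbox and both
check functions are hypothetical names) that knows whether it is
empty, but has no length:

```python
class Mailbox(object):
    """Hypothetical container: knows if it is empty, but not its size."""

    def __init__(self, has_mail):
        self._has_mail = has_mail

    def __bool__(self):         # name of the truth hook in newer Pythons
        return self._has_mail

    __nonzero__ = __bool__      # older name of the same hook


def check_truth(x):
    return "Non-Empty" if x else "Empty"


def check_len(x):
    return "Non-Empty" if len(x) > 0 else "Empty"


print(check_truth(Mailbox(False)))   # uses the truth hook
try:
    check_len(Mailbox(False))        # there is no __len__ to call
except TypeError as exc:
    print("TypeError: %s" % exc)
```

Here the "if a" style gives the right answer, while the "len(a) > 0"
style raises a TypeError.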

However, this case is less likely to happen than shooting yourself
in the foot by accidentally passing an iterator to the function
without getting an exception. I think this flaw in Python is deep
enough to justify the "len() > 0" kludge.

IMHO, that flaw of Python should be documented in a PEP as it violates
Python's principle of being explicit. It also harms duck typing.

Greets,

    Volker

-- 
Volker Grabsch
---<<(())>>---
Administrator
NotJustHosting GbR