question about True values

Steven D'Aprano steve at REMOVE.THIS.cybersource.com.au
Sat Oct 28 03:23:49 EDT 2006


On Fri, 27 Oct 2006 17:35:58 +0000, Antoon Pardon wrote:

> On 2006-10-27, Steven D'Aprano <steve at REMOVE.THIS.cybersource.com.au> wrote:
> 
>> But in this specific instance, I don't see any advantage to explicitly
>> testing the length of a list. Antoon might think that is sufficiently
>> polymorphic, but it isn't. He cares whether the object has zero _length_,
>> but for true polymorphism, he should be caring about whether the object is
>> _empty_. Not all empty objects have zero length, or even a length at all.
>> (E.g. binary trees.) That's why Python classes can use a __nonzero__
>> method, falling back on __len__ only if __nonzero__ is not defined.
> 
> Nobody can force you to write a container class that also provides a
> __len__ method. But if you don't provide one then IMO it is your class
> that deviates from standard practice. Mathematically, sets and
> directories have no length either, yet the len function does provide
> an answer if you give it a set or directory as an argument.

Do you mean dictionary?

Dictionaries, a.k.a. hash tables, aren't a standard mathematical data
type. Even if they were, they have two "lengths" (size really): the size
of the table, and the number of items in the table. In a high level
language like Python, you don't care about the size of the table (since it
will, I believe, grow automatically if you need it to), so the number of
items is the only "size" (length) you could care about.


> So
> it seems that python has generalised the len function to provide
> the number of elements in the container. 

Sure. But what about a container where the number of elements isn't
well-defined, e.g. iterators? Or something like a binary tree, where
counting the number of items is relatively expensive, but telling whether
it is empty or not is cheaper than dirt?

Here's a real example: it can be expensive to count the number of files in
a directory -- on my PC, it takes almost a third of a second to count a
mere 15,000 files in a single directory. (There may be a more sensible
way of counting the number of files under Linux than ls | wc -l, but if
so I don't know it.) But why slog through 15,000 files if all you need to
know is if the directory is empty or not? As soon as you see one file, you
know it isn't empty. Stop counting! Who cares whether there is one file or
15,000 files?
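
The "stop counting" idea can be sketched in Python itself. This assumes
a much later Python than the one current when this was posted: 3.6 or
newer, where os.scandir() yields directory entries lazily and can be
used in a with statement. The helper name is_empty_dir is made up for
this example.

```python
import os
import tempfile


def is_empty_dir(path):
    """Return True if the directory at `path` contains no entries."""
    with os.scandir(path) as entries:
        # Ask for at most one entry; never enumerate the rest.
        return next(entries, None) is None


with tempfile.TemporaryDirectory() as d:
    print(is_empty_dir(d))   # True
    open(os.path.join(d, "a_file"), "w").close()
    print(is_empty_dir(d))   # False
```

The function does the same amount of work whether the directory holds
one file or 15,000, because it asks for only the first entry.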


 
> I have written a Tree class(*). It can be used as a drop in replacement
> anywhere where a directory is used, as long as there is a full order
> relationship in the key domain. That tree class provides a __len__
> method that will answer with the number of items in the tree just
> as a directory would and I see nothing wrong with that.

And I'm happy for you. But imagine a container object that maps a URL to
some piece of data fetched from the Internet. Counting the size of the
Internet is infeasible -- even if you were willing to try, it could take
*weeks* of computation to determine! But one can certainly tell if there
is an Internet out there or not. Such an object would always be True,
unless you had lost network connectivity.

My container object will work perfectly well with "if internet" but not
with "if len(internet) > 0". You could even iterate over it, sort
of, by following links from one site to another.
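
A rough sketch of such an object, under loud assumptions: the Internet
class, and the fetch and is_connected callables passed to it, are
hypothetical names, and the example uses offline stand-ins instead of
real network code so that it actually runs.

```python
class Internet(object):
    """Hypothetical URL-to-data mapping; all names are illustrative."""

    def __init__(self, fetch, is_connected):
        self._fetch = fetch                # callable: url -> data
        self._is_connected = is_connected  # callable: () -> bool

    def __getitem__(self, url):
        return self._fetch(url)

    def __bool__(self):
        # "Is there an Internet out there?" is answerable; its size is not.
        return bool(self._is_connected())

    __nonzero__ = __bool__  # the Python 2 spelling of the same hook

    # Deliberately no __len__: the number of items is not well-defined.


# Stand-ins for real network code, so the sketch runs offline.
pages = {"http://example.com/": "<html>hello</html>"}
internet = Internet(fetch=pages.__getitem__, is_connected=lambda: True)

if internet:
    print(internet["http://example.com/"])
```

Here "if internet:" does the right thing, while "len(internet)" raises
TypeError because the class never claims to have a length.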

But why are you checking the length of a container before iterating over
it? If you are writing something like this:

if len(container) != 0:
    for item in container:
        do_something(item)

then just stop it!
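
The length check buys nothing, because a for loop over an empty
container simply runs zero times. Dropping it also lets the same code
accept iterators and generators, which have no len() at all. A small
sketch (the process function is a made-up stand-in for whatever the
loop body does):

```python
def process(container):
    """Double every item; works for anything iterable."""
    results = []
    for item in container:      # zero iterations if there is nothing there
        results.append(item * 2)
    return results


print(process([]))                  # []
print(process([1, 2]))              # [2, 4]
print(process(iter([1, 2])))        # [2, 4]
print(process(x for x in (5, 6)))   # [10, 12]

try:
    len(iter([1, 2]))
except TypeError:
    print("iterators have no len()")
```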



> Of course I can't account for all possible ways someone wishes to
> write a class, but I don't see what is wrong with counting on
> the fact that an empty container has a length of zero.

Because you shouldn't assume containers have well-defined lengths unless
you actually care about the length, and you shouldn't assume that a length
of zero implies "nothing to see here" unless you *know* that this is the
case. You should leave defining empty up to the container class itself.
Otherwise, you might be right 99 times in a hundred, but that hundredth
time will bite you.
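
For reference, this is how truth testing lets the class decide, as far
as I understand the rules: the __bool__ hook (spelled __nonzero__ in
Python 2) is consulted first, __len__ is used only as a fallback, and an
object with neither is always true. The three classes below are toy
examples invented to show each case.

```python
class OnlyLen(object):
    """Defines __len__ only, so truth testing falls back on it."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n


class Both(object):
    """Has length zero but still declares itself non-empty."""

    def __len__(self):
        return 0

    def __bool__(self):
        return True

    __nonzero__ = __bool__  # the Python 2 spelling of the same hook


class Neither(object):
    """Defines neither hook, so instances are always true."""


print(bool(OnlyLen(0)))   # False: the __len__ fallback
print(bool(OnlyLen(2)))   # True
print(bool(Both()))       # True: the truth hook wins over __len__
print(len(Both()))        # 0
print(bool(Neither()))    # True
```

Code that tests "len(obj) == 0" gets the Both case wrong; code that
tests "if obj:" gets whatever answer the class author intended.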


> I can write a container class where the truth value of an object
> is independent of whether or not the object is empty and then
> the "if obj:" idiom will fail to provide true polymorphism too.

A class that deliberately breaks the semantics of Python truth testing just
for the sake of breaking code really is a pathological case. Polymorphism
doesn't mean "will work with anything without exception".


-- 
Steve.



