len() should always return something

Fri Jul 24 16:02:24 EDT 2009

On Fri, Jul 24, 2009 at 12:05 PM, Steven
D'Aprano<steve at remove-this-cybersource.com.au> wrote:
> On Fri, 24 Jul 2009 00:02:28 -0700, Chris Rebert wrote:
>
>> On Thu, Jul 23, 2009 at 11:35 PM, Dr. Phillip M.
>> Feldman<pfeldman at verizon.net> wrote:
>>>
>>> Some aspects of the Python design are remarkably clever, while others
>>> leave me perplexed. Here's an example of the latter: Why does len()
>>> give an error when applied to an int or float? len() should always
>>> return something; in particular, when applied to a scalar, it should
>>> return a value of 1. Of course, I can define my own function like this:
>>>
>>> def mylen(x):
>>>   if isinstance(x,int) or isinstance(x,float): return 1
>>>   return len(x)
>>>
>>> But, this shouldn't be necessary.
>>
>> The problem is that redefining len()/length/size that way would violate
>> several principles of Python's design (The "Zen" of Python -
>> http://www.python.org/dev/peps/pep-0020/).
>>
>> Specifically:
>> - Explicit is better than implicit.
>> - Special cases aren't special enough to break the rules.
>> - Errors should never pass silently.
>> - In the face of ambiguity, refuse the temptation to guess.
>
>
> Chris, I'm curious why you think that these Zen are relevant to the OP's
> complaint.

To explain in more detail:

> Re explicit vs implicit, len(42) is just as explicit as len([42, 23]).

If one wants a collection (something that has a length), then one should
explicitly create one, rather than have a singleton value implicitly act
as though it were a pseudo-collection.
I admit I'm somewhat conflating this principle with the anti-ambiguity
principle, but the two are related, imho.
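
A minimal sketch of the explicit spelling, assuming a hypothetical
consumer count_items() that expects a real collection (the name is made
up for illustration). The caller wraps the lone value in a list instead
of relying on len() to guess:

```python
def count_items(items):
    # Hypothetical consumer that requires a real collection.
    return len(items)

value = 42
wrapped = [value]  # explicitly build a one-element collection

count_wrapped = count_items(wrapped)

try:
    count_items(value)  # a bare int has no length
    bare_accepted = True
except TypeError:
    bare_accepted = False

print(count_wrapped, bare_accepted)
```

The cost to the caller is the one pair of []s.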

> Arguably (I wouldn't argue this, but some people might) ints aren't
> "special enough" to break the rule that len(obj) should always return
> something.

Except that's not the current rule. The current rule is that it's
defined only for collections.
One would instead have to argue why ints are special enough to have
len() defined despite not being collections.
I think the point made by Grant Edwards is instructive. len(x) == 1
typically implies that list(x)[0] and similar expressions are valid.
Altering the behavior would break that invariant and cause quite a bit of
code upheaval, all just to save the OP from typing one pair of []s.
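
A rough illustration of that relationship, using only built-in types
(first_of_singleton is a hypothetical helper name, not anything from the
standard library). For each real collection of length 1, list(x)[0]
works; for an int, list() itself fails, so a len() of 1 would promise
something the object cannot deliver:

```python
def first_of_singleton(x):
    # For real collections, len(x) == 1 means there is exactly one
    # element, and list(x)[0] retrieves it.
    assert len(x) == 1
    return list(x)[0]

singletons = [[42], (42,), {42}, "4", {42: "v"}]
results = [first_of_singleton(c) for c in singletons]

try:
    list(42)  # ints are not iterable
    int_is_iterable = True
except TypeError:
    int_is_iterable = False

print(results, int_is_iterable)
```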

> (I don't actually agree, but some people might be able to produce a
> coherent argument why len() should apply equally to all objects.)

Well, yes, this /whole/ "argument" is entirely academic. The behavior
is extremely unlikely to change; we're just trying to give ex post
facto rationales for pedagogical purposes. :)

> Re errors passing silently, the OP doesn't believe that len(42) should be
> an error, so that's not relevant.

True, it would not directly silence an error, but defining len() on
scalars would tend towards obscuring errors in code that incorrectly
treats scalars as collections.
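
A small sketch of how that could play out, assuming a hypothetical
lenient_len() modeled on the OP's mylen() and a hypothetical describe()
function that expects a list. The caller mistakenly passes the count of
readings rather than the readings themselves:

```python
def lenient_len(x):
    # Hypothetical len() variant, after the OP's mylen():
    # scalars report a length of 1.
    if isinstance(x, (int, float)):
        return 1
    return len(x)

def describe(readings, length=len):
    # Intended to receive a list of readings.
    return "%d reading(s)" % length(readings)

readings = [1.0, 2.0, 3.0]
count = len(readings)

# Bug: 'count' is passed where 'readings' was meant.
try:
    describe(count)
    strict_result = "no error"
except TypeError:
    strict_result = "TypeError"

lenient_result = describe(count, length=lenient_len)

print(strict_result)
print(lenient_result)
```

With the built-in len() the mistake surfaces immediately as a
TypeError; with the lenient version the function returns a plausible
but wrong answer.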

> And there's nothing ambiguous about len(42).

Really? What is its value then? I think arguments of varying quality
can be made for:
1 - as the OP and those from array programming languages would suggest
2 - the number of decimal digits in 42, if one was feeling Perlish
6 - the minimum number of bits necessary to represent 42 in binary
32 (or 64, depending on your CPU) - the number of bits necessary to
    represent an int (obviously breaks down a bit with
    arbitrary-magnitude ints)
undefined - i.e. it causes an error, the current behavior; asking for
    the length of a non-collection is "nonsensical" or "absurd"
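
For concreteness, here is a sketch computing several of these candidates
for 42. The dictionary keys are just my own labels, int.bit_length() is
assumed to be available (newer Python versions provide it), and the
fixed-width 32/64 case is left out because it depends on the platform:

```python
n = 42

candidates = {
    "scalar counts as one item": 1,
    "decimal digits": len(str(n)),
    "minimum bits in binary": n.bit_length(),
}

# Current behavior: len() refuses to guess.
try:
    len(n)
    current = "returned a value"
except TypeError:
    current = "TypeError"

for label, value in candidates.items():
    print(label, "->", value)
print("len(42) ->", current)
```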

> I agree with the current Python behaviour, but I don't think there's
> anything in the Zen to support it. As far as I know, there is no

The problem and strength of the Zen is that it's all about how you
interpret it. :-)

Cheers,
Chris
-- 
http://blog.rebertia.com