len() should always return something

Sat Jul 25 03:49:12 EDT 2009

On Jul 23, 11:35 pm, "Dr. Phillip M. Feldman" <pfeld... at verizon.net>
wrote:
> Some aspects of the Python design are remarkably clever, while others leave
> me perplexed. Here's an example of the latter: Why does len() give an error
> when applied to an int or float? len() should always return something; in
> particular, when applied to a scalar, it should return a value of 1. Of
> course, I can define my own function like this:
>
> def mylen(x):
>    if isinstance(x,int) or isinstance(x,float): return 1
>    return len(x)
>
> But, this shouldn't be necessary.

I knew that you were coming from Matlab as soon as I saw the subject
line.

I use both Matlab and Python very often, and I understand the design
decisions behind both (insofar as Matlab can be considered a
"design").

The thing to keep in mind about Python is that it's a general purpose
language, not just for mathematical computation, and there are some
things in Python that are a bit suboptimal for math calculations.
This is one of them: when the types of objects you are dealing with
most often are numbers, then there's not much chance for ambiguity
when treating a scalar as a 1x1 matrix.

However, when doing general purpose programming, you often use lots of
types besides numbers, some of which are containers themselves.  This
creates the possibility of ambiguity and inconsistent behavior.  Let's
consider an example function.

def dumbfunc(xs):
    # Print each item of xs on its own line.
    for x in xs:
        print(x)

You can use it on a list, like this, and there is no ambiguity:

dumbfunc([1,2,3])

Now suppose Python allowed numbers to be treated as degenerate
sequences of length 1.  Then you could pass a number, and it would
work as you expect:

dumbfunc(1)

However, because Python is general purpose, you might also want to
pass other types around, such as strings.  Say you wanted to print a
list of strings; you would do this, and it would work as expected:

dumbfunc(["abc","def","ghi"])

Now, the problem.  In the numbers example, you were able to treat a
single number as a degenerate list, and by our supposition, it
worked.  By analogy, you would expect to be able to treat a single
string as a degenerate list of strings, like this:

dumbfunc("abc")

Whoops: doesn't work.  Instead of printing one string "abc" it prints
three strings, "a", "b", and "c".
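
For reference, here is a small sketch of what Python actually does in
these cases today.  The collect helper is made up for illustration: it
is dumbfunc with the print replaced by a return, so the results can be
inspected.

```python
def collect(xs):
    """Hypothetical variant of dumbfunc: return the items instead of printing."""
    return [x for x in xs]

# A list of strings yields the strings themselves.
list_result = collect(["abc", "def", "ghi"])

# A bare string is itself a sequence, so it yields its characters.
string_result = collect("abc")

# A bare number is not iterable at all, so the loop raises TypeError.
try:
    collect(1)
    int_is_iterable = True
except TypeError:
    int_is_iterable = False
```

So the string case silently does something different from what the
analogy suggests, while the number case fails loudly.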

By allowing scalar numbers to act as degenerate lists, you've
introduced a very bad inconsistency into the language and made it
difficult to learn the language by analogy.
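
The same thing happens with the mylen function from the quoted
message.  A minimal sketch, with mylen taken from that message (the
two isinstance checks are folded into one call) and the variable names
invented here for illustration:

```python
def mylen(x):
    # From the quoted message: treat int and float as having length 1.
    if isinstance(x, (int, float)):
        return 1
    return len(x)

number_len = mylen(7)               # a bare number counts as one item
number_list_len = mylen([7])        # so does a one-number list
string_len = mylen("abc")           # but a bare string counts its characters
string_list_len = mylen(["abc"])    # while a one-string list counts as one
```

Numbers and one-item lists of numbers agree; strings and one-item
lists of strings do not.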

For a general purpose language there is really no second-guessing this
design decision.  It would be unequivocally bad to have atomic types
act like sequences.  The reason it isn't so bad in Matlab is that you
can rely on the elements of an array to be scalar.

Matlab, for its part, does contain some types (cell arrays and
structures) that can contain non-scalars for cases when you want to do
that.  However, you will notice that no scalar will ever be treated as
a degenerate cell array or structure.  Even a poorly designed, ad hoc
language like Matlab won't try to treat scalar numbers as cell arrays,
because the kind of ambiguity I described above would appear.  Nope, it takes
an abomination like Perl to do that.

As for how I would recommend you translate the Matlab functions that
rely on this: you need to decide whether that scalar number you are
passing in is REALLY a scalar, or a degenerate vector.  If it's the
latter, just get over it and pass in a single-item list.  I have done
these kinds of conversions myself before, and that's the way to do
it.  Trying to be cute and writing functions that can work with both
types will lead to trouble and inconsistency, especially if you are a
Python newbie.  Just pass in a list if your function expects a list.
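
A rough sketch of what I mean, using a made-up mean function as a
stand-in for whatever you are translating.  The function only ever
deals with sequences, and the caller wraps the degenerate case:

```python
def mean(xs):
    """Hypothetical example: expects a sequence of numbers, never a bare scalar."""
    return sum(xs) / float(len(xs))

vector_result = mean([1, 2, 3])   # the ordinary case
scalar_result = mean([4])         # degenerate vector: wrap it in a list at the call site
```

There is no isinstance check inside mean, so it behaves the same way
for every element type that supports addition and division.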

Carl Banks