A Python 3000 Question

Wed Oct 31 19:02:58 EDT 2007

On Oct 31, 6:13 pm, Steven D'Aprano

> What you have measured is a local optimization that is only useful when
> you have a tight loop with lots of calls to the same len():
>
> Len = sequence.__len__
> while Len() < 100000:
>     foo(sequence)

That is exactly what timeit() does: it runs a tight loop.

> But what Neil is measuring is relevant for the more general case of
> calling len() on arbitrary objects at arbitrary times:
>
> x = len(sequence)
> y = 42 + len("xyz")
> if len(records) == 12:
>     print "twelve!"
>
> To get the benefit of the micro-optimization you suggested, you would
> need to write this:
>
> Len = sequence.__len__
> x = Len()
> Len = "xyz".__len__
> y = 42 + Len()
> Len = records.__len__
> if Len() == 12:
>     print "twelve!"
>
> Not only is that an especially ugly, confusing piece of code, it is also
> counter-productive as you still have to do the method lookup every single
> time.

The whole point of the optimization is to avoid the 99999 repeated
attribute lookups in the timeit loop. For a single call both times are
practically zero, so the snippet above is just silly.
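
For anyone who wants to try the tight-loop case, here is a minimal
sketch in Python 3 syntax (the thread itself uses Python 2). The
function names and the loop count are my own choices for illustration,
and the relative timings will depend on the interpreter version, so I
make no claim about which one wins.

```python
import timeit


def sum_lengths_builtin(seq, n):
    """Call the builtin len() on every iteration."""
    total = 0
    for _ in range(n):
        total += len(seq)
    return total


def sum_lengths_bound(seq, n):
    """Look up seq.__len__ once, then call the bound method in the loop."""
    seq_len = seq.__len__  # the only attribute lookup
    total = 0
    for _ in range(n):
        total += seq_len()
    return total


seq = list(range(12))
for func in (sum_lengths_builtin, sum_lengths_bound):
    # Best of three runs, each calling the function 100 times.
    best = min(timeit.repeat(lambda: func(seq, 1000), number=100, repeat=3))
    print("%-22s %.6f s" % (func.__name__, best))
```

Both functions return the same total; only the way the length is
obtained inside the loop differs.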

> Taking the method lookup is an apples-and-oranges comparison. It is useful
> to know for the times when you don't want an apple, you want an orange
> (i.e. the tight loop example above).
>
> This is an apples-and-apples comparison:
>
> >>> import timeit
> >>> timeit.Timer("Len = seq.__len__; Len()", "seq = range(12)").repeat()
> [1.3109469413757324, 0.7687380313873291, 0.55280089378356934]
> >>> timeit.Timer("len(seq)", "seq = range(12)").repeat()
> [0.60111594200134277, 0.5977931022644043, 0.59935498237609863]
>
> The time to do the lookup and call the method is about 0.6 microseconds
> whether you do it manually or let len() do it.

That assumes you know the internals of len(). Perhaps it does
something smarter for lists, or perhaps attribute lookup is faster at
the C level. I don't know, and that's exactly the point: I don't need
to know the internals to compare two callables.

> The logic behind the micro-optimization trick is to recognize when you
> can avoid the cost of repeated method lookups by doing it once only. That
> is not relevant to the question of whether len() should be a function or
> a method.

No disagreement here; I didn't bring up performance, and it is not at
all the reason I consider len() as a function to be a design mistake.
What I dispute is Neil's assertion that "len as a builtin can be
faster than len as an attribute", backed by timings that include
attribute lookups. That statement is just wrong: comparing X to an
attribute assumes you already have the attribute in hand, so there is
nothing to look up.
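
To keep the lookup cost and the call cost apart, the three cases can
be timed separately. This is only a sketch in Python 3 syntax; the
labels, the name seq_len, and the repeat counts are assumptions of
mine, and I deliberately give no numbers because they vary by machine
and interpreter.

```python
import timeit

# Bind the method once in the setup, so it is not part of the timed statement.
SETUP = "seq = list(range(12)); seq_len = seq.__len__"

STATEMENTS = {
    "builtin len(seq)": "len(seq)",                  # builtin function call
    "lookup + call seq.__len__()": "seq.__len__()",  # lookup on every call
    "prebound seq_len()": "seq_len()",               # attribute already in hand
}


def best_times(number=100000, repeat=3):
    """Return the best-of-`repeat` time in seconds for each statement."""
    return {
        label: min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))
        for label, stmt in STATEMENTS.items()
    }


results = best_times()
for label, seconds in results.items():
    print("%-30s %.6f s" % (label, seconds))
```

The second and third rows differ only in whether the attribute lookup
is inside the timed statement, which is the distinction at issue here.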

George