[Python-Dev] Fwd: summing a bunch of numbers (or "whatevers")

Alex Martelli aleax@aleax.it
Tue, 22 Apr 2003 12:54:10 +0200


On Tuesday 22 April 2003 05:03 am, Tim Peters wrote:
   ...
> filter() is hard to get rid of because the bizarre filter(None, seq)
> special case is supernaturally fast.  Indeed, time the above against
>
> def alltrue(seq):
>     return len(filter(None, seq)) == len(seq)
>
> def atleastonetrue(seq):
>     return bool(filter(None, seq))
>
> Let me know which wins <wink>.

Hmmm, I think I must be missing something here.  Surely in many
application cases a loop exploiting short-circuiting behavior will have
better expected performance than anything that's going all the way
through the sequence no matter what?  Far greater variance, sure,
and if the probability of true items gets extreme enough then the
gain from short-circuiting will evaporate, but...:

[alex@lancelot src]$ ./python Lib/timeit.py -s'seq=[i%2 for i in range(9999)]' 
-s'''
> def any(x):
>   for xx in x:
>     if xx: return True
>   return False
> ''' 'any(seq)'
1000000 loops, best of 3: 1.42 usec per loop

[alex@lancelot src]$ ./python Lib/timeit.py -s'seq=[i%2 for i in range(9999)]' 
-s'''
def any(x):
  return bool(filter(None,x))
''' 'any(seq)'
1000 loops, best of 3: 679 usec per loop

...i.e., despite filter's amazing performance, looping over 10k items still
takes a bit more than shortcircuiting out at once;-).

If Python ever gains such C-coded functions as any, all, etc (hopefully in
some library module, not in builtins!) I do hope and imagine they'd short-
circuit, of course.  BTW, I think any should return the first true item (or 
the last one if all false, or False for an empty sequence) and all should
return the first false item (or the last one if all true, or True for an empty 
seq) by analogy with the behavior of operators and/or.


Alex