sum and strings

Georg Brandl g.brandl-nospam at gmx.net
Mon Aug 21 05:57:19 EDT 2006


Steven D'Aprano wrote:
> On Fri, 18 Aug 2006 19:08:37 -0700, Paul Rubin wrote:
> 
>> If the args are strings, the above is a quadratic time algorithm and
>> it's better to throw an error than create such a trap for an unwary user.
> 
> That's a nonsense argument. There is no shortage of slow algorithms, and
> some of them are built into Python. When did O(n**2) become an error
> condition? And overhead matters: if I'm only doing a few concatenations,
> it is significantly faster to add the strings using + than to use join()
> (at least in Python 2.3 -- YMMV):
> 
>>>> s = timeit.Timer("''.join(L)", "L=['a'*50, 'b'*50, 'c'*50]")
>>>> s.timeit()
> 1.3470098972320557
>>>> t = timeit.Timer("a+b+c", "a,b,c = 'a'*50, 'b'*50, 'c'*50")
>>>> t.timeit()
> 1.0698421001434326
> 
> There's a word for optimizations that actually result in code running 25%
> slower.
> 
> I applaud that Python's language developers care about efficiency. I
> applaud that the documentation warns people about traps and potential
> areas of inefficiency. But I think it is a shame that sum() does a
> special type-check to avoid something which is only sometimes slow. It
> doesn't protect against O(n**2) performance; it merely protects against
> just one of an infinite number of possible "traps for the unwary".
> 
> I would have thought it would be better for sum() to raise a warning, not
> an exception. If we take seriously the argument that sum implies
> addition, and that string concatenation isn't really addition, then sum()
> should also refuse to operate on lists and tuples and any other
> non-numeric class. Either would be better than sum() "protecting" the user
> from himself, except when it doesn't.

Well, present that on python-dev.

Georg



More information about the Python-list mailing list