sum and strings

Steven D'Aprano steve at REMOVEME.cybersource.com.au
Mon Aug 21 22:13:13 EDT 2006


On Mon, 21 Aug 2006 12:55:14 +0200, Fredrik Lundh wrote:

> Steven D'Aprano wrote:
> 
>> That's a nonsense argument. There is no shortage of slow algorithms, and
>> some of them are built into Python. When did O(n**2) become an error
>> condition? And overhead matters: if I'm only doing a few concatenations,
>> it is significantly faster to add the strings using + than to use join()
>> (at least in Python 2.3 -- YMMV):
>> 
>>>>> s = timeit.Timer("''.join(L)", "L=['a'*50, 'b'*50, 'c'*50]")
>>>>> s.timeit()
>> 1.3470098972320557
>>>>> t = timeit.Timer("a+b+c", "a,b,c = 'a'*50, 'b'*50, 'c'*50")
>>>>> t.timeit()
>> 1.0698421001434326
> 
> and what exactly does the fact that Python can do operator-based 
> dispatch much faster than it can do method-based dispatch have to
> do with sum's inability to add strings ?

Sorry, I don't get you here. Are you saying that string addition doesn't
call the operands' __add__ methods?

Obviously I would have preferred to have done a direct test of sum()
versus join(), but I can't since sum() goes out of its way to make that
difficult.

Or maybe I can...

>>> setup0 = """class Magic:
...     def __add__(self, other): return other
...
... L = ['a', 'b']
... M = Magic()
... """
>>>
>>> t5 = timeit.Timer("sum(L, M)", setup)
>>> t5.timeit()
12.319401025772095

That's bizarrely slow for concatenating two characters. It's an order of
magnitude slower than doing a direct addition of two characters, and about
one third the speed of a pure Python implementation:

>>> setup1 = """def mysum(strlist):
...     t = ""
...     for s in strlist: t = t+s
...     return t
...
... a, b, c, d, = 'a', 'b', 'c', 'd'
... """
>>>
>>> t6 = timeit.Timer("mysum([a,b,c,d])", setup1)
>>> t6.timeit()
4.3943448066711426

What's going on here? What have I missed? I know adding strings is
O(n**2), but these results don't make any sense to me.



-- 
Steven D'Aprano 




More information about the Python-list mailing list