String concatenation performance with +=

Steven D'Aprano steve at pearwood.info
Sat Feb 14 00:47:57 EST 2009


Benjamin Peterson wrote:

> Sammo <sammo2828 <at> gmail.com> writes:
> 
>> String concatenation has been optimized since 2.3, so using += should
>> be fairly fast.
> 
> This is implementation dependent and shouldn't be relied upon.

It's also a fairly simple optimization, and it really only applies when
the target of the += is a plain name, not an item or an attribute.

>>> from timeit import Timer
>>> Timer('s += "x"', 's = ""').repeat(number=100000)
[0.067316055297851562, 0.063985109329223633, 0.066659212112426758]

>>> Timer('s[0] += "x"', 's = [""]').repeat(number=100000)
[3.0495560169219971, 2.2938292026519775, 2.2914319038391113]

>>> Timer('s.s += "x"', 
... 'class K(object): pass \ns=K();s.s = ""').repeat(number=100000) 
[3.3624241352081299, 2.3346412181854248, 2.9846079349517822]
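
For anyone who wants to rerun the comparison as a script, here is a
minimal sketch of the same three cases. The names CASES and measure
are made up for this example, and the iteration count is smaller than
above to keep it quick. Absolute numbers, and even the size of the gap,
will vary with the interpreter and version, so treat the output as
indicative only.

```python
from timeit import Timer

# (label, statement, setup) for the three targets timed above.
# CASES and measure() are illustrative names, not part of timeit.
CASES = [
    ("name", 's += "x"', 's = ""'),
    ("item", 's[0] += "x"', 's = [""]'),
    ("attribute", 'k.s += "x"', 'class K(object): pass\nk = K(); k.s = ""'),
]


def measure(number=10000, repeat=3):
    """Return {label: best time in seconds} for each case."""
    results = {}
    for label, stmt, setup in CASES:
        # Setup is re-run for every repeat, so each run starts from "".
        timings = Timer(stmt, setup).repeat(repeat=repeat, number=number)
        results[label] = min(timings)
    return results


if __name__ == "__main__":
    for label, seconds in measure().items():
        print("%-10s %.6f" % (label, seconds))
```

Taking the minimum of the repeats is the usual way to read timeit
results, since the larger values mostly reflect other activity on the
machine.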



>> Note that I need to do something to mydata INSIDE the loop, so please
>> don't tell me to append moredata to a list and then use "".join after
>> the loop.
> 
> Then why not just mutate the list and then call "".join?

Yes, there almost certainly is a way to avoid the repeated concatenation.
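
As a rough sketch of what that could look like: the thread doesn't say
what has to happen to mydata inside the loop, so the work below
(tracking the running length, and dropping the previous piece when a
marker turns up) is purely hypothetical, as are the names build and
undo. The point is only that the list can be inspected and mutated on
every iteration, with a single join at the end.

```python
def build(chunks, undo="<undo>"):
    """Accumulate chunks in a list and join once at the end.

    Returns (joined_string, total_length). The undo marker is a
    made-up example of in-loop work that mutates the list.
    """
    parts = []   # mutable stand-in for the growing string
    total = 0    # running length, kept without building the string
    for chunk in chunks:
        if chunk == undo:
            # Mutate the list in place: discard the previous piece.
            if parts:
                total -= len(parts.pop())
            continue
        parts.append(chunk)
        total += len(chunk)
    return "".join(parts), total


if __name__ == "__main__":
    print(build(["ab", "cd", "<undo>", "ef"]))
```

If the in-loop work genuinely needs the whole string as a str each
time around, this doesn't help, but that is rarer than it first
appears.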


>> Why is the second test so much slower?
> 
> Probably several reasons:
> 
> 1. Function call overhead is quite large compared to these simple
>    operations.
> 2. You are resolving attribute names.

3. Because the optimization isn't applied in this case.


-- 
Steven



