[Python-ideas] Create a StringBuilder class and use it everywhere

Steven D'Aprano steve at pearwood.info
Thu Aug 25 15:57:14 CEST 2011


Carl Matthew Johnson wrote:
> Interesting semantics…
> 
> 
> What version of Python were you using? The current documentation has this to say:
> 
> 	• CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations.
> 
> Changed in version 2.4: Formerly, string concatenation never occurred in-place.
> 
> <http://docs.python.org/library/stdtypes.html>
> 
> It's my understanding that the naïve approach should now have performance comparable to the "proper" list append technique as long as you use CPython 2.4 or later.


Relying on that is a bad idea. It is not portable from CPython to any
other Python (none of IronPython, Jython or PyPy can include that
optimization). It also depends on details of the memory manager used by
your operating system (what is fast on one computer can be slow on
another), and it doesn't even work under all circumstances: it relies on
the string having exactly one reference as well as on the exact form of
the concatenation.
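
For comparison, here is a minimal sketch of the naive idiom next to two
portable ones. It assumes Python 3, and the function names are made up
for illustration; they do not come from any library:

```python
import io


def build_with_concat(pieces):
    # Naive idiom: may be fast on CPython when the in-place optimization
    # applies, but can degrade to quadratic time elsewhere.
    result = ""
    for piece in pieces:
        result += piece
    return result


def build_with_join(pieces):
    # Portable idiom: collect the pieces, then join them once.
    return "".join(pieces)


def build_with_stringio(pieces):
    # Alternative: write into an in-memory text buffer.
    buf = io.StringIO()
    for piece in pieces:
        buf.write(piece)
    return buf.getvalue()


pieces = [str(i) for i in range(1000)]
assert build_with_concat(pieces) == build_with_join(pieces)
assert build_with_join(pieces) == build_with_stringio(pieces)
```

All three return the same string. The difference is only in how the
running time scales, and only the join version is the one the
documentation quoted above recommends for performance sensitive code.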


Here's a real-world example of how the idiom of repeated string 
concatenation goes bad:

http://www.mail-archive.com/pypy-dev@python.org/msg00682.html

Here's another example, from a few years back, where part of the 
standard library using string concatenation was *extremely* slow under 
Windows. Linux users saw no slowdown and it was very hard to diagnose 
the problem:

http://www.mail-archive.com/python-dev@python.org/msg40692.html


-- 
Steven


