[Python-ideas] Create a StringBuilder class and use it everywhere
Steven D'Aprano
steve at pearwood.info
Thu Aug 25 15:57:14 CEST 2011
Carl Matthew Johnson wrote:
> Interesting semantics…
>
>
> What version of Python were you using? The current documentation has this to say:
>
> • CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations.
>
> Changed in version 2.4: Formerly, string concatenation never occurred in-place.
>
> <http://docs.python.org/library/stdtypes.html>
>
> It's my understanding that the naïve approach should now have performance comparable to the "proper" list append technique as long as you use CPython >2.4.
Relying on that is a bad idea, for three reasons. It is not portable
from CPython to any other Python: none of IronPython, Jython or PyPy
can include that optimization. It depends on details of the memory
manager used by your operating system, so what is fast on one computer
can be slow on another. And it doesn't even work under all
circumstances, since it relies on the string having exactly one
reference as well as on the exact form of the concatenation.
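
For comparison, here is a minimal sketch of the concatenation idiom next
to two portable alternatives. The function names are hypothetical,
chosen only for illustration, and the io.StringIO variant assumes
Python 3, where it accepts ordinary str. All three return the same
string. The difference is that only the last two avoid depending on the
in-place optimization for their running time.

```python
import io


def build_with_concat(parts):
    # The idiom under discussion: repeated += on a string.
    # How fast this is depends on whether the implementation can
    # resize the string in place.
    s = ""
    for p in parts:
        s += p
    return s


def build_with_join(parts):
    # Collect the pieces in a list, then join them in a single call.
    chunks = []
    for p in parts:
        chunks.append(p)
    return "".join(chunks)


def build_with_stringio(parts):
    # Write the pieces to an in-memory text buffer instead.
    buf = io.StringIO()
    for p in parts:
        buf.write(p)
    return buf.getvalue()
```

When the pieces are already available as an iterable, "".join(parts)
on its own does the same job without the intermediate list.
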
Here's a real-world example of how the idiom of repeated string
concatenation goes bad:
http://www.mail-archive.com/pypy-dev@python.org/msg00682.html
Here's another example, from a few years back, where part of the
standard library using string concatenation was *extremely* slow under
Windows. Linux users saw no slowdown and it was very hard to diagnose
the problem:
http://www.mail-archive.com/python-dev@python.org/msg40692.html
--
Steven