[docs] [issue37512] Error in the documentation about string concatenation

Steven D'Aprano report at bugs.python.org
Fri Jul 5 21:14:12 EDT 2019


Steven D'Aprano <steve+python at pearwood.info> added the comment:

Eric is correct that this is a CPython optimization, not a language feature.

Furthermore, it is an optimization that can be affected by rather subtle factors such as the operating system memory management. Here is a thread demonstrating that code that relied on this optimization ran fast on Linux and slow on Windows, literally hundreds of times slower than other tools:

https://mail.python.org/archives/list/python-dev@python.org/thread/EQO6HDZXHFCP4DBEE5Q7MYLHPJMGJ7DQ/

If you prefer the old mailman archives, here are a few relevant posts:

https://mail.python.org/pipermail/python-dev/2009-August/091125.html

https://mail.python.org/pipermail/python-dev/2009-September/091582.html

https://mail.python.org/pipermail/python-dev/2009-September/091592.html

Best practice is to *not* rely on this optimization, but to use str.join. That ensures that:

- your code remains fast if run on alternate implementations;
- your code remains fast if the OS memory management changes;
- your code remains fast if you change your code slightly.
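
As a minimal sketch of the idiom recommended above (the function names and sample data are illustrative assumptions, not taken from the thread), collect the pieces and join them once instead of growing a string inside the loop:

```python
def concat_with_join(pieces):
    """Join all pieces in one call; does not depend on the += optimization."""
    return "".join(pieces)


def concat_with_plus(pieces):
    """Repeated +=; fast only when the implementation can resize in place."""
    result = ""
    for piece in pieces:
        result += piece
    return result


pieces = ["spam", "eggs", "ham"]  # hypothetical sample data
joined = concat_with_join(pieces)
```

Both functions return the same string; they differ only in how much they rely on implementation details for speed.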

For example, I expect both of the following to show the documented quadratic behaviour:

- Prepending instead of appending

- Keeping two or more references to the string being appended to:

  partials = []
  current = ''
  for s in list_of_strings:
      partials.append(current)  # the list now holds a second reference
      current += s              # so the string cannot be resized in place

There may be others.
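
For the prepending case, one possible rewrite (a sketch only; the function names and sample data are assumptions for illustration) is to join the pieces in reverse order, which gives the same result without building an intermediate string on every iteration:

```python
def prepend_with_plus(pieces):
    """Prepend each piece; every iteration creates a brand-new string."""
    result = ""
    for piece in pieces:
        result = piece + result
    return result


def prepend_with_join(pieces):
    """Same result as prepending, built with a single join."""
    # list() lets this accept any iterable, not just sequences.
    return "".join(reversed(list(pieces)))


pieces = ["spam", "eggs", "ham"]  # hypothetical sample data
prepended = prepend_with_join(pieces)
```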

Perhaps we ought to document this optimization specifically, so we can state that it is intended to improve the performance of casual scripts and that *best practice* for professional-quality code remains str.join.

----------
nosy: +steven.daprano

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue37512>
_______________________________________

