[issue27458] Allow subtypes of unicode/str to hit the optimized unicode_concatenate block

Ammar Askar report at bugs.python.org
Wed Jul 6 15:20:35 EDT 2016


Ammar Askar added the comment:

> We really don't want to encourage any reliance on this optimization.  It was put there only to help mitigate the performance impact of a common mistake.

Aah, I didn't realize the extra context behind why the unicode_concatenate path actually exists in ceval. Makes sense though since it was the only exceptional case that popped out in ceval.

> It would be interesting to see an example showing the benefit of this change

I'm no expert at benchmarking, but I threw together a quick script to compare the different ways of concatenating strings, and INPLACE_ADD seems to be the fastest. That makes sense, because it avoids the overhead of the list object that ''.join brings in. I'm not sure why StringIO is slower than both in most of these runs, though; maybe that would be a better place to improve?
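The attached bench.py is not reproduced in this message. As a rough illustration only, a comparison of this kind could be sketched as follows. Every function name, the piece size, and the iteration counts here are assumptions chosen for the example, not the contents of the attached script, and the timings it prints will not match the numbers below.

```python
import io
import timeit


def concat_inplace(pieces):
    # Repeated += on a str. CPython may resize the left operand in place
    # when nothing else references it (an implementation detail).
    result = ""
    for piece in pieces:
        result += piece
    return result


def concat_join(pieces):
    # Collect the pieces in a list, then join them once at the end.
    parts = []
    for piece in pieces:
        parts.append(piece)
    return "".join(parts)


def concat_stringio(pieces):
    # Write each piece into an in-memory text buffer.
    buf = io.StringIO()
    for piece in pieces:
        buf.write(piece)
    return buf.getvalue()


def run_benchmark(piece="abc", count=1000, number=200):
    # Hypothetical harness: time each strategy on the same input.
    pieces = [piece] * count
    timings = {}
    for func in (concat_inplace, concat_join, concat_stringio):
        timings[func.__name__] = timeit.timeit(lambda: func(pieces), number=number)
    return timings


if __name__ == "__main__":
    for name, seconds in run_benchmark().items():
        print(f"{name:16s} {seconds:.4f}")
```

Calling run_benchmark(piece="\u00e9\u00e8", count=1000) would exercise a non-ASCII input in the same way. Results depend heavily on the interpreter version and the sizes chosen.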

Either way, if you guys think this adds too much complexity on top of an existing hack, this is fine to close.


Benchmarking results:

  INPLACE_ADD short ascii      0.4307783489348367
  ''.join     short ascii      0.6934443039353937
  StringIO    short ascii      0.9447220619767904

  INPLACE_ADD short unicode    0.4411839219974354
  ''.join     short unicode    0.666951927007176
  StringIO    short unicode    0.9783720930572599

  INPLACE_ADD long ascii       3.6157665309729055
  ''.join     long ascii       6.938268916099332
  StringIO    long ascii       5.279585674987175

  INPLACE_ADD long unicode     3.7768358619650826
  ''.join     long unicode     4.641092017991468
  StringIO    long unicode     7.6051657549105585

----------
Added file: http://bugs.python.org/file43644/bench.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue27458>
_______________________________________

