[issue16334] Faster unicode-escape and raw-unicode-escape codecs

Fri Sep 2 07:38:39 EDT 2016

STINNER Victor added the comment:

The unicode-escape encoders were modified by issue #25353 to use the _PyBytesWriter API. Sadly, I didn't benchmark my change before pushing it :-/

Your patch basically reverts my change.

> Py3.2        Py3.3        Py3.6        Py3.6+patch
> 195 (+136%)  109 (+323%)  258 (+79%)   461    encode  unicode-escape  'A'*10000
> 391 (+1310%) 333 (+1556%) 575 (+859%)  5514   encode  raw-unicode-escape  'A'*10000

I'm surprised that the revert makes the raw-unicode-escape encoder so much faster. Does it mean that the _PyBytesWriter API is that inefficient?
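
For anyone who wants to check numbers of this kind locally, here is a rough timing sketch using only the standard timeit module. The helper name encodes_per_second is made up for illustration, and it reports calls per second, which is not necessarily the unit used in the quoted table, so only the ratio between two builds is meaningful.

```python
import timeit


def encodes_per_second(codec, text, number=1000):
    """Return a rough count of text.encode(codec) calls per second.

    Hypothetical helper for illustration; takes the best of 3 runs.
    """
    elapsed = min(timeit.repeat(lambda: text.encode(codec),
                                number=number, repeat=3))
    return number / elapsed


text = 'A' * 10000
for codec in ('unicode-escape', 'raw-unicode-escape'):
    # Run the same script under each interpreter build and compare.
    print(codec, round(encodes_per_second(codec, text)))
```

Running the same script under the unpatched and the patched build should show whether the gap is reproducible.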

The most efficient case for _PyBytesWriter is when you only call _PyBytesWriter_Alloc() and _PyBytesWriter_Finish(), and the output string has exactly the allocated length. That should be the case when 'A'*10000 is encoded, no?
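
A quick way to see why the benchmark input looks like the simple case is a small Python sketch (the helper name escaped_length is made up for illustration). It only shows that pure ASCII input needs no escapes, so the output has the same length as the input. Whether that matches what the encoder initially allocates depends on the size it requests, which this sketch does not check.

```python
def escaped_length(text, codec):
    """Return the number of bytes produced by text.encode(codec).

    Hypothetical helper for illustration only.
    """
    return len(text.encode(codec))


text = 'A' * 10000
for codec in ('unicode-escape', 'raw-unicode-escape'):
    # Pure ASCII letters are copied through unchanged: one byte per character.
    print(codec, escaped_length(text, codec) == len(text))

# A non-ASCII character grows: U+20AC becomes the 6 bytes b'\\u20ac'.
print(escaped_length('\u20ac', 'unicode-escape'))
print(escaped_length('\u20ac', 'raw-unicode-escape'))
```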

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue16334>
_______________________________________