[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

STINNER Victor report at bugs.python.org
Thu Oct 8 19:04:15 EDT 2015


STINNER Victor added the comment:

Oh, I was surprised to see the same or worse performance for UTF-8/backslashreplace. In fact, I forgot to enable overallocation. With overallocation, it is now faster ;-)

I modified the API to put the "stack buffer" directly inside _PyBytesWriter. I also reworked _PyBytesWriter_Alloc() to call _PyBytesWriter_Prepare(), so _PyBytesWriter_Alloc() now supports overallocation as well. Supporting overallocation at the first allocation (_PyBytesWriter_Alloc) was part of the _PyBytesWriter design; that's why we have both _PyBytesWriter_Alloc() *and* _PyBytesWriter_Init(): it's possible to set overallocate=1 between init and alloc.
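
To illustrate the init / set overallocate / alloc sequence described above, here is a minimal sketch in C. It is a toy model, not the CPython implementation: the names toy_writer, toy_writer_init(), toy_writer_prepare(), toy_writer_alloc() and toy_writer_dealloc(), the 64-byte embedded buffer, and the 25% growth factor are all assumptions made for illustration. It only shows how an embedded small buffer and an alloc function that delegates to prepare could fit together.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy writer, NOT the real _PyBytesWriter. */
typedef struct {
    char *heap;         /* heap buffer, or NULL while the embedded buffer is used */
    size_t allocated;   /* capacity of the active buffer */
    size_t min_size;    /* total number of bytes requested so far */
    int overallocate;   /* if nonzero, reserve more than what was requested */
    char small[64];     /* "stack buffer" stored inside the writer itself */
} toy_writer;

void toy_writer_init(toy_writer *w)
{
    w->heap = NULL;
    w->allocated = sizeof(w->small);
    w->min_size = 0;
    w->overallocate = 0;
}

/* Start of the active buffer (embedded or heap). */
char *toy_writer_start(toy_writer *w)
{
    return w->heap ? w->heap : w->small;
}

/* Make room for `size` more bytes. `pos` is the current write position.
   Returns the (possibly moved) write position, or NULL on memory error. */
char *toy_writer_prepare(toy_writer *w, char *pos, size_t size)
{
    size_t used = (size_t)(pos - toy_writer_start(w));
    size_t new_size;
    char *buf;

    w->min_size += size;
    if (w->min_size <= w->allocated)
        return pos;                     /* enough room already */

    new_size = w->min_size;
    if (w->overallocate)
        new_size += new_size / 4;       /* arbitrary growth factor (assumption) */

    if (w->heap) {
        buf = realloc(w->heap, new_size);
        if (buf == NULL)
            return NULL;
    } else {
        buf = malloc(new_size);
        if (buf == NULL)
            return NULL;
        memcpy(buf, w->small, used);    /* move data out of the embedded buffer */
    }
    w->heap = buf;
    w->allocated = new_size;
    return buf + used;
}

/* First allocation: delegates to prepare, so it honors `overallocate`. */
char *toy_writer_alloc(toy_writer *w, size_t size)
{
    assert(w->min_size == 0);           /* must be the first request */
    return toy_writer_prepare(w, toy_writer_start(w), size);
}

void toy_writer_dealloc(toy_writer *w)
{
    free(w->heap);
    w->heap = NULL;
}
```

In this sketch a caller would run toy_writer_init(), optionally set w.overallocate = 1, and then call toy_writer_alloc(). Because the flag is read inside toy_writer_prepare(), the very first allocation already reserves extra room, so later prepare calls can often return without reallocating.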

I pushed my change since it didn't kill performance. It's only a little bit slower, and only on very short encodes: less than 500 ns. In other cases, performance is the same or better.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25318>
_______________________________________

