[issue37348] Optimize PyUnicode_GetString for short ASCII strings

STINNER Victor report at bugs.python.org
Thu Jun 20 12:33:54 EDT 2019


STINNER Victor <vstinner at redhat.com> added the comment:

> _PyUnicode_FromASCII(s, len) is faster than PyUnicode_FromString(s) because PyUnicode_FromString() uses temporary _PyUnicodeWriter to support UTF-8.

I don't understand how _PyUnicodeWriter could be slow. It does not overallocate by default. It's just a wrapper that implements efficient memory management.

> Oh, wait.  Why we used _PyUnicodeWriter here?

To optimize the handling of decoding errors: the error handler can return a replacement string longer than 1 character. Overallocation is used in this case.
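
To illustrate the point about replacement strings, here is a minimal Python-level sketch. It only uses the public codecs.register_error() API and does not touch _PyUnicodeWriter itself. The handler name "long_replacement" and the "<invalid>" marker are made up for this example.

```python
import codecs


def long_replacement(exc):
    """Hypothetical error handler: replace each undecodable byte
    sequence with a marker that is longer than one character."""
    if isinstance(exc, UnicodeDecodeError):
        # Return the replacement text and the position to resume decoding at.
        return ("<invalid>", exc.end)
    raise exc


# Register the handler under a name made up for this example.
codecs.register_error("long_replacement", long_replacement)

data = b"ab\xffcd"  # 0xff is never valid in UTF-8
decoded = data.decode("utf-8", errors="long_replacement")

# The decoded string has more characters than the input has bytes, so a
# buffer sized from the input length alone would be too small.
print(len(data), len(decoded), decoded)
```

Since the output can outgrow the input once an error handler runs, it seems reasonable for the decoder to go through a writer that can resize its buffer rather than a fixed-size allocation.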

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue37348>
_______________________________________

