[issue21118] str.translate is absurdly slow in majority of use cases (takes up to 60x longer than similar functions)

Sat Apr 5 14:49:22 CEST 2014

STINNER Victor added the comment:

Serhiy wrote:
> fast_translate.patch works only with ASCII input string and ASCII 1:1 mapping. Is this actually typical case?

I just checked the Python stdlib: as expected, all usages of
str.translate() except email.quoprimime use an ASCII 1:1 mapping. My
optimization is only used if the input string is ASCII, but I expect
that most strings are just ASCII.
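
To make the "ASCII input, ASCII 1:1 mapping" condition concrete, here is a rough Python-level sketch of the check. It is only an illustration, not the C code from fast_translate.patch: the helper names is_ascii_one_to_one() and fast_path_applies() are hypothetical, and treating a deletion (None) as "not 1:1" is an assumption.

```python
import string


def is_ascii_one_to_one(table):
    """Return True if every entry maps one ASCII code point to exactly
    one ASCII character (hypothetical helper, for illustration only)."""
    for key, value in table.items():
        if not 0 <= key < 128:
            return False
        if isinstance(value, int):
            if not 0 <= value < 128:
                return False
        elif isinstance(value, str):
            if len(value) != 1 or ord(value) >= 128:
                return False
        else:
            # None (deletion) or any other value: assumed not to be 1:1.
            return False
    return True


def fast_path_applies(text, table):
    """Assumed eligibility rule: ASCII-only input and an ASCII 1:1 table."""
    return all(ord(ch) < 128 for ch in text) and is_ascii_one_to_one(table)


# Tables similar to the ones quoted from distutils.
longopt_xlate = str.maketrans('-', '_')
WS_TRANS = {ord(ch): ' ' for ch in string.whitespace}

print(fast_path_applies('dry-run', longopt_xlate))    # ASCII text, 1:1 table
print(fast_path_applies('a\tb\nc', WS_TRANS))         # ASCII text, 1:1 table
print(fast_path_applies('caf\xe9-bar', longopt_xlate))  # non-ASCII input
print(is_ascii_one_to_one(str.maketrans('', '', '-')))  # deletion table
```

The first two calls describe the cases listed below; the last two show inputs that would fall back to the generic code path under these assumptions.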

** distutils: ASCII => ASCII (1:1)

   longopt_xlate = str.maketrans('-', '_')
and
   WS_TRANS = {ord(_wschar) : ' ' for _wschar in string.whitespace}
   ...
   text = text.translate(WS_TRANS)

** email.quoprimime:

    encoded = header_bytes.decode('latin1').translate(_QUOPRI_HEADER_MAP)
and
    body = body.translate(_QUOPRI_BODY_ENCODE_MAP)

=> my optimization is used if the input string contains only "safe
header characters" (a-z, A-Z, 0-9, space and "-!*+/"). It should be
the common case for emails.
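
A simplified sketch of why only "safe" input stays 1:1 here. The table below is a reconstruction modelled loosely on the header map in email.quoprimime, not a copy of it; HEADER_MAP, encode_header_text() and stays_one_to_one() are hypothetical names used for illustration.

```python
from string import ascii_letters, digits

# Assumed layout: every code point below 256 becomes "=XX" by default ...
HEADER_MAP = {c: '=%02X' % c for c in range(256)}
# ... except the safe header characters, which map to themselves ...
for ch in '-!*+/' + ascii_letters + digits:
    HEADER_MAP[ord(ch)] = ch
# ... and the space, which is written as an underscore.
HEADER_MAP[ord(' ')] = '_'


def encode_header_text(text):
    """Translate text with the sketch table above."""
    return text.translate(HEADER_MAP)


def stays_one_to_one(text):
    """True if every character of text maps to a single character."""
    return all(len(HEADER_MAP.get(ord(ch), ch)) == 1 for ch in text)


print(encode_header_text('Hello World'))   # only safe characters and a space
print(stays_one_to_one('Hello World'))
print(encode_header_text('caf\xe9'))       # U+00E9 expands to three characters
print(stays_one_to_one('caf\xe9'))
```

With only safe characters and spaces, each input character produces exactly one output character, which is the situation the optimization targets; any other character expands to a three-character "=XX" sequence.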

** rot13 encoding: ASCII 1:1

** idlelib.PyParse: ASCII 1:1

        str = str.translate(_tran)

** textwrap: ASCII 1:1

            text = text.translate(self.unicode_whitespace_trans)

** zipfile: ASCII 1:1

        arcname = arcname.translate(table)

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21118>
_______________________________________