[issue21118] str.translate is absurdly slow in majority of use cases (takes up to 60x longer than similar functions)
Serhiy Storchaka
report at bugs.python.org
Sat Apr 5 15:57:16 CEST 2014
Serhiy Storchaka added the comment:
On Saturday, 05 Apr 2014 12:49:22, you wrote:
> STINNER Victor added the comment:
>
> Serhiy wrote:
> > fast_translate.patch works only with ASCII input string and ASCII 1:1
> > mapping. Is this actually typical case?
> I just checked the Python stdlib: as expected, all usages of
> str.translate() except for email.quoprimime use an ASCII 1:1 mapping.
Because str.translate() is much slower than a series of str.replace() calls
(which are already optimized), some usages of str.translate() were rewritten
to use str.replace(). See for example html.escape(). This is what this issue
is about.
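For illustration, here is a minimal sketch of the two equivalent approaches.
The function names are hypothetical, and the mapping simply mirrors the
characters that html.escape() replaces when quote=True; it makes no claim
about relative timings.

```python
# Hypothetical helpers, for illustration only.
# One translation table covering the same characters as html.escape().
_ESCAPE_TABLE = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
})


def escape_with_translate(s):
    """Escape with a single str.translate() call."""
    return s.translate(_ESCAPE_TABLE)


def escape_with_replace(s):
    """Escape with a chain of str.replace() calls."""
    s = s.replace("&", "&amp;")  # must come first, or later "&" get re-escaped
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    s = s.replace('"', "&quot;")
    s = s.replace("'", "&#x27;")
    return s
```

Both functions accept non-ASCII input and return the same result, so the
choice between them is purely a matter of speed.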
> My
> optimization is only used if the input string is ASCII, but I expect
> that most strings are just ASCII.
In most (if not all) of these cases the input string can be non-ASCII.
> bench_translate.py: benchmark ASCII 1:1 but also ASCII 1:1 with deletion.
Could you please provide bench_translate.py?
> It will probably require more complex "cache". You may take a look at
> charmap codec which has such more complex cache (cache with 3 levels), see
> my message msg215301.
I was going to do this in the next step. A full cache can grow up to 1114112
characters, so I planned to cache only BMP characters (a cache with 2 levels).
You commit too fast; I can't keep up with you. ;)
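As a rough pure-Python sketch of what a two-level cache restricted to the BMP
could look like (this is not the C code from any patch; the class name, the
256-entry block size, and the lazy allocation are all assumptions made for
illustration):

```python
# Illustrative sketch only; names and sizes are assumptions.
BLOCK_BITS = 8
BLOCK_SIZE = 1 << BLOCK_BITS   # 256 code points per second-level block
BMP_LIMIT = 0x10000            # code points at or above this are not cached
_MISSING = object()            # marks a slot that has not been filled yet


class TwoLevelTranslateCache:
    def __init__(self, table):
        self._table = table
        # First level: one slot per 256-code-point block, allocated lazily.
        self._blocks = [None] * (BMP_LIMIT >> BLOCK_BITS)

    def _lookup_slow(self, cp):
        """Consult the mapping directly, following str.translate() rules."""
        try:
            value = self._table[cp]
        except LookupError:
            return chr(cp)         # unmapped: character is kept as is
        if value is None:
            return ""              # mapped to None: character is deleted
        if isinstance(value, int):
            return chr(value)      # mapped to a code point
        return value               # mapped to a string

    def lookup(self, cp):
        if cp >= BMP_LIMIT:
            return self._lookup_slow(cp)   # non-BMP: bypass the cache
        hi, lo = cp >> BLOCK_BITS, cp & (BLOCK_SIZE - 1)
        block = self._blocks[hi]
        if block is None:
            block = self._blocks[hi] = [_MISSING] * BLOCK_SIZE
        result = block[lo]
        if result is _MISSING:
            result = block[lo] = self._lookup_slow(cp)
        return result

    def translate(self, s):
        return "".join(self.lookup(ord(ch)) for ch in s)
```

With lazy allocation, a table that only touches ASCII allocates a single
256-entry block, while the first-level index stays fixed at 256 slots.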
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21118>
_______________________________________