[issue13621] Unicode performance regression in python3.3 vs python3.2

STINNER Victor report at bugs.python.org
Sat Dec 17 18:58:09 CET 2011


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

Sorted and grouped results. "replace", "find" and "concat" should be easy to fix; "format" is a little more complex; "strip" and "split" depend on "find" performance and require scanning the substring to ensure that the result is canonical (unless all inputs are ASCII, which is the case in these examples).
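
To illustrate why a substring may need to be scanned, here is a minimal sketch. It assumes that "canonical" refers to the compact string representation of Python 3.3 (PEP 393), where a string is stored with the narrowest character width that fits its widest character. The helper `storage_kind` is hypothetical and only models that rule in pure Python; it is not the C implementation.

```python
def storage_kind(text):
    """Hypothetical model: bytes per character needed to store `text`."""
    if not text:
        return 1
    highest = max(map(ord, text))  # full scan: this is the extra cost
    if highest < 0x100:
        return 1
    if highest < 0x10000:
        return 2
    return 4


parent = "\u20ac Hello!\n"      # EURO SIGN forces 2 bytes per character
child = parent[2:].strip()      # "Hello!" contains only ASCII characters

# The child cannot simply inherit the width of its parent: it has to be
# rescanned to learn that 1 byte per character is enough.
print(storage_kind(parent), storage_kind(child))
```

When the parent is already known to be ASCII, every substring is ASCII as well, so the scan can be skipped. That is the exception mentioned above.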

replace:

- "...text.with.2000.lines...replace("\n", " ") (*10): -37.668161%

find:

- ("A"*1000).find("B") (*1000): -30.379747%

- "Andrew".startswith("A") (*1000): -20.588235%
- "Andrew".startswith("Anders") (*1000): -23.529412%

- "Andrew".endswith("w") (*1000): -23.529412%
- "Andrew".endswith("Andrew") (*1000): -22.857143%
- "Andrew".endswith("Anders") (*1000): -23.529412%

- "B" in "A"*1000 (*1000): -32.089552%

concat:

- "Andrew"+"Dalke" (*1000): -23.076923%

format:

- "The %(k1)s is %(k2)s the %(k3)s."%{"k1":"x","k2":"y","k3":"z",} (*1000): -49.411765%

strip:

- "\nHello!\n".strip() (*1000): -33.333333%
- "Hello!\n".strip() (*1000): -35.714286%
- "\nHello!".strip() (*1000): -28.571429%

- "Hello\t   \t".rstrip() (*1000): -33.333333%
- "\t   \tHello".rstrip() (*1000): -33.333333%
- "Hello!\n".rstrip() (*1000): -35.714286%
- "\nHello!".rstrip() (*1000): -35.714286%

split:

- dna.split("ACTAT") (*10): -21.066667%
- ("Here are some words. "*2).split() (*1000): -22.105263%
- "this\nis\na\ntest\n".split("\n") (*1000): -23.437500%
- "this--is--a--test--of--the--emergency--broadcast--system".split("--") (*1000): -22.429907%

- "this\nis\na\ntest\n".rsplit("\n") (*1000): -23.437500%

- ("A"*1000).rpartition("A") (*1000): -21.212121%
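
A few of the statements listed above can be timed with the standard `timeit` module, as in the rough sketch below. The helper `best_time`, the selection of statements and the repeat counts are assumptions made for illustration; this is not the benchmark that produced the figures in this comment. Running the same script under each interpreter gives timings that can be compared by hand.

```python
import timeit

# Statements copied from the lists above (raw strings keep the escapes intact).
STATEMENTS = [
    r'("A"*1000).find("B")',
    r'"Andrew".startswith("A")',
    r'"\nHello!\n".strip()',
    r'"this\nis\na\ntest\n".split("\n")',
]


def best_time(stmt, number=1000, repeat=5):
    """Return the best time in seconds for `number` executions of `stmt`."""
    return min(timeit.repeat(stmt, number=number, repeat=repeat))


for stmt in STATEMENTS:
    print("%-40s %.6f s" % (stmt, best_time(stmt)))
```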

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13621>
_______________________________________

