ASCII versus non-ASCII [was Re: flaming vs accuracy [was Re: Performance of int/long in Python 3]]

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun Mar 31 04:22:48 EDT 2013


On Sun, 31 Mar 2013 00:35:23 -0700, jmfauth wrote:


> This is not really the problem. "Serious users" may notice sooner or
> later, Python and Unicode are walking in opposite directions
> (technically and in spirit).
> 
>>>> timeit.repeat("'a' * 1000 + 'ẞ'")
> [1.1088995672090292, 1.0842266613261913, 1.1010779011941594]
>>>> timeit.repeat("'a' * 1000 + 'z'")
> [0.6362570846925735, 0.6159128762502917, 0.6200501673623791]

Perhaps you should stick to Python 3.2, where ASCII strings are no faster 
than non-ASCII strings.


Python 3.2 versus Python 3.3, no significant difference:

# 3.2
py> import timeit
py> timeit.repeat("'a' * 1000 + 'ẞ'")
[1.7418999671936035, 1.7198870182037354, 1.763346004486084]

# 3.3
py> import timeit
py> timeit.repeat("'a' * 1000 + 'ẞ'")
[1.8083378580026329, 1.818592812011484, 1.7922867869958282]



Python 3.2, ASCII vs Non-ASCII:

py> timeit.repeat("'a' * 1000 + 'z'")
[1.756322135925293, 1.8002049922943115, 1.721085958480835]
py> timeit.repeat("'a' * 1000 + 'ẞ'")
[1.7209150791168213, 1.7162668704986572, 1.7260780334472656]



In other words, if you stick to non-ASCII strings, Python 3.3 is no 
slower than Python 3.2.
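
For anyone who wants to repeat the comparison on their own interpreter, here is a minimal sketch of the same measurement as a script. The names (CASES, best_times) and the reduced run count are my own assumptions, not anything from the sessions above. It also moves the operands into timeit's setup string, because some newer CPython releases may fold a constant expression like 'a' * 1000 + 'z' at compile time, in which case the timing would no longer measure the concatenation at all. Run it once under each Python version you care about and compare the numbers.

```python
import timeit

# Each case is (setup, statement). The operands live in setup so the
# compiler cannot fold the expression into a single constant.
CASES = {
    "ascii": ("base = 'a'; tail = 'z'", "base * 1000 + tail"),
    "non-ascii": ("base = 'a'; tail = 'ẞ'", "base * 1000 + tail"),
}


def best_times(number=100000, repeat=3):
    """Return the fastest total time in seconds for each case.

    number and repeat are illustrative defaults (assumption); the
    sessions above used timeit's default of one million runs.
    """
    results = {}
    for label, (setup, stmt) in CASES.items():
        runs = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
        # The minimum is the run least disturbed by other system activity.
        results[label] = min(runs)
    return results


if __name__ == "__main__":
    for label, seconds in best_times().items():
        print("%-10s %.4f s" % (label, seconds))
```

Taking the minimum of the repeats, rather than the mean, follows the usual advice in the timeit documentation: the slower runs mostly reflect interference from other processes, not the code under test.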



-- 
Steven


