How do I display unicode value stored in a string variable using ord()
Terry Reedy
tjreedy at udel.edu
Sun Aug 19 17:03:46 EDT 2012
On 8/19/2012 1:03 PM, Blind Anagram wrote:
> Running Python from a Windows command prompt, I got the following on
> Python 3.2.3 and 3.3 beta 2:
>
> "python33\python" -m timeit "('abc' * 1000).replace('c', 'de')"
> 10000 loops, best of 3: 39.3 usec per loop
> "python33\python" -m timeit "('ab…' * 1000).replace('…', '……')"
> 10000 loops, best of 3: 51.8 usec per loop
> "python33\python" -m timeit "('ab…' * 1000).replace('…', 'x…')"
> 10000 loops, best of 3: 52 usec per loop
> "python33\python" -m timeit "('ab…' * 1000).replace('…', 'œ…')"
> 10000 loops, best of 3: 50.3 usec per loop
> "python33\python" -m timeit "('ab…' * 1000).replace('…', '€…')"
> 10000 loops, best of 3: 51.6 usec per loop
> "python33\python" -m timeit "('XYZ' * 1000).replace('X', 'éç')"
> 10000 loops, best of 3: 38.3 usec per loop
> "python33\python" -m timeit "('XYZ' * 1000).replace('Y', 'p?')"
> 10000 loops, best of 3: 50.3 usec per loop
>
> "python32\python" -m timeit "('abc' * 1000).replace('c', 'de')"
> 10000 loops, best of 3: 24.5 usec per loop
> "python32\python" -m timeit "('ab…' * 1000).replace('…', '……')"
> 10000 loops, best of 3: 24.7 usec per loop
> "python32\python" -m timeit "('ab…' * 1000).replace('…', 'x…')"
> 10000 loops, best of 3: 24.8 usec per loop
> "python32\python" -m timeit "('ab…' * 1000).replace('…', 'œ…')"
> 10000 loops, best of 3: 24 usec per loop
> "python32\python" -m timeit "('ab…' * 1000).replace('…', '€…')"
> 10000 loops, best of 3: 24.1 usec per loop
> "python32\python" -m timeit "('XYZ' * 1000).replace('X', 'éç')"
> 10000 loops, best of 3: 24.4 usec per loop
> "python32\python" -m timeit "('XYZ' * 1000).replace('Y', 'p?')"
> 10000 loops, best of 3: 24.3 usec per loop
This is one test repeated 7 times with essentially irrelevant
variations. The difference is smaller on my system (50%). Others report
seeing 3.3 as faster. When I asked on pydev, the answer was: don't
bother opening a tracker issue unless I was personally interested in
investigating why search is relatively slow in 3.3 on Windows. Any
change would have to avoid slowing other operations or severely
impacting search on other systems. I suggest the same answer to you.
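For anyone who wants to rerun the quoted comparison without shell
quoting headaches, the same measurement can be sketched with the timeit
module directly. This is my own equivalent of two of the commands above,
not part of the original runs; the number/repeat values mirror what
python -m timeit reports:

from timeit import repeat

# Equivalent of: python -m timeit "('abc' * 1000).replace('c', 'de')"
# repeat() runs several trials; take the best, as -m timeit does.
ascii_best = min(repeat("('abc' * 1000).replace('c', 'de')",
                        number=10000, repeat=3))

# '\u2026' is the HORIZONTAL ELLIPSIS used in the quoted non-ascii runs.
nonascii_best = min(repeat("('ab\u2026' * 1000).replace('\u2026', 'x\u2026')",
                           number=10000, repeat=3))

# Convert best total time for 10000 loops to usec per loop.
print('ascii     best: %.1f usec per loop' % (ascii_best / 10000 * 1e6))
print('non-ascii best: %.1f usec per loop' % (nonascii_best / 10000 * 1e6))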
If you seriously want to compare old and new unicode, go to
http://hg.python.org/cpython/file/tip/Tools/stringbench/stringbench.py
and click raw to download. Run on 3.2 and 3.3, ignoring the bytes times.
Here is a version of the first comparison from stringbench (with the
import it needs):

from timeit import timeit
print(timeit("('NOW IS THE TIME FOR ALL GOOD PEOPLE TO COME TO THE AID OF PYTHON' * 10).lower()"))
Results are 5.6 for 3.2 and 0.8 for 3.3. WOW! 3.3 is 7 times faster!
OK, not fair. I cherry-picked. The 7x speedup in 3.3 is likely at
least partly independent of the PEP 393 unicode change. The same test in
stringbench for bytes is twice as fast in 3.3 as in 3.2, but only 2x, not
7x. In fact, it may have been the bytes/unicode comparison in 3.2 that
suggested that unicode case conversion of ASCII chars might be made faster.
The sum of the 3.3 unicode times is 109 versus 110 for 3.3 bytes and 125
for 3.2 unicode. This unweighted sum is not really fair, since the raw
times vary by a factor of at least 100. But it does suggest that anyone
claiming that 3.3 unicode is overall 'slower' than 3.2 unicode has some
work to do.
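One way to make such a summary less sensitive to that factor-of-100
spread is to average per-test ratios instead of summing raw times. A
sketch, using made-up illustrative numbers rather than actual
stringbench output:

import math

# Hypothetical per-test times in seconds; NOT real stringbench results.
t32 = [0.005, 0.12, 1.9, 50.0]   # 3.2 unicode
t33 = [0.004, 0.10, 2.1, 45.0]   # 3.3 unicode

# The geometric mean of the 3.3/3.2 ratios weights every test equally,
# unlike a raw sum, which is dominated by the slowest tests.
ratios = [b / a for a, b in zip(t32, t33)]
geomean = math.exp(sum(map(math.log, ratios)) / len(ratios))
print('geometric mean ratio (3.3/3.2): %.3f' % geomean)

A ratio below 1.0 would mean 3.3 is faster on a typical test, regardless
of how long that test takes in absolute terms.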
There is also this. On my machine, the lowest bytes-time/unicode-time
for 3.3 is .71. This suggests that there is not a lot of fluff left in
the unicode code, and that not much is lost by the bytes to unicode
switch for strings.
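That bytes-time/unicode-time ratio can be measured for any single
operation in the same way; the choice of .replace here is mine, not the
stringbench test that produced the .71 figure:

from timeit import repeat

# Time the same replace on str and on bytes, best of 3 trials each.
unicode_t = min(repeat("('abc' * 1000).replace('c', 'de')",
                       number=10000, repeat=3))
bytes_t = min(repeat("(b'abc' * 1000).replace(b'c', b'de')",
                     number=10000, repeat=3))
print('bytes/unicode time ratio: %.2f' % (bytes_t / unicode_t))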
--
Terry Jan Reedy