Py 3.3, unicode / upper()
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Thu Dec 20 00:51:48 EST 2012
On Thu, 20 Dec 2012 00:32:42 -0500, Terry Reedy wrote:
> In the unicode case, Jim discovered that find was several times slower
> in 3.3 than 3.2 and claimed that that was a reason to not use 3.3. I ran
> the complete stringbench.py and discovered that find (and consequently
> find and replace) are the only operations with such a slowdown. I also
> discovered that another at least as common operation, encoding strings
> that only contain ascii characters to ascii bytes for transmission, is
> several times as fast in 3.3. So I reported that unless one is only
> finding substrings in long strings, there is no reason to not upgrade to
> 3.3.
Yes, and if you remember, Jim (jfm) based his complaints on very possibly
the worst edge-case for the new Unicode implementation:
- generate a large string of characters
- replace every character in that string with another character
From memory:
    s = "a"*100000
    s = s.replace("a", "b")
or equivalent. Hardly representative of normal string processing, and
likely to be the worst-performing operation on new Unicode strings. And
yet even so, many people reported either a mild slowdown or, in a few
cases, a small speedup.
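
For anyone who wants to measure this on their own build, here is a minimal
sketch using the standard timeit module. The function name time_replace and
the repeat/number counts are illustrative choices of mine, not part of
stringbench.py, and the figures will vary with machine and Python version.

```python
import timeit


def time_replace(n=100000, repeat=5, number=100):
    """Return the best per-call time, in seconds, for replacing every
    character of an n-character string (the edge case described above).

    time_replace is a hypothetical helper name used for illustration only.
    """
    s = "a" * n
    # timeit.repeat returns one total time per round; each round runs
    # the statement `number` times.
    timings = timeit.repeat(lambda: s.replace("a", "b"),
                            repeat=repeat, number=number)
    # The minimum is the least noisy estimate; divide to get per-call time.
    return min(timings) / number


if __name__ == "__main__":
    best = time_replace()
    print("best per-call time: %.1f microseconds" % (best * 1e6))
```

Running the same script under 3.2 and 3.3 and comparing the two figures
gives a like-for-like comparison for this one operation only; it says
nothing about string handling in general.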
--
Steven