"More About Unicode in Python 2 and 3"
Chris Angelico
rosuav at gmail.com
Wed Jan 8 18:45:37 EST 2014
On Thu, Jan 9, 2014 at 10:34 AM, <rdsteph at mac.com> wrote:
> I just meant to say that internet programming using ASCII urls is so common and important that it hurts that Python 3 makes it so much harder. It sure would be great if Python 3 could be improved to allow such programming to be done using ASCII urls without requiring all the unicode overhead.
>
> Armin is right. Calling his post a rant doesn't help.
There's one big problem with that theory. We've been looking, on this
list and on python-ideas, at some practical suggestions for adding
something to Py3 that will help. So far, lots of people have suggested
things, and the complainers haven't attempted to explain what they
actually need. Hard facts and examples would help enormously.
Incidentally, before referring to "all the Unicode overhead", it would
help to actually measure the overhead of encoding and decoding.
Python 2.7:
>>> import timeit
>>> timeit.timeit("a.encode().decode()","a=u'a'*1000",number=500000)
8.787162614242874
Python 3.4:
>>> import timeit
>>> timeit.timeit("a.encode().decode()","a=u'a'*1000",number=500000)
1.7354552045022515
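For anyone who wants to repeat the measurement on their own machine,
here is a rough sketch of the same round trip as a standalone script.
The function name roundtrip_seconds, the repeat count and the
non-ASCII comparison string are all just assumptions for illustration;
the numbers will vary by machine and Python build.

```python
import timeit

def roundtrip_seconds(text, number=100000):
    """Return the seconds taken to UTF-8 encode and decode `text` `number` times."""
    return timeit.timeit(
        lambda: text.encode("utf-8").decode("utf-8"),
        number=number,
    )

# Illustrative inputs: an all-ASCII string and a non-ASCII one of the same length.
ascii_text = u"a" * 1000
non_ascii_text = u"\u00e9" * 1000

print("all-ASCII round trip:", roundtrip_seconds(ascii_text))
print("non-ASCII round trip:", roundtrip_seconds(non_ascii_text))
```

The encoding is spelled out as "utf-8" so that the same script runs
unchanged on 2.7 and 3.x.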
Since 3.3, which introduced the compact string representation of
PEP 393, the cost of UTF-8 encoding/decoding an all-ASCII string is
extremely low. So the real cost isn't in run-time performance but in
code complexity. Would it be easier to work with ASCII URLs with a
one-letter-name helper function? I never got an answer to that
question.
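To make the question concrete, such a helper might look something like
the sketch below (Python 3). The name a() and the convert-by-type
behaviour are purely hypothetical, made up here for illustration; this
is not an existing stdlib function.

```python
def a(value):
    """Hypothetical helper: flip between ASCII-only str and bytes.

    bytes -> str, str -> bytes. Anything non-ASCII raises a UnicodeError
    subclass rather than being silently mangled.
    """
    if isinstance(value, bytes):
        return value.decode("ascii")
    return value.encode("ascii")

url = "http://example.com/path?q=1"   # example URL, assumed for illustration
raw = a(url)                          # str -> bytes for the wire
assert isinstance(raw, bytes)
assert a(raw) == url                  # bytes -> str, back where we started
```

Whether something this small actually removes the pain is exactly
what needs hard examples to decide.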
ChrisA