[issue17348] Unicode - encoding seems to be lost for inputs of unicode chars in IDLE

Mon Apr 22 05:17:42 CEST 2013

Tomoki Imai added the comment:

Sorry, I forgot to note my environment.

I'm using Arch Linux.
$ uname -a
Linux manaka 3.8.7-1-ARCH #1 SMP PREEMPT Sat Apr 13 09:01:47 CEST 2013 x86_64 GNU/Linux

And the Python version is:
$ python --version
Python 2.7.4

IDLE's version is the same, 2.7.4, downloaded from the following link.
http://www.python.org/download/releases/2.7.4/

In IDLE, I repeated the original author's attempts.

Python 2.7.4 (default, Apr  6 2013, 19:20:36)
[GCC 4.8.0] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> c = u'€'
>>> ord(c)

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    ord(c)
TypeError: ord() expected a character, but string of length 3 found
>>> c.encode('utf-8')
'\xc3\xa2\xc2\x82\xc2\xac'
>>> c
u'\xe2\x82\xac'
>>> print c
â‚¬
>>> c = u'\u20ac'
>>> ord(c)
8364
>>> c.encode('utf-8')
'\xe2\x82\xac'
>>> c
u'\u20ac'
>>> print c
€
>>>
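
The values in this session are consistent with the UTF-8 bytes of '€' being read as three separate Latin-1 characters. The following is a minimal sketch, written in Python 3 syntax so it can be run outside IDLE. It only reproduces the symptom under that assumption; it is not IDLE's actual code path, and the variable names are just for illustration.

```python
# Assumption: the typed character arrives as UTF-8 bytes that are then
# treated as Latin-1 code points (one character per byte).
euro = u'\u20ac'
utf8_bytes = euro.encode('utf-8')          # b'\xe2\x82\xac'
misread = utf8_bytes.decode('latin-1')     # u'\xe2\x82\xac', length 3
double_encoded = misread.encode('utf-8')   # b'\xc3\xa2\xc2\x82\xc2\xac'

print(len(misread))
print(repr(double_encoded))

# Undoing the misreading recovers the intended character.
repaired = misread.encode('latin-1').decode('utf-8')
print(repaired == euro)
```

The length of 3 matches the TypeError from ord(c), and the six bytes match the output of c.encode('utf-8') above.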

I do have a problem, but it is different from the original report.
After my fix:

Python 2.7.4 (default, Apr  6 2013, 19:20:36)
[GCC 4.8.0] on linux2
Type "copyright", "credits" or "license()" for more information.
>>> c = u'€'
>>> ord(c)
8364
>>> c.encode('utf-8')
'\xe2\x82\xac'
>>> c
u'\u20ac'
>>> print c
€
>>>

It works.

Using a Unicode escape is one solution.
But we Japanese can type u'こんにちは' in just 5 or 10 keystrokes.
Other people who use Unicode literals for their own language are in the same situation.
Why should IDLE users (probably beginners) have to use such a workaround?
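
To give a rough idea of what the escape workaround costs, here is a small sketch (Python 3 syntax, runnable outside IDLE) that converts the five-character greeting into its escaped spelling. The variable names are only for illustration.

```python
# The greeting as a user would naturally type it.
greeting = u'こんにちは'

# The same text spelled with \uXXXX escapes, as the workaround requires.
escaped = greeting.encode('unicode_escape').decode('ascii')

print(len(greeting))   # number of characters in the greeting
print(escaped)         # the escaped spelling
print(len(escaped))    # number of characters to type for the escapes
```

Each character becomes a six-character escape, so the escaped form is six times longer than the original text.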

Of course, using Python 3 is the best way.
All beginners should start with Python 3 now.
But there are people, including me, who have to use Python 2 because of libraries.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17348>
_______________________________________