Python 3.0 crashes displaying Unicode at interactive prompt

Vlastimil Brom vlastimil.brom at gmail.com
Sat Dec 13 16:03:14 EST 2008


2008/12/13 John Machin <sjmachin at lexicon.net>:
>
> Python 2.6.1 (r261:67517, Dec  4 2008, 16:51:00) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x = u'\u9876'
> >>> x
> u'\u9876'
>
> # As expected
>
> Python 3.0 (r30:67507, Dec  3 2008, 20:14:27) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x = '\u9876'
> >>> x
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "C:\python30\lib\io.py", line 1491, in write
>    b = encoder.encode(s)
>  File "C:\python30\lib\encodings\cp850.py", line 19, in encode
>    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\u9876' in
> position 1: character maps to <undefined>
>
> # *NOT* as expected (by me, that is)
>
> Is this the intended outcome?

I also found this a bit surprising, but it seems to be the intended
behaviour on a non-Unicode console; see the What's New document:

http://docs.python.org/3.0/whatsnew/3.0.html
"PEP 3138: The repr() of a string no longer escapes non-ASCII
characters. It still escapes control characters and code points with
non-printable status in the Unicode standard, however."
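
A minimal sketch of what that change means in practice, as far as I understand it. The cp850 codec name is taken from the traceback quoted above; calling encode() by hand only approximates what the console writer does when it outputs the repr.

```python
s = '\u9876'

# In Python 3, repr() keeps a printable non-ASCII character as-is...
assert repr(s) == "'\u9876'"
# ...but still escapes control characters such as a newline.
assert repr('\n') == "'\\n'"

# cp850 has no mapping for U+9876, so encoding the repr for output fails,
# which is roughly what happens when the prompt tries to display it.
try:
    repr(s).encode('cp850')
    encode_failed = False
except UnicodeEncodeError as exc:
    encode_failed = True
    print('encode failed:', exc.reason)
```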

I get the same error in the Windows cmd console (IDLE prints the respective
glyph correctly).
To get the old behaviour of repr(), one can use ascii(), I suppose.

Python 3.0 (r30:67507, Dec  3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

>>> repr('\u9876')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python30\lib\io.py", line 1491, in write
    b = encoder.encode(s)
  File "C:\Python30\lib\encodings\cp852.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u9876' in position 2: character maps to <undefined>
>>> '\u9876'.encode("unicode-escape")
b'\\u9876'
>>> ascii('\u9876')
"'\\u9876'"
>>>
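
If one wants to keep the characters the console can show and escape only the rest, something along these lines might work. This is only a sketch: console_safe_repr is a hypothetical helper name of mine, not anything in the standard library, and it assumes sys.stdout.encoding reports the console code page.

```python
import sys

def console_safe_repr(obj, encoding=None):
    """Return repr(obj), with characters the encoding lacks as backslash escapes."""
    if encoding is None:
        # Assumption: stdout reports the console encoding; fall back to ASCII.
        encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'
    # backslashreplace turns unencodable characters into \uXXXX escapes
    # instead of raising UnicodeEncodeError.
    data = repr(obj).encode(encoding, errors='backslashreplace')
    return data.decode(encoding)

# With the cp852 code page from the session above, U+9876 gets escaped,
# giving the same text as ascii() for this string.
print(console_safe_repr('\u9876', encoding='cp852'))
print(ascii('\u9876'))
```

Unlike ascii(), this leaves characters that the code page does support (accented Latin letters on cp852, for instance) unescaped.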


