Python 3.0b2 cannot map '\u12b'

Terry Reedy tjreedy at udel.edu
Mon Sep 1 14:25:01 EDT 2008



Marc 'BlackJack' Rintsch wrote:
> On Mon, 01 Sep 2008 02:27:54 -0400, Terry Reedy wrote:
> 
>> I doubt the OP 'chose' cp437.  Why is Python using cp437 even when the
>> default encoding is utf-8?
>>
>> On WinXP
>>  >>> sys.getdefaultencoding()
>> 'utf-8'
>>  >>> s='\u012b'
>>  >>> s
>> Traceback (most recent call last):
>>    File "<stdin>", line 1, in <module>
>>    File "C:\Program Files\Python30\lib\io.py", line 1428, in write
>>      b = encoder.encode(s)
>>    File "C:\Program Files\Python30\lib\encodings\cp437.py", line 19, in encode
>>      return codecs.charmap_encode(input,self.errors,encoding_map)[0]
>> UnicodeEncodeError: 'charmap' codec can't encode character '\u012b' in position 1: character maps to <undefined>
> 
> Most likely because Python figured out that the terminal expects cp437.  
> What does `sys.stdout.encoding` say?

The interpreter in the command prompt window says CP437.
The IDLE Window says 'cp1252', and it handles the character fine.
Given that Windows itself can handle the character, why is Python or the 
Command Prompt limiting the output?
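
The failure in the traceback can be reproduced without a console by 
encoding the string directly. This is only a minimal sketch: 
`try_encode` is a hypothetical helper written for this illustration, not 
something from the standard library or from the original session.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec has no mapping."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

s = '\u012b'  # LATIN SMALL LETTER I WITH MACRON

cp437_result = try_encode(s, 'cp437')  # cp437 has no slot for U+012B
utf8_result = try_encode(s, 'utf-8')   # UTF-8 can encode any code point

# An error handler avoids the traceback at the cost of a lossy escape.
escaped = s.encode('cp437', errors='backslashreplace')
```

The `backslashreplace` handler is one way to keep output flowing on a 
limited console; it does not make the character displayable.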

Characters the IDLE window cannot display (like surrogate pairs) it 
displays as boxes.  But if I cut '[][]' (4 chars) and paste it into 
Firefox, I get 3 chars: '[]', where [] has some digits instead of being 
empty.  It is really confusing when every window on 'unicode-based' 
Windows handles a different subset.  Is this the fault of Windows or of 
Python and IDLE (those two being more limited than Firefox)?
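
Part of the character-count confusion may come from how characters 
outside the BMP are stored in UTF-16: each one takes two 16-bit code 
units (a surrogate pair), so a program that counts code units sees twice 
as many "chars" as one that counts code points. A rough sketch, where 
`utf16_units` is a hypothetical helper and the sample character is my 
own choice, not one from the session above:

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode('utf-16-le')
    return [int.from_bytes(data[i:i + 2], 'little')
            for i in range(0, len(data), 2)]

clef = '\U0001D11E'         # MUSICAL SYMBOL G CLEF, outside the BMP
units = utf16_units(clef)   # one code point, two code units

bmp_char = '\u012b'         # inside the BMP
bmp_units = utf16_units(bmp_char)  # one code point, one code unit
```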

>> To put it another way, how can one 'choose' utf-8 for display to screen?
> 
> If the terminal expects cp437 then displaying utf-8 might give some 
> problems.

My screen displays whatever Windows tells the graphics card to tell the 
screen to display.  In OpenOffice, I can select a Unicode font that 
displays at least everything in the Basic Multilingual Plane (BMP).
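
As for 'choosing' utf-8 for output: one possible approach is to wrap a 
byte stream in an `io.TextIOWrapper` with an explicit encoding. The 
sketch below uses an in-memory `io.BytesIO` as a stand-in for the 
console's byte stream (an assumption made so the example runs anywhere), 
and `make_utf8_writer` is a hypothetical name.

```python
import io

def make_utf8_writer(byte_stream):
    """Wrap a binary stream so that text written to it is UTF-8 encoded."""
    return io.TextIOWrapper(byte_stream, encoding='utf-8', newline='\n')

buf = io.BytesIO()           # stand-in for a real byte stream
out = make_utf8_writer(buf)
out.write('\u012b')          # no UnicodeEncodeError: UTF-8 has a mapping
out.flush()
written = buf.getvalue()
```

This only controls which bytes Python emits.  If the terminal still 
expects cp437, as Marc points out, those bytes would likely be shown as 
the wrong characters.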

Terry Jan Reedy



