iso_8859_1 mystery/tkinter

phil phillip.watts at anvilcom.com
Wed May 18 21:23:57 EDT 2005


Thanks.
My confusion came from man 3 printf, where u means unsigned.
But of course that would come after the %.
Never paid any attention to unicode,
but will now.


I'll figure out on my own why Tkinter and the
WinXP console treated the character differently or used
different code pages.  Thanks.

Jeff Epler wrote:

> This isn't about the "sign bit"; it's about assumed encodings for byte
> strings.
> 
> In iso_8859_1 and Unicode, the character with value 0xb0 is DEGREE SIGN.
> In other character sets, that may not be true. For instance, in the
> Windows "code page 437", it is u'\u2591' aka LIGHT SHADE (a half-tone pattern).
> 
> When you write code like
>     x = '%c' % (0xb0)
> and then pass x to a Tkinter call, Tkinter treats it as a string encoded
> in some system-default encoding, which could give DEGREE SIGN, could
> give LIGHT SHADE, or could give other characters (a Thai user of Windows
> might see THAI CHARACTER THO THAN, for instance, and I would see a
> question mark because I use utf-8 and this is an invalid byte sequence).
> 
> By using
>     x = u'%c' % (0xb0)
> you get a Unicode string, and there is no confusion about the meaning of
> the symbol: you always get DEGREE SIGN.
> 
> Jeff
> 
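
For anyone who wants to see Jeff's point in action, here is a minimal
sketch. It makes some assumptions that are not in the thread: it uses
Python 3 syntax (the thread is Python 2, where '%c' % (0xb0) gave a byte
string), the helper name name_under is made up for illustration, and cp874
is assumed to be the code page a Thai Windows user would have. It decodes
the same single byte under several encodings and prints the Unicode name
of the result.

```python
import unicodedata

# The single byte that '%c' % (0xb0) produced as a Python 2 byte string.
raw = bytes([0xB0])


def name_under(codec):
    """Decode the byte with the given codec and return the Unicode name."""
    return unicodedata.name(raw.decode(codec))


# The same byte means different characters under different encodings.
for codec in ("iso-8859-1", "cp437", "cp874"):
    print(codec, name_under(codec))

# On its own, 0xB0 is not valid UTF-8; errors="replace" substitutes U+FFFD.
replaced = raw.decode("utf-8", errors="replace")
print(repr(replaced))

# A text (Unicode) string stores the code point itself, so nothing is guessed.
degree = "%c" % 0xB0
print(unicodedata.name(degree))
```

The byte never changes; only the encoding assumed by whoever receives it
does. Handing over a Unicode string removes that guess.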





