iso_8859_1 mystery/tkinter
phil
phillip.watts at anvilcom.com
Wed May 18 21:23:57 EDT 2005
Thanks.
My confusion came from man 3 printf, where u means unsigned.
But of course that only applies to a u that comes after the %.
I never paid any attention to Unicode,
but I will now.
I'll figure out on my own why Tkinter and
the WinXP console treated the character differently or used
different code pages. Thanks.
Jeff Epler wrote:
> This isn't about the "sign bit"; it's about assumed encodings for byte
> strings.
>
> In iso_8859_1 and Unicode, the character with value 0xb0 is DEGREE SIGN.
> In other character sets, that may not be true. For instance, in the
> Windows "code page 437", it is u'\u2591' aka LIGHT SHADE (a half-tone pattern).
>
> When you write code like
>     x = '%c' % (0xb0)
> and then pass x to a Tkinter call, Tkinter treats it as a string encoded
> in some system-default encoding, which could give DEGREE SIGN, could
> give LIGHT SHADE, or could give other characters (a Thai user of Windows
> might see THAI CHARACTER THO THAN, for instance, and I would see a
> question mark because I use utf-8 and this is an invalid byte sequence).
>
> By using
>     x = u'%c' % (0xb0)
> you get a Unicode string, and there is no confusion about the meaning of
> the symbol: you always get DEGREE SIGN.
>
> Jeff
>
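The effect Jeff describes can be reproduced by decoding the single byte 0xb0 under several encodings. This is only a minimal sketch, written in Python 3 syntax (the thread itself is about Python 2 byte strings). The helper name describe is made up for illustration, and cp874 is assumed here to stand in for the code page a Thai Windows user would have.

```python
import unicodedata

# The single byte that '%c' % (0xb0) produces as a Python 2 byte string.
raw = bytes([0xB0])

def describe(encoding):
    """Return the Unicode name of byte 0xb0 when decoded with `encoding`."""
    try:
        ch = raw.decode(encoding)
    except UnicodeDecodeError:
        # A lone 0xb0 is not a valid utf-8 sequence.
        return "<invalid byte sequence>"
    return unicodedata.name(ch)

for enc in ("iso-8859-1", "cp437", "cp874", "utf-8"):
    print(enc, "->", describe(enc))

# A text (Unicode) string carries no encoding ambiguity:
# code point 0xb0 is always DEGREE SIGN.
text = "%c" % 0xB0
print(unicodedata.name(text))
```

The same byte yields a different character, or no character at all, depending on which encoding the receiver assumes, whereas the Unicode string names one code point regardless of platform.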