tkinter + unicode + bug or feature??

Fri Jan 24 22:37:57 EST 2003

Bob van der Poel wrote:
> BTW, I really think this is a bug. If you enter "ascii" text into the 
> entry box you get() returns a string, if you enter "extended ascii" you 
> get a unicode string. And since one can't tell beforehand what the user 
> is going to enter... Add to this the fact that the behaviour is not 
> documented in the tkinter reference manual (yes, it is the tcl/tk manual).

So what do you think the correct behaviour should be?

> Well, yes. Being on the US-side (altho I do live in Canada and we're a 
> bit less centric in our thinking) I was just referring to a "normal" 
> encoding...whatever that is :)

There is no such thing.

> Yes, locale.getlocale() works fine. Now, if I do use encode on these
> strings, will I run into problems if the user's locale is not encodable
> into 8bits? Or can that not happen?

Depends on what you mean by "8bits". You might have meant to ask

Q. Could it happen that the user enters characters that cannot be 
represented in the 'normal encoding'?
A. Yes, this can happen. If you merely want to compare this to another 
byte string, you should decode that byte string to Unicode, and perform 
the comparison then.
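
A minimal sketch of that comparison, written in Python 3 syntax (where str 
is the Unicode type and bytes is the byte string). The sample values and 
the names stored_bytes, assumed_encoding and entered are made up for 
illustration; the encoding is an assumption you have to supply yourself, 
since the byte string does not carry one.

```python
# Hypothetical data: a byte string from elsewhere, plus the encoding we
# assume it was written in (an assumption, not something Python can detect).
stored_bytes = b"caf\xe9"            # "cafe" with e-acute, as ISO-8859-1
assumed_encoding = "iso-8859-1"

# Hypothetical stand-in for what the entry widget's get() returned.
entered = "caf\u00e9"

# Decode the byte string to Unicode, then compare Unicode with Unicode.
matches = stored_bytes.decode(assumed_encoding) == entered

# Going the other way can fail: the euro sign has no ISO-8859-1 code point,
# so encoding it raises UnicodeEncodeError.
try:
    "\u20ac".encode(assumed_encoding)
    euro_fits = True
except UnicodeEncodeError:
    euro_fits = False
```

Comparing on the Unicode side avoids the failing encode step entirely, 
which is why it is the safer direction.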

or you meant to ask

Q. Could it happen that the encoding produces more than one byte per 
character?
A. Yes, this can happen, but it is no problem.
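
To illustrate why it is no problem, here is a small sketch in Python 3 
syntax, using UTF-8 as the example encoding (my choice for illustration; 
any multi-byte codec behaves the same way). The variable names are 
hypothetical.

```python
text = "caf\u00e9"                 # four characters, the last one non-ASCII

encoded = text.encode("utf-8")     # e-acute becomes two bytes in UTF-8

char_count = len(text)             # counts characters
byte_count = len(encoded)          # counts bytes, one more than char_count

# Decoding with the same codec gives back the original string unchanged.
round_trip_ok = encoded.decode("utf-8") == text
```

The byte count differs from the character count, but as long as you 
encode and decode with the same codec nothing is lost.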

or you meant to ask

Q. Will Python support 'normal encodings' that produce more than one 
byte per character out of the box?
A. No, Python does not ship with any such codecs (*). You should install 
the JapaneseCodecs, KoreanCodecs, or ChineseCodecs package for that.

Regards,
Martin

(*) Except for UTF-8, UCS-2, UCS-4.