tkinter + unicode + bug or feature??

Martin v. Löwis martin at v.loewis.de
Sun Jan 26 03:31:54 EST 2003


Bob van der Poel <bvdpoel at kootenay.com> writes:

> Well, since I work mostly in plain ascii, not unicode, I would think
> the correct behaviour would be to return a regular string <type
> 'str'>. 

That can't work, since you can enter text that cannot be represented
in such a regular string.

> I think this is what it was in tcl/tk pre-8.1. And if the
> community wanted to have unicode, that would be fine as well. But, the
> way it is now one never knows if one is going to get a <type
> 'unicode'> or a 'str'. And that isn't right, is it?

This is for backwards compatibility. Always returning Unicode would
have broken too many applications.

> If my program takes strings entered by a user in an Entry() widget and
> I take that data, convert it from a possible unicode string to the
> user's current locale, will the result always be a regular string?

No, the conversion can also raise an exception. If the conversion does
not raise an exception, the result is a byte string.
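
To illustrate both outcomes, here is a minimal sketch. The sample text
and the helper name try_encode are made up for illustration, and the
literals assume a Python version that accepts both u"..." and b"...".

```python
# Hypothetical sample input: "cafe" with an accented e, plus a euro sign.
text = u"caf\u00e9 \u20ac"

def try_encode(value, encoding):
    """Return the encoded byte string, or None if the text does not fit."""
    try:
        return value.encode(encoding)
    except UnicodeError:
        # Raised when a character has no representation in `encoding`.
        return None

# The euro sign does not exist in ISO 8859-1, so this conversion fails.
latin1_result = try_encode(text, "iso-8859-1")

# UTF-8 can represent every character, so this yields a byte string.
utf8_result = try_encode(text, "utf-8")
```

Which encoding counts as "the user's current locale" is up to the
application; the two names above are only examples.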

> Really, what I'm trying to do is to avoid having my program crash when
> I do something like:
> 
>      a=entrywidget.get()
>      if a == somestring:
>         .....
> 
> Currently, 'somestring' IS a regular string.

That, in itself, is not a problem. It appears that your byte string
has non-ASCII bytes in it, and those cannot be converted to Unicode
implicitly for the comparison.
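
A sketch of one way around this, assuming you know which encoding the
byte string is in (ISO 8859-1 is only a guess here, and both values
are invented stand-ins): decode it explicitly and compare Unicode with
Unicode.

```python
somestring = b"caf\xe9"    # hypothetical byte string, assumed to be Latin-1
entered = u"caf\u00e9"     # stand-in for what entrywidget.get() returns

# The byte 0xE9 is not ASCII, so a default ASCII conversion fails.
try:
    somestring.decode("ascii")
    ascii_ok = True
except UnicodeError:
    ascii_ok = False

# Decoding with the right encoding gives a Unicode string that compares
# cleanly with the widget's value.
matches = somestring.decode("iso-8859-1") == entered
```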

> And if 'a' is a unicode the
> program aborts. So, I'm planning on replacing get() with myget() which
> will just do:
> 
>      a=widget.get().encode(userEncoding)

That can still "abort" (i.e. raise an exception).
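
If losing characters is acceptable, one option is the "replace" error
handler of encode(). This is only a sketch: to_user_encoding is a
hypothetical helper, and it assumes the value passed in is already a
Unicode string.

```python
def to_user_encoding(value, user_encoding):
    # "replace" substitutes "?" for each character the encoding lacks,
    # so unencodable text no longer raises. The conversion is lossy.
    return value.encode(user_encoding, "replace")

# The euro sign has no ASCII representation and becomes "?".
lossy = to_user_encoding(u"5 \u20ac", "ascii")
```

Whether a "?" in place of the original character is tolerable depends
on what the program does with the result afterwards.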

> And we're sure there isn't a tcl/tk setting to take care of this???

Yes, we're sure. Tcl 8.1 and later represent all text data internally
in UTF-8. So Python has the choice of either returning UTF-8 byte
strings, or returning Unicode strings.
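
The difference between the two choices, sketched with an invented
sample value:

```python
text = u"caf\u00e9"                      # the Unicode string: 4 characters
utf8_bytes = text.encode("utf-8")        # the same text as UTF-8: 5 bytes
round_trip = utf8_bytes.decode("utf-8")  # decoding recovers the original

# The accented character occupies two bytes in UTF-8, so the lengths differ.
lengths = (len(text), len(utf8_bytes))
```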

Regards,
Martin



