Starting point for unicode conversion

Fri Sep 5 01:22:24 EDT 2003

Howard Lightstone <howard at eegsoftware.com> writes:

> The key point here is Tkinter.  I believe (from reading this list) that I 
> can expect that SOME returned text may be Unicode (depending on content and 
> Windows locale settings).

Yes and no. Yes, some returned text may be Unicode, but no, it won't
depend on the locale settings. Instead, Tkinter returns a byte string
if the result contains only ASCII characters, and a Unicode string if
it contains any non-ASCII characters.
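
To illustrate the rule, here is a small model of it. This is only a
sketch: tk_style_result is a hypothetical name, and the function
mimics the behaviour described above rather than reproducing
_tkinter's actual code.

```python
def tk_style_result(text):
    """Model of the rule: ASCII-only text comes back as a byte string,
    anything else comes back as a Unicode string.

    `text` is assumed to be a Unicode string (u"..." in Python 2).
    """
    try:
        # Pure ASCII: this succeeds and yields a byte string.
        return text.encode("ascii")
    except UnicodeEncodeError:
        # At least one non-ASCII character: keep the Unicode string.
        return text
```

So a caller may see either type, depending only on the content of the
string.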

> Would it be best to just (somehow) force all text into Unicode or would it 
> be "better" to handle specific instances?

If you are prepared to deal with Unicode, it would be best to force
that throughout. I was contemplating making this an option in
_tkinter, but that has not been implemented; contributions are
welcome.

Meanwhile, you can use 

  s = unicode(s)

on all strings returned from Tkinter: if s is an ASCII byte string,
the default encoding (ASCII, unless it has been changed) will happily
convert it to a Unicode object; if s is already a Unicode string,
unicode(s) is a no-op.
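
If you would rather spell the conversion out, a minimal sketch might
look like the following. The names to_unicode and text_type are
hypothetical, and the version check is only there so that the sketch
also runs on Python 3, where str is already the Unicode type.

```python
import sys

# On Python 3, str is already the Unicode type; on Python 2 it is `unicode`.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (only evaluated on Python 2)


def to_unicode(s):
    """Return s as a Unicode string (hypothetical helper, not part of Tkinter)."""
    if isinstance(s, text_type):
        return s  # already Unicode: nothing to do
    # Byte string: decode explicitly as ASCII, which is what unicode(s)
    # does under Python 2's default encoding.
    return s.decode("ascii")
```

Applying to_unicode to every value that comes back from Tkinter means
the rest of the program only ever sees one string type.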

> I also have the problem of embedded text in data files I create that I have 
> to store as *something* that I can fully recover and convert back to 
> something reasonable even if the locale changes.

Don't worry about the locale; it does not matter here.
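
One way to make the data files locale-independent, offered as an
assumption rather than as the only option, is to pick a fixed encoding
such as UTF-8 and always encode and decode explicitly. The function
names below are hypothetical.

```python
import io
import os
import tempfile


def save_text(path, text):
    """Write a Unicode string to path as UTF-8, regardless of locale."""
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)


def load_text(path):
    """Read the file back as a Unicode string, decoding it as UTF-8."""
    with io.open(path, "r", encoding="utf-8") as f:
        return f.read()


def round_trip(text):
    """Save text to a temporary file, load it again, and clean up."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        save_text(path, text)
        return load_text(path)
    finally:
        os.remove(path)
```

Because the encoding is named explicitly in both directions, the bytes
on disk are the same whatever locale the program runs under.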

Regards,
Martin