Unicode error in exefied Python script

Mon Jan 20 11:51:36 EST 2003

Tim Daneliuk <tundra at tundraware.com> writes:

> Exactly right - So when concatenating strings like I did above,
> Python always promotes the result to the 'higher' type?

Where 'higher' type means the one of the two types which is the superset
of the other, in some sense - yes. It is, however, questionable which
of the types is the supertype here, since there are byte strings which
cannot be converted to Unicode, and Unicode strings which cannot be
converted to byte strings.
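
A small sketch of why neither type is a clean superset of the other. It
assumes ASCII as the codec (Python's default encoding unless it has been
changed); the sample values and the helper names can_decode and can_encode
are made up for illustration, and the b"" spelling needs a Python that
accepts that prefix.

```python
# Hypothetical sample values, chosen only for illustration.
byte_string = b"caf\xe9"        # Latin-1 bytes; 0xE9 is outside ASCII
unicode_string = u"\u20ac100"   # EURO SIGN has no ASCII representation


def can_decode(data, encoding="ascii"):
    """Return True if the byte string converts to Unicode with `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeError:
        return False


def can_encode(text, encoding="ascii"):
    """Return True if the Unicode string converts to bytes with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeError:
        return False
```

With the assumed ASCII codec, both can_decode(byte_string) and
can_encode(unicode_string) are false, so the conversion fails in each
direction.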

So for this situation, a special exception to this "always" rule is
made: byte strings are promoted to Unicode strings when the two are
combined.
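
The promotion can be spelled out by hand, which may help when tracking
down where a Unicode error comes from. This is only a sketch: it assumes
the implicit conversion uses the ASCII default encoding, and
promote_and_concat is a hypothetical helper name, not something Python
provides.

```python
def promote_and_concat(byte_string, unicode_string, encoding="ascii"):
    """Decode the byte string to Unicode, then concatenate.

    The decode step raises UnicodeDecodeError if the byte string
    contains bytes that `encoding` cannot represent.
    """
    return byte_string.decode(encoding) + unicode_string


# Pure-ASCII bytes promote without trouble; the result is a Unicode string.
result = promote_and_concat(b"abc", u"d\xe9f")
```

Passing a byte string with non-ASCII bytes, such as b"\xe9", makes the
decode step fail, unless a suitable encoding is given explicitly.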

> This too appears to work as you suggest.  Is there some portable way
> to determine the codepage in use on Win32 and posix systems?

Not really. For Windows only, locale.getlocale()[1] will give that
information. For Unix, this sometimes works, sometimes it
doesn't. Python 2.3 will provide locale.getpreferredencoding, which
should return the user's preferred encoding uniformly.
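
One way to combine the two approaches, as a rough sketch: use
locale.getpreferredencoding where it exists and fall back to
locale.getlocale()[1] otherwise. The helper name guess_encoding and the
"ascii" fallback are assumptions made for this example.

```python
import locale


def guess_encoding(fallback="ascii"):
    """Best-effort guess at the user's encoding (hypothetical helper)."""
    getter = getattr(locale, "getpreferredencoding", None)
    if getter is not None:
        encoding = getter()
    else:
        # Older Pythons: may be None if the locale is unset or unknown.
        encoding = locale.getlocale()[1]
    # Fall back when no encoding could be determined.
    return encoding or fallback
```

The returned name can then be passed to decode() or encode() instead of
relying on the default encoding.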

It will also provide better processing of Unicode file names on
Windows, so that you never need to use byte strings as file names.

Regards,
Martin