[Python-Dev] Internationalization Toolkit

Da Silva, Mike Mike.Da.Silva@uk.fid-intl.com
Fri, 12 Nov 1999 11:43:21 -0000


Fredrik Lundh wrote:

> 5. UTF-16 requires string operations that do not make assumptions about
> nulls - this means re-implementing most of the C runtime functions to work
> with unsigned shorts.

footnote: the mad scientist has been there and done that:
http://www.pythonware.com/madscientist/
(and you can replace "unsigned short" with "whatever's suitable on this
platform")

Surely using a different type on different platforms means that we throw
away the concept of a platform-independent Unicode string?
For example, wchar_t is 32 bits on Solaris but 16 bits on Windows.
Does this mean that transferring a file between a Windows box and a Solaris
box requires an implicit conversion from 16 bits to 32 bits (and vice
versa)?  What about byte-ordering issues?
Or do you mean whatever 16-bit data type is available on the platform, with
a standard (platform-independent) byte ordering maintained?
Mike da S
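
One way to picture the second alternative raised above (a fixed 16-bit unit
with a standard byte ordering) is the rough sketch below. It is only an
illustration, not anything proposed in this thread: the helper names
to_wire, from_wire and code_units are hypothetical, and big-endian is an
arbitrary choice for the external byte order. The in-memory character width
is whatever the platform uses; only the external form is fixed.

```python
import ctypes
import struct

# Width of the platform's wchar_t in bytes. This differs between platforms
# (e.g. 4 on Solaris, 2 on Windows), so it cannot define the file format.
WCHAR_BYTES = ctypes.sizeof(ctypes.c_wchar)


def to_wire(text):
    """Encode text as 16-bit code units in big-endian order, without a BOM.

    Hypothetical helper: the byte order is an assumption made for this sketch.
    """
    return text.encode("utf-16-be")


def from_wire(data):
    """Decode bytes produced by to_wire back into a string."""
    return data.decode("utf-16-be")


def code_units(data):
    """Split wire bytes into a list of 16-bit code units (integers)."""
    count = len(data) // 2
    return list(struct.unpack(">%dH" % count, data))


sample = "h\u00e9llo"
wire = to_wire(sample)
restored = from_wire(wire)
```

With this arrangement the bytes written on one machine are identical to the
bytes written on any other, whatever WCHAR_BYTES happens to be locally, and
any widening or narrowing happens only at the read/write boundary.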