[Python-Dev] Some thoughts on the codecs...

Greg Stein gstein@lyra.org
Tue, 16 Nov 1999 03:45:48 -0800 (PST)


On Tue, 16 Nov 1999, Fredrik Lundh wrote:
>...
> since this is already very close, maybe we could adopt
> the naming guidelines from XML:
> 
>     In an encoding declaration, the values "UTF-8", "UTF-16",
>     "ISO-10646-UCS-2", and "ISO-10646-UCS-4" should be used
>     for the various encodings and transformations of
>     Unicode/ISO/IEC 10646, the values "ISO-8859-1",
>     "ISO-8859-2", ... "ISO-8859-9" should be used for the parts
>     of ISO 8859, and the values "ISO-2022-JP", "Shift_JIS",
>     and "EUC-JP" should be used for the various encoded
>     forms of JIS X-0208-1997.
> 
>     XML processors may recognize other encodings; it is
>     recommended that character encodings registered
>     (as charsets) with the Internet Assigned Numbers
>     Authority [IANA], other than those just listed,
>     should be referred to using their registered names.
> 
>     Note that these registered names are defined to be
>     case-insensitive, so processors wishing to match
>     against them should do so in a case-insensitive way.
> 
> (i.e. "iso-8859-1" instead of "latin-1", etc. -- at least as
> aliases...).

+1

(as we'd say in Apache-land... :-)
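
A minimal sketch of what the case-insensitive matching plus aliases could
look like. The table contents and the name normalize_encoding_name are
hypothetical, made up for illustration; this is not a proposal for the
actual codec registry API.

```python
# Hypothetical alias table: keys are lowercase aliases, values are the
# canonical (XML/IANA-style) names, also kept in lowercase.
_ALIASES = {
    "latin-1": "iso-8859-1",
    "utf8": "utf-8",
}

def normalize_encoding_name(name):
    """Return a canonical lowercase encoding name for `name`.

    Matching is case-insensitive; unknown names are passed through
    lowercased rather than rejected.
    """
    key = name.strip().lower()
    return _ALIASES.get(key, key)
```

With this, "ISO-8859-1", "iso-8859-1" and "Latin-1" would all resolve to the
same canonical name, so the lookup only has to deal with one spelling.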

-g

--
Greg Stein, http://www.lyra.org/