Multibyte Character Support for Python
Martin v. Loewis
martin at v.loewis.de
Sat May 11 09:34:47 EDT 2002
"Stephen J. Turnbull" <stephen at xemacs.org> writes:
> Martin> That's how UTF-16 is specified.
>
> The Unicode standard permits, but does not require, a BOM.
Strictly speaking, the Unicode standard does not recognize UTF-16 as a
byte encoding; it only recognizes it as a character encoding form (CEF),
not as a character encoding scheme (CES); see TR#17.
UTF-16 as-a-CES is defined in RFC 2781, which, in section 3.3, says
that the BOM SHOULD be inserted if the CES UTF-16 is used.
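
As a rough illustration of the BOM point, here is a minimal sketch using the codecs in a current Python 3 (an assumption; this is not the API as it stood when this was written). The helper name has_utf16_bom is hypothetical. The generic "utf-16" codec writes a BOM on encoding, while the explicit "utf-16-be" and "utf-16-le" codecs do not:

```python
import codecs


def has_utf16_bom(data: bytes) -> bool:
    """Return True if data starts with a UTF-16 byte order mark (hypothetical helper)."""
    return data[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)


text = "A"

# The generic codec prepends a BOM in the platform's native byte order.
generic = text.encode("utf-16")

# The endian-specific codecs emit the code units only, with no BOM.
big_endian = text.encode("utf-16-be")
little_endian = text.encode("utf-16-le")

# On decoding, the generic codec reads the BOM to pick the byte order
# and does not include it in the resulting string.
round_trip = (codecs.BOM_UTF16_BE + big_endian).decode("utf-16")

print(has_utf16_bom(generic), has_utf16_bom(big_endian), has_utf16_bom(little_endian))
print(round_trip == text)
```
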
Regards,
Martin