Wrong default endianness in utf-16 and utf-32 !?
John Machin
sjmachin at lexicon.net
Tue Oct 12 16:00:40 EDT 2010
jmfauth <wxjmfauth <at> gmail.com> writes:
> When an endianness is not specified (BE, LE, unmarked forms),
> the Unicode Consortium specifies that the default byte serialization
> should be big-endian.
>
> See http://www.unicode.org/faq//utf_bom.html
> Q: Which of the UTFs do I need to support?
> and
> Q: Why do some of the UTFs have a BE or LE in their label,
> such as UTF-16LE?
Sometimes it is necessary to read right to the end of an answer:
Q: Why do some of the UTFs have a BE or LE in their label, such as UTF-16LE?
A: [snip] the unmarked form uses big-endian byte serialization by default, but
may include a byte order mark at the beginning to indicate the actual byte
serialization used.
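
For what it's worth, here is a minimal sketch of how this looks from Python, assuming Python 3 and the standard codecs module. The sample string and variable names are mine, purely for illustration. The labelled codecs (utf-16-be, utf-16-le) write no byte order mark, while the unmarked utf-16 codec writes one on encoding and honours one on decoding:

```python
import codecs
import sys

text = "abc"  # illustrative sample string, not from the FAQ

# Labelled forms: the byte order is fixed by the label and no BOM is written.
be = text.encode("utf-16-be")
le = text.encode("utf-16-le")

# Unmarked form: the encoder prepends a BOM, then writes the data
# in the byte order of the machine running the code.
unmarked = text.encode("utf-16")
if sys.byteorder == "little":
    native_bom = codecs.BOM_UTF16_LE
else:
    native_bom = codecs.BOM_UTF16_BE

# Unmarked form on input: a leading BOM tells the decoder which
# serialization was actually used, whichever order that is.
from_be = (codecs.BOM_UTF16_BE + be).decode("utf-16")
from_le = (codecs.BOM_UTF16_LE + le).decode("utf-16")

print(unmarked.startswith(native_bom), from_be == text, from_le == text)
```

So a BOM at the start of unmarked data settles the question either way, which is the point of the last clause of the FAQ answer.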