Is this a bug? BOM decoded with UTF8
"Martin v. Löwis"
martin at v.loewis.de
Fri Feb 11 19:33:01 EST 2005
> What are you talking about? The BOM and UTF-16 go hand in hand. Without
> a Byte Order Mark, you can't unambiguously determine whether big or
> little endian UTF-16 was used.
In the old days, UCS-2 was *implicitly* big-endian. It was only
when Microsoft got that wrong that a little-endian version of UCS-2
came along. So while the BOM is now part of all relevant specifications,
it is still "Microsoft crap".
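
A minimal sketch of what the byte-order question looks like in practice,
assuming a current Python 3 interpreter (the "utf-8-sig" codec used at the
end is only available in later Python versions). The sample string and
variable names are purely illustrative:

```python
import codecs

text = "L\u00f6wis"  # illustrative sample string

# Explicit byte orders: the codec name carries the order, no BOM is written.
be = text.encode("utf-16-be")
le = text.encode("utf-16-le")

# With a BOM in front, the generic "utf-16" codec works out the order itself.
from_be = (codecs.BOM_UTF16_BE + be).decode("utf-16")
from_le = (codecs.BOM_UTF16_LE + le).decode("utf-16")

# Without a BOM, assuming the wrong order silently yields other characters.
wrong = be.decode("utf-16-le")

# UTF-8 has no byte order to signal. A leading BOM decodes to U+FEFF
# unless a codec such as "utf-8-sig" is asked to strip it.
utf8_with_bom = codecs.BOM_UTF8 + text.encode("utf-8")
kept = utf8_with_bom.decode("utf-8")
stripped = utf8_with_bom.decode("utf-8-sig")
```

Here the BOM only does real work for the two UTF-16 variants; for UTF-8 it
is at most a signature that the decoder may keep or drop.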
> For more details, see:
> http://www.unicode.org/faq/utf_bom.html#BOM
"some higher level protocols", "can be useful" - not
"is an inherent part of all byte-level encodings".
Regards,
Martin