Is this a bug? BOM decoded with UTF8

Kent Johnson kent37 at tds.net
Fri Feb 11 08:44:19 EST 2005


Diez B. Roggisch wrote:
>>I know it's easy (string.replace()) but why does UTF-16 do
>>it on its own then? Is that according to Unicode standard or just
>>Python convention?
> 
> 
> BOM is microsoft-proprietary crap. 

Uh, no. BOM is part of the Unicode standard. The intent is to allow consumers of Unicode text files 
to disambiguate UTF-8, big-endian UTF-16 and little-endian UTF-16.
See http://www.unicode.org/faq/utf_bom.html#BOM
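
To make the disambiguation concrete, here is a minimal sketch of how a consumer might sniff the BOM before decoding. It is written for current Python 3 rather than the Python of this thread; sniff_encoding is a hypothetical helper name, not a standard-library function, and the "utf-8-sig" codec it relies on was only added to Python after this message was posted. UTF-32 BOMs are left out for brevity.

```python
import codecs

# BOM byte sequences as defined in the standard library's codecs module.
# UTF-32 is omitted here; a fuller version would have to test the UTF-32
# BOMs first, because the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),   # decoder strips the UTF-8 signature
    (codecs.BOM_UTF16_BE, "utf-16"),  # decoder reads the BOM to pick byte order
    (codecs.BOM_UTF16_LE, "utf-16"),
]


def sniff_encoding(data, default="utf-8"):
    """Return a codec name based on a leading BOM (hypothetical helper).

    Falls back to `default` when no BOM is present; that fallback is an
    assumption of this sketch, not something the Unicode standard mandates.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding
    return default


text = "hi"
utf8_with_bom = codecs.BOM_UTF8 + text.encode("utf-8")
utf16_le = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
utf16_be = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

for raw in (utf8_with_bom, utf16_le, utf16_be):
    print(sniff_encoding(raw), repr(raw.decode(sniff_encoding(raw))))

# The behaviour asked about in this thread: the plain "utf-8" codec keeps
# the BOM as a leading U+FEFF character, while "utf-16" consumes it.
print(repr(utf8_with_bom.decode("utf-8")))
```

The plain "utf-8" codec has no byte order to determine, so it passes the BOM through as an ordinary U+FEFF character; the "utf-16" codec needs the BOM to choose between big- and little-endian and therefore consumes it.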

Kent


