[Python-3000] Pre-PEP: Easy Text File Decoding
"Martin v. Löwis"
martin at v.loewis.de
Mon Oct 2 22:20:01 CEST 2006
John S. Yates, Jr. wrote:
> It is a mistake on Microsoft's part to fail to strip the BOM
> during conversion to UTF-8. There is no MEANINGFUL definition
> of BOM in a UTF-8 string.
That's not true. See
http://unicode.org/faq/utf_bom.html#23
http://unicode.org/faq/utf_bom.html#29
The BOM can also serve as an encoding marker. I refer to the
BOM encoded in UTF-8 as the "UTF-8 signature". As such, it is
very meaningful. Usage of the BOM in UTF-8-encoded text
is deliberate.
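
To illustrate the signature acting as an encoding marker, here is a
minimal sketch using Python's standard codecs module. The helper name
decode_with_signature is hypothetical and chosen only for this example;
codecs.BOM_UTF8 and the "utf-8-sig" codec are part of the standard
library.

```python
import codecs


def decode_with_signature(data):
    """Decode UTF-8 bytes and report whether a UTF-8 signature was present.

    Hypothetical helper for illustration only.
    """
    # The UTF-8 signature is the BOM (U+FEFF) encoded as EF BB BF.
    has_signature = data.startswith(codecs.BOM_UTF8)
    # "utf-8-sig" skips one leading signature if present and otherwise
    # behaves like plain "utf-8".
    text = data.decode("utf-8-sig")
    return text, has_signature


signed = codecs.BOM_UTF8 + "abc".encode("utf-8")
plain = "abc".encode("utf-8")

print(decode_with_signature(signed))
print(decode_with_signature(plain))
```

Decoding the signed bytes with plain "utf-8" instead would leave U+FEFF
as the first character of the result, which is why a reader that wants
to treat the signature as a marker has to handle it explicitly.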
Regards,
Martin