[I18n-sig] UTF-8 and BOM
Paul Prescod
paulp@ActiveState.com
Wed, 16 May 2001 15:26:56 -0700
"M.-A. Lemburg" wrote:
>
>...
>
> You have to be careful here: UTF-16 prepends a BOM mark to
> every string pushed through the codec -- even small snippets.
> You certainly don't want to make that the default for the
> much more common UTF-8 which has no real requirement to include
> BOM marks at all... having the decoder automatically remove
> BOM marks is easy to implement and won't cause any harm,
> but carelessly adding them will get us into trouble.
Yes, I meant to say that the standard decoder should remove them, and I
left it up to you whether we should have another codec whose encoder
adds them.
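
To make the split concrete, here is a rough sketch of the two behaviours, not a proposal for the actual codec code. The helper names decode_utf8_dropping_bom and encode_utf8_with_bom are made up for illustration; only codecs.BOM_UTF8 and the built-in "utf-8" and "utf-16" codecs are real. It assumes a decoder that drops at most one leading BOM and an opt-in encoder that prepends one.

```python
import codecs


def decode_utf8_dropping_bom(data: bytes) -> str:
    """Hypothetical tolerant decoder: drop one leading UTF-8 BOM, then decode."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")


def encode_utf8_with_bom(text: str) -> bytes:
    """Hypothetical opt-in encoder: prepend the UTF-8 BOM to the encoded text."""
    return codecs.BOM_UTF8 + text.encode("utf-8")


snippet = "abc"

# The plain UTF-8 encoder never adds a BOM, even for small snippets.
plain = snippet.encode("utf-8")

# The opt-in encoder adds one; callers have to ask for it explicitly.
marked = encode_utf8_with_bom(snippet)

# The tolerant decoder gives the same text for both inputs.
print(decode_utf8_dropping_bom(plain) == decode_utf8_dropping_bom(marked))

# For comparison, the generic UTF-16 codec prepends a BOM on every encode call.
print(snippet.encode("utf-16")[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
```

For what it is worth, present-day Python ships a separate "utf-8-sig" codec with this shape: its encoder prepends the BOM and its decoder skips an optional leading one, while the plain "utf-8" decoder leaves U+FEFF in the decoded text.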
--
Take a recipe. Leave a recipe.
Python Cookbook! http://www.ActiveState.com/pythoncookbook