[Python-Dev] Unicode byte order mark decoding

Thu Apr 7 23:47:07 CEST 2005

Walter Dörwald said:

> Nicholas Bastin said:
>
> It should be feasible to implement your own codec for that
> based on Lib/encodings/utf_16.py. Simply replace the line
> in StreamReader.decode():
>   raise UnicodeError,"UTF-16 stream does not start with BOM"
> with:
>   self.decode = codecs.utf_16_be_decode
> and you should be done.

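Oops, this only works on a big-endian system, because the first chunk
has already been decoded in native byte order by the time that line is
reached. On a little-endian system you also have to redecode the input with:
   codecs.utf_16_ex_decode(input, errors, 1, False)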
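For illustration, the redecode step might look roughly like the sketch below. It is written as a standalone function rather than as the StreamReader.decode() method, and the name decode_utf16_default_be is made up for this example; only codecs.utf_16_ex_decode and its arguments come from the suggestion above.

```python
import codecs

def decode_utf16_default_be(data, errors="strict"):
    """Decode UTF-16 bytes, falling back to big-endian when there is no BOM.

    Hypothetical helper for illustration; not part of Lib/encodings/utf_16.py.
    """
    # byteorder=0 asks the decoder to look for a BOM. The third item of the
    # result reports what it found: -1 (little-endian), 1 (big-endian),
    # or 0 if there was no BOM.
    text, consumed, byteorder = codecs.utf_16_ex_decode(data, errors, 0, False)
    if byteorder == 0 and consumed >= 2:
        # No BOM: the first pass used the native byte order, which is only
        # right on a big-endian machine. Redecode with big-endian forced.
        text, consumed, byteorder = codecs.utf_16_ex_decode(data, errors, 1, False)
    return text, consumed
```

Note that the first pass still runs in native byte order, so with errors="strict" it may raise on some BOM-less input before the fallback is reached; checking the first two bytes for a BOM by hand would avoid that.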

Bye,
   Walter Dörwald