[Python-Dev] Unicode byte order mark decoding
Walter Dörwald
walter at livinglogic.de
Thu Apr 7 23:47:07 CEST 2005
Walter Dörwald said:
> Nicholas Bastin said:
>
> It should be feasible to implement your own codec for that
> based on Lib/encodings/utf_16.py. Simply replace the line
> in StreamReader.decode():
>     raise UnicodeError,"UTF-16 stream does not start with BOM"
> with:
>     self.decode = codecs.utf_16_be_decode
> and you should be done.
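Oops, this only works on a big-endian system, because by the time
the BOM check fails the first chunk has already been decoded in
native byte order. Otherwise you have to redecode the input as
big endian with:
    codecs.utf_16_ex_decode(input, errors, 1, False)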
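For illustration, here is a rough standalone sketch of the same idea in Python 3 syntax (the snippets above are Python 2). It is not the patched StreamReader itself: the function name decode_utf16_default_be is made up for this example, and it checks for the BOM up front instead of redecoding afterwards. It assumes the whole input is available at once, hence final=True.

```python
import codecs


def decode_utf16_default_be(data, errors="strict"):
    """Decode UTF-16 bytes, assuming big endian when no BOM is present.

    Hypothetical helper for illustration only; not part of the stdlib.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # byteorder=0: the codec reads the BOM, strips it and picks
        # the matching byte order itself.
        byteorder = 0
    else:
        # byteorder=1: no BOM, so force big endian regardless of the
        # byte order of the machine we are running on.
        byteorder = 1
    text, consumed, detected = codecs.utf_16_ex_decode(
        data, errors, byteorder, True
    )
    return text


if __name__ == "__main__":
    # BOM-less input is treated as big endian.
    print(decode_utf16_default_be("hi".encode("utf-16-be")))
    # Input with a little-endian BOM is still decoded correctly.
    print(decode_utf16_default_be(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")))
```

Checking the BOM first avoids decoding the data twice, and it also avoids errors that a native-order first pass might raise on BOM-less big-endian data.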
Bye,
Walter Dörwald