Readlines returns non ASCII character

MRAB python at mrabarnett.plus.com
Wed Sep 23 20:09:58 EDT 2015


On 2015-09-24 00:51, paul.hermeneutic at gmail.com wrote:
>   If this starts at the beginning of the file, then it indicates that
> the file is UTF-16 (LE).
>
> UTF-8          EF BB BF       239 187 191
> UTF-16 (BE)    FE FF          254 255
> UTF-16 (LE)    FF FE          255 254
> UTF-32 (BE)    00 00 FE FF    0 0 254 255
> UTF-32 (LE)    FF FE 00 00    255 254 0 0
>
The "signature" EF BB BF indicates the encoding that Python calls
"utf-8-sig". It is commonly written by Windows programs such as Notepad.
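
As a small illustration of the difference (a sketch only; the sample
bytes are made up for the example), decoding the same data with "utf-8"
keeps the signature as a leading U+FEFF character, whereas "utf-8-sig"
drops it:

```python
import codecs

# Hypothetical sample: the UTF-8 signature followed by some text.
data = codecs.BOM_UTF8 + "héllo\n".encode("utf-8")

# Plain "utf-8" decodes the signature as the character U+FEFF.
as_utf8 = data.decode("utf-8")

# "utf-8-sig" strips the signature if it is present.
as_sig = data.decode("utf-8-sig")

print(repr(as_utf8))
print(repr(as_sig))
```

The same codec name can be passed to open(), e.g.
open(path, encoding="utf-8-sig"), which also reads files that have no
signature.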

If the file doesn't start with any of these, then it could be using any
encoding (though UTF-16 and UTF-32 files normally start with one).
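
A minimal sketch of checking for those marks, assuming you have the
first few bytes of the file. The function name guess_encoding_from_bom
is made up for this example; the BOM constants come from the standard
codecs module. Note that FF FE 00 00 is treated as UTF-32 (LE) here,
although it could in principle be UTF-16 (LE) text starting with a NUL.

```python
import codecs

# Check the 4-byte UTF-32 marks before the 2-byte UTF-16 ones, because
# the UTF-32 (LE) mark FF FE 00 00 starts with the UTF-16 (LE) mark FF FE.
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16"),
]


def guess_encoding_from_bom(head):
    """Return a codec name if head starts with a known mark, else None."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    # No mark found: the encoding has to be determined some other way.
    return None
```

For example, reading the first four bytes with
open(path, "rb").read(4) and passing them in gives a codec name that
can then be used to reopen the file in text mode. The "utf-16" and
"utf-32" codecs consume the mark themselves and pick the byte order
from it.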



