UTF-16 or something else?

Skip Montanaro skip.montanaro at gmail.com
Tue Feb 9 11:34:08 EST 2021


>
> It's UTF-8 with a byte order mark (BOM) prepended, which is not uncommon when you
> have a file that's been converted to UTF-8 from UTF-16 or has been
> produced by shitty Microsoft software. You can tell instantly at a
> glance that it's not UTF-16 because the ascii dump would l.o.o.k.
> .l.i.k.e. .t.h.i.s.
>

Ah, right. It's been a long, long while (well before Unicode was a thing) since
I needed to use od(1), and I don't remember dealing with UTF-16 before.
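
For anyone who would rather check from Python than squint at od output, here
is a minimal sketch of the same test. The helper names sniff_bom and
ascii_dump are made up for illustration, and the sample bytes are built in
memory rather than read from the file under discussion. It only looks for the
UTF-8 and UTF-16 BOMs and prints an od-style dump with non-printable bytes
shown as dots.

```python
import codecs


def sniff_bom(data: bytes) -> str:
    """Guess an encoding from a leading BOM (hypothetical helper).

    Only the UTF-8 and UTF-16 BOMs are checked; anything else is "unknown".
    """
    if data.startswith(codecs.BOM_UTF8):  # bytes EF BB BF
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return "unknown"


def ascii_dump(data: bytes) -> str:
    """Show printable ASCII as-is and every other byte as a dot."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


text = "look like this"

# UTF-8 with a BOM in front: the "utf-8-sig" codec writes EF BB BF first.
sample_utf8_bom = text.encode("utf-8-sig")

# Real UTF-16 (little-endian), with its BOM added by hand so the byte
# order does not depend on the machine running this.
sample_utf16 = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

for sample in (sample_utf8_bom, sample_utf16):
    print(sniff_bom(sample), ascii_dump(sample))
```

In the UTF-16 dump every ASCII letter is followed by a dot, because each
character carries a zero byte alongside it; the UTF-8 dump shows the text
unbroken after the three BOM bytes. If the goal is just to read such a file,
opening it with encoding="utf-8-sig" drops the BOM on the way in.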

Skip

