[Python-ideas] Py3 unicode impositions

Niki Spahiev niki.spahiev at gmail.com
Wed Feb 15 09:52:45 CET 2012


On 14.02.2012 23:08, Paul Moore wrote:
> Maybe we could add a note to the open()
> documentation, something like the following:
>
> """To open a file, you need to know its encoding. This is not always
> obvious, depending on where the file came from, among other things.
> Other tools can process files without knowing the encoding by assuming
> the bytes of the file map 1-1 to the first 256 Unicode characters.
> This can cause issues such as mojibake or corrupted data, but for
> casual use is sometimes sufficient. To get this behaviour in Python
> (with all the same risks and problems) you can use the "latin1"
> encoding, which maps bytes to unicode as described above. It is far,
> far better to use the correct encoding declaration, if at all
> possible, however."""

IMHO it would be better to add an 'unknown' encoding as an alias for 'latin1'.
That way one can search the code for it and change it later, once the
real encoding is known.
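
A rough sketch of how the alias idea could be tried today, without any
change to Python itself. It assumes the standard codecs.register() hook;
the name 'unknown' and the helper _unknown_search are hypothetical,
nothing in the stdlib defines them:

```python
import codecs


def _unknown_search(name):
    # Hypothetical alias: resolve the encoding name "unknown" to latin-1.
    # The codec machinery passes the name in lowercase.
    if name == "unknown":
        return codecs.lookup("latin-1")
    return None  # let other search functions handle every other name


codecs.register(_unknown_search)

# Every byte value maps 1-1 to the code point with the same number,
# so decoding never fails and the data round-trips unchanged.
raw = bytes(range(256))
text = raw.decode("unknown")
roundtrip = text.encode("unknown")

# The risk named in the quoted note: UTF-8 data decoded this way
# turns into mojibake instead of raising an error.
garbled = "é".encode("utf-8").decode("unknown")
print(garbled)  # prints "Ã©"
```

After registration, open(path, encoding="unknown") should behave the
same as encoding="latin1", and grepping for "unknown" finds every place
that still needs a real encoding.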

Niki



