[Python-Dev] Python 3.0.1 (io-in-c)

Terry Reedy tjreedy at udel.edu
Wed Jan 28 20:03:31 CET 2009


Steven Bethard wrote:
> On Wed, Jan 28, 2009 at 10:29 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Notice that the determination of the specific encoding used is fairly
>> elaborate:
>> - if IO is to a terminal, Python tries to determine the encoding of
>>  the terminal. This is mostly relevant for Windows (which uses,
>>  by default, the "OEM code page" in the terminal).
>> - if IO is to a file, Python tries to guess the "common" encoding
>>  for the system. On Unix, it queries the locale, and falls back
>>  to "ascii" if no locale is set. On Windows, it uses the "ANSI
>>  code page". On OSX, it uses the "system encoding".
>> - if IO is binary, (clearly) no encoding is used. Network IO is
>>  always binary.
>> - for file names, yet different algorithms apply. On Windows, it
>>  uses the Unicode API, so no need for an encoding. On Unix, it
>>  (again) uses the locale encoding. On OSX, it uses UTF-8
>>  (just to be clear: this applies to the first argument of open(),
>>   not to the resulting file object)
> 
> This is a very helpful explanation. Is it in the docs somewhere, or if it
> isn't, could it be?

Here is the current entry on encodings in the Lib ref, built-in types,
file objects.

file.encoding
The encoding that this file uses. When strings are written to a file, 
they will be converted to byte strings using this encoding. In addition, 
when the file is connected to a terminal, the attribute gives the 
encoding that the terminal is likely to use (that information might be 
incorrect if the user has misconfigured the terminal). The attribute is 
read-only and may not be present on all file-like objects. It may also 
be None, in which case the file uses the system default encoding for 
converting strings.
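
For anyone who wants to check what their own system reports for the cases
Martin lists, something like the following might do. It is only a rough
sketch for 3.x, not a description of how the io module makes its choice:
describe_encodings is a name I made up, and I am assuming that
locale.getpreferredencoding() reflects the default used for text files,
which may not hold on every platform or in every later release.

```python
import io
import locale
import sys


def describe_encodings():
    """Collect the encodings Python reports for the cases listed above.

    Hypothetical helper for illustration; not part of any library.
    """
    return {
        # Encoding of whatever is behind stdout (terminal, pipe, ...).
        # getattr() guards against file-like objects without the attribute.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Locale-derived encoding; assumed here to be the default for
        # text files opened without an explicit encoding argument.
        "text_file_default": locale.getpreferredencoding(False),
        # Encoding used to convert file names to and from bytes.
        "file_names": sys.getfilesystemencoding(),
    }


# A text wrapper exposes the encoding it was created with ...
text = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
# ... while a binary stream has no encoding attribute at all.
binary = io.BytesIO()

if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(name, "->", value)
    print("text wrapper ->", text.encoding)
    print("binary stream has encoding:", hasattr(binary, "encoding"))
```

The values printed for stdout will usually differ depending on whether the
output goes to a terminal or is redirected to a file, which is the first
distinction in the list above.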


