[Python-Dev] open(): set the default encoding to 'utf-8' in Python 3.3?

M.-A. Lemburg mal at egenix.com
Tue Jun 28 16:02:52 CEST 2011


Victor Stinner wrote:
> In Python 2, open() opens the file in binary mode (e.g. file.readline()
> returns a byte string). codecs.open() opens the file in binary mode by
> default; you have to specify an encoding name to open it in text mode.
> 
> In Python 3, open() opens the file in text mode by default. (It only
> uses binary mode if the file mode contains "b".) The problem is
> that open() uses the locale encoding if the encoding is not specified,
> which is the case *by default*. The locale encoding can be:
> 
>  - UTF-8 on Mac OS X, most Linux distributions
>  - ISO-8859-1 on some FreeBSD systems
>  - ANSI code page on Windows, e.g. cp1252 (close to ISO-8859-1) in
> Western Europe, cp932 in Japan, ...
>  - ASCII if the locale is manually set to an empty string or to "C", or
> if the environment is empty, or by default on some systems
>  - something different depending on the system and user configuration...
> 
> If you develop under Mac OS X or Linux, you may get a surprise when you
> run your program on Windows and it hits the first non-ASCII character.
> You may not detect the problem if you only write text in English... until
> someone writes the first letter with a diacritic.
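
To make the quoted failure mode concrete, here is a minimal sketch for
Python 3. The sample string is an assumption chosen for illustration; it
does not come from the original post. It prints the encoding that open()
would fall back to on the current machine, then shows what happens when
bytes written as UTF-8 are decoded with a Windows ANSI code page.

```python
import locale

# Encoding that open() falls back to when no encoding is passed.
# The value depends on the platform and the user's configuration.
default_encoding = locale.getpreferredencoding(False)
print("locale encoding:", default_encoding)

# Illustrative sample text (assumption, not from the original post).
text = "caf\u00e9"

# Bytes as they would be written on a system whose locale encoding is UTF-8.
data = text.encode("utf-8")

# Decoding the same bytes with cp1252 silently produces the wrong text.
misread = data.decode("cp1252")
print(repr(misread))

# Naming the encoding explicitly round-trips regardless of the locale.
roundtrip = data.decode("utf-8")
print(repr(roundtrip))
```

Note that the cp1252 decode does not raise an error here, which is why
the problem can go unnoticed until someone looks at the output.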

How about a more radical change: have open() in Py3 default to
opening the file in binary mode if no encoding is given (even
if the mode doesn't include 'b')?

That'll make it compatible with the Py2 world again and avoid
all the encoding guessing.
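
As a rough sketch of what this proposal would mean in practice, the
behaviour can be emulated with a small wrapper. The name
open_binary_default() is hypothetical and only serves as illustration;
it is not an existing or planned API.

```python
import io
import os
import tempfile


def open_binary_default(path, mode="r", encoding=None):
    """Hypothetical helper: binary mode unless an encoding is given."""
    if encoding is None and "b" not in mode:
        # No encoding given: drop any explicit 't' and force binary mode.
        mode = mode.replace("t", "") + "b"
    return io.open(path, mode, encoding=encoding)


# Small demonstration using a temporary file.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # No encoding: the file is opened as binary, so bytes are required.
    with open_binary_default(path, "w") as f:
        f.write(b"caf\xc3\xa9\n")

    # No encoding: readline() returns bytes, as in Py2.
    with open_binary_default(path) as f:
        raw = f.readline()

    # Explicit encoding: readline() returns decoded text.
    with open_binary_default(path, encoding="utf-8") as f:
        decoded = f.readline()
finally:
    os.remove(path)

print(repr(raw))
print(repr(decoded))
```

With this behaviour no decoding happens unless the caller names an
encoding, so nothing depends on the locale.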

Making such default encodings depend on the locale has already
failed to work when we first introduced a default encoding in
Py2, so I don't understand why we are repeating the same
mistake again in Py3 (only in a different area).

Note that in Py2, Unix applications often leave out the 'b'
mode, since there's no difference between using it or not.
You'll only see a difference on Windows.

-- 
Marc-Andre Lemburg
eGenix.com