io.open vs. codecs.open

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Mar 4 14:56:08 EST 2015


Albert-Jan Roskam wrote:

> Hi,
> 
> Is there a (use case) difference between codecs.open and io.open? What is
> the difference? A small difference that I just discovered is that
> codecs.open(somefile).read() returns a bytestring if no encoding is
> specified*), but a unicode string if an encoding is specified. io.open
> always returns a unicode string.

What version of Python are you using?

In Python 3, io.open and the built-in open are the same function. I believe
this is guaranteed, and not just an implementation detail.

The signatures and capabilities are quite different:

codecs.open:

open(filename, mode='rb', encoding=None, errors='strict', buffering=1)

io.open:

open(file, mode='r', buffering=-1, encoding=None,
         errors=None, newline=None, closefd=True, opener=None)

io.open does *not* always produce Unicode strings. If you pass 'rb' as the
mode, the file is opened in binary mode, not text mode, and the read()
method will return bytes.
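A quick way to see this for yourself is the sketch below. It assumes a throwaway temporary file (the path and the sample text are made up for illustration) and should behave the same on Python 2.6+ and Python 3:

```python
import io
import os
import tempfile

# Create a scratch file; the location is arbitrary and only for illustration.
fd, path = tempfile.mkstemp()
os.close(fd)

# Write some non-ASCII text, encoded as UTF-8.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'caf\xe9')

# Text mode: read() decodes the file and returns a Unicode string.
with io.open(path, 'r', encoding='utf-8') as f:
    text = f.read()

# Binary mode: read() returns the undecoded bytes.
with io.open(path, 'rb') as f:
    raw = f.read()

os.remove(path)
```

Here text is the Unicode string u'caf\xe9', while raw is the UTF-8 encoded byte string.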

As usual, help() in the interactive interpreter is your friend.
help(codecs.open) and help(io.open) will explain the many differences
between them, including that codecs.open always opens the file in binary
mode.
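One visible consequence of the binary-mode point is newline handling. The sketch below is only an illustration, using a temporary file with made-up contents: when an encoding is given, codecs.open hands back the line endings exactly as stored, whereas io.open in text mode with the default newline=None translates them to '\n'.

```python
import codecs
import io
import os
import tempfile

# Scratch file containing Windows-style line endings (contents are made up).
fd, path = tempfile.mkstemp()
os.close(fd)
with io.open(path, 'wb') as f:
    f.write(b'one\r\ntwo\r\n')

# codecs.open reads the underlying file in binary mode and only decodes it,
# so the '\r\n' pairs come through untouched.
with codecs.open(path, 'r', encoding='utf-8') as f:
    via_codecs = f.read()

# io.open in text mode applies universal newline translation by default.
with io.open(path, 'r', encoding='utf-8') as f:
    via_io = f.read()

os.remove(path)
```

If you need the codecs.open behaviour from io.open, passing newline='' turns the translation off.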

As for use-cases, I think that codecs.open is mostly a left-over from the
Python 2 days when the built-in open had a much simpler interface and fewer
capabilities. In Python 2, built-in open doesn't take an encoding argument,
so if you wanted to read or write text in a specific encoding rather than
raw byte strings, you were supposed to use codecs.open.

In Python 2.6, the io module was added to Python 2 to aid in porting to
Python 3. The docs say:

    New in version 2.6.

    The io module provides the Python interfaces to stream handling.
    Under Python 2.x, this is proposed as an alternative to the
    built-in file object, but in Python 3.x it is the default
    interface to access files and streams.

https://docs.python.org/2/library/io.html


To summarise:

* In Python 2, if you want to supply an encoding to open, use codecs.open
(before 2.6) or io.open (2.6 and later);

* If you want the enhanced capabilities of Python 3 open, use io.open;

* In Python 3, io.open is the same thing as built-in open;

* And codecs.open is (I think) mostly there for backwards compatibility.
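Putting the summary into practice, here is a rough sketch of code that reads text with an explicit encoding and runs unchanged on Python 2.6+ and Python 3. The helper name read_text and the sample data are my own inventions for the example, not anything from the standard library:

```python
import io
import os
import sys
import tempfile

def read_text(path, encoding='utf-8'):
    """Hypothetical helper: return the whole file as a Unicode string."""
    with io.open(path, 'r', encoding=encoding) as f:
        return f.read()

# Scratch file with Latin-1 encoded text (contents are made up).
fd, path = tempfile.mkstemp()
os.close(fd)
with io.open(path, 'w', encoding='latin-1') as f:
    f.write(u'na\xefve')

content = read_text(path, encoding='latin-1')
os.remove(path)

# Only on Python 3 are io.open and the built-in open the same object.
same_object = (io.open is open) if sys.version_info[0] >= 3 else None
```

On Python 3 you could write open instead of io.open throughout; spelling it io.open just keeps the code working on 2.6 and 2.7 as well.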




-- 
Steven



