[Python-3000] Draft PEP for New IO system

Tue Feb 27 11:57:27 CET 2007

On 26/02/07, Mike Verdone <mike.verdone at gmail.com> wrote:
> Daniel Stutzbach and I have prepared a draft PEP for the new IO system
> for Python 3000. This document is, hopefully, true to the info that
> Guido wrote on the whiteboards here at PyCon. This is still a draft
> and there's quite a few decisions that need to be made. Feedback is
> welcomed.

Generally, this looks nice. A couple of minor points:

> The new IO spec is intended to be similar to the Java IO libraries,
> but generally less confusing. Programmers who don't want to muck about
> in the new IO world can expect that the open() factory method will
> produce an object backwards-compatible with old-style file objects.

Documenting the revised open() factory in this PEP would be useful. It
needs to address encoding issues, so it's not a simple copy of the
existing open().

Also, should there be a factory method for opening raw byte streams?
Once we start down this route, we open the can of worms, of course
(does socket.socket need to be specified in terms of the new IO
layers? what about the mmap module, the gzip/zipfile/tarfile modules,
etc?) These should probably be noted in an "open issues" section, and
otherwise deferred for now.
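
To make the open() question concrete, here is a rough sketch of the three
kinds of object I have in mind: decoded text, buffered bytes, and a raw
unbuffered byte stream. The encoding keyword, the buffering=0 spelling for
"raw", and the layer class names are assumptions on my part (they match the
io module in current Python 3, not anything the draft commits to), and
open_three_ways is a made-up helper.

```python
import os
import tempfile


def open_three_ways(path):
    """Open the same file as text, buffered bytes, and raw bytes.

    Hypothetical helper for illustration only; returns the decoded text,
    the undecoded bytes, and the class name of each layer open() returned.
    """
    # Text layer: the factory has to be told (or guess) an encoding.
    with open(path, "r", encoding="utf-8") as text_f:
        text_type = type(text_f).__name__
        text = text_f.read()

    # Buffered byte layer: no decoding, so no encoding argument.
    with open(path, "rb") as buf_f:
        buf_type = type(buf_f).__name__
        data = buf_f.read()

    # Raw byte layer: assumed here to be spelled buffering=0.
    with open(path, "rb", buffering=0) as raw_f:
        raw_type = type(raw_f).__name__

    return text, data, (text_type, buf_type, raw_type)


# Write one non-ASCII character so the text/bytes difference is visible.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write("caf\u00e9\n".encode("utf-8"))

text, data, layers = open_three_ways(demo_path)
os.remove(demo_path)
```

The point is only that the text case needs an extra decision (the encoding)
that the two byte cases do not, which is why the old open() signature can't
simply be carried over.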

> The BufferedReader implementation is for sequential-access read-only
> objects.  It does not provide a .flush() method, since there is no
> sensible circumstance where the user would want to discard the read
> buffer.

It's not something I've done personally, but programs sometimes flush
a read buffer before (eg) reading a password from stdin, to avoid
typeahead problems. I don't know if that would be relevant here.

>     .readlinesiter()
>     .__iter__()

I was going to object to the name readlinesiter, but I see it's gone already :-)

> Another way to do it is as follows (we should pick one or the other):
>
>     .__init__(self, buffer, encoding=None, newline=None)
>
>        Same as above but if newline is not None use that as the
> newline pattern (for reading and writing), and if newline is not set
> attempt to find the newline pattern from the file and if we can't for
> some reason use the system default newline pattern.

I'm not sure that can work - the point of universal newlines is that
*any* of \n, \r or \r\n count as a newline, so there's no one pattern.
So I think that explicitly specifying universal newlines is necessary
(even though it's clunky).
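
A small sketch of why a single pattern falls short, assuming a text layer
whose newline argument behaves like io.TextIOWrapper in current Python 3
(None meaning universal newlines, a string meaning exactly that terminator).
That behaviour is an assumption relative to the draft, and read_lines is a
made-up helper.

```python
import io

# One buffer that mixes all three conventions: \n, \r and \r\n.
MIXED = b"one\ntwo\rthree\r\nfour"


def read_lines(data, newline):
    """Hypothetical helper: wrap bytes in a text layer and return its lines."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="ascii",
                               newline=newline)
    return wrapper.readlines()


# Universal newlines: every convention ends a line (and is read back as \n).
universal = read_lines(MIXED, None)

# A single explicit pattern: only \n ends a line, so the \r-terminated
# line is glued onto the one that follows it.
only_lf = read_lines(MIXED, "\n")
```

With the mixed input above, the universal reader sees four lines while the
single-pattern reader sees three, which is the case that "detect one pattern
from the file" cannot cover.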

Regards,
Paul.