[Python-3000] Google Sprint Ideas

Mon Aug 21 03:06:02 CEST 2006

On 8/20/06, Talin <talin at acm.org> wrote:
> Guido van Rossum wrote:
> > On 8/20/06, Paul Moore <p.f.moore at gmail.com> wrote:
>
> > Without endorsing every detail of his design, tomer filiba has written
> > several blog (?) entries about this, the latest being
> > http://sebulba.wikispaces.com/project+iostack+v2 . You can also look
> > at sandbox/sio/sio.py in svn.
>
> One comment after reading this: If we're going to re-invent the Java/C#
> i/o library, could we at least use the same terminology? In particular,
> the term "Layer" has connotations which may be confusing in this context
> - I would prefer something like "Adapter" or "Filter".

That's an example of what I meant when I said "without endorsing every detail".

I don't know which terminology C++ uses beyond streams. I think Java
uses Streams for the lower-level stuff and Reader/Writer for the
higher-level stuff -- or is it the other way around?

> Also, I notice that this proposal removes what I consider to be a nice
> feature of Python, which is that you can take a plain file object and
> iterate over the lines of the file -- it would require a separate line
> buffering adapter to be created. I think I understand the reasoning
> behind this - in a world with multiple text encodings, the definition of
> "line" may not be so simple. However, I would assume that the "built-in"
> streams would support the most basic, least-common-denominator encodings
> for convenience.

First time I noticed that. But perhaps it's the concept of "plain file
object" that changed? My own hierarchy (which I arrived at without
reading tomer's proposal) is something like this:

(1) Basic level (implemented in C) -- open, close, read, write, seek,
tell. Completely unbuffered, maps directly to system calls. Does
binary I/O only.

(2) Buffering. Implements the same API as (1) but adds buffering. This
is what one normally uses for binary file I/O. It builds on (1), but
can also be built on raw sockets instead. It adds an API to inquire
about the amount of buffered data, a flush() method, and ways to
change the buffer size.

(3) Encoding and line endings. Implements a somewhat different API,
for reading/writing text files; the API more closely resembles Python
2's I/O library. This is where readline() and next() (returning the
next line) are implemented. It also does newline translation between
the platform's native convention (CRLF or LF, or perhaps CR if anyone
still cares about Mac OS <= 9) and Python's convention (always \n). I
think I want to put these two features (encoding and line endings) in
the same layer because they are both text related. Of course you can
specify ASCII or Latin-1 to effectively disable the encoding part.
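
To make the layering concrete, here is a rough, read-only sketch of
how the three levels could stack. The class names (RawFile,
BufferedReader, TextReader), the pending() method and the chunk_size
parameter are made up for illustration; they are not taken from
tomer's proposal or from sandbox/sio/sio.py. Level (1) is written in
Python on top of the os module here, purely to keep the example short.

```python
import codecs
import os
import re


class RawFile:
    """(1) Unbuffered binary reads; each call maps to one OS call."""

    def __init__(self, path):
        # O_BINARY only exists on Windows; elsewhere it contributes 0.
        self.fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))

    def read(self, n):
        return os.read(self.fd, n)

    def seek(self, pos, whence=os.SEEK_SET):
        return os.lseek(self.fd, pos, whence)

    def tell(self):
        return os.lseek(self.fd, 0, os.SEEK_CUR)

    def close(self):
        os.close(self.fd)


class BufferedReader:
    """(2) Same read() API as (1), plus a buffer and a way to inspect it."""

    def __init__(self, raw, buffer_size=4096):
        self.raw = raw
        self.buffer_size = buffer_size
        self._buf = b""

    def pending(self):
        # Hypothetical name for "how much data is buffered right now".
        return len(self._buf)

    def read(self, n):
        # Refill from the raw layer until n bytes are available or EOF.
        while len(self._buf) < n:
            chunk = self.raw.read(max(self.buffer_size, n - len(self._buf)))
            if not chunk:
                break
            self._buf += chunk
        data, self._buf = self._buf[:n], self._buf[n:]
        return data

    def close(self):
        self.raw.close()


_EOL = re.compile(r"\r\n|\r|\n")


class TextReader:
    """(3) Decoding plus newline translation; lines always end in \\n."""

    def __init__(self, buffered, encoding="utf-8", chunk_size=4096):
        self.buffered = buffered
        self.chunk_size = chunk_size
        self._decoder = codecs.getincrementaldecoder(encoding)()
        self._pending = ""
        self._eof = False

    def _fill(self):
        chunk = self.buffered.read(self.chunk_size)
        if not chunk:
            self._eof = True
        self._pending += self._decoder.decode(chunk, final=self._eof)

    def readline(self):
        while True:
            m = _EOL.search(self._pending)
            # A "\r" at the very end may be the first half of a CRLF
            # pair, so wait for more data before treating it as a line end.
            dangling_cr = (m is not None and m.group() == "\r"
                           and m.end() == len(self._pending)
                           and not self._eof)
            if m and not dangling_cr:
                line = self._pending[:m.start()] + "\n"
                self._pending = self._pending[m.end():]
                return line
            if self._eof:
                # Last line without a terminator, or "" at end of file.
                line, self._pending = self._pending, ""
                return line
            self._fill()

    def __iter__(self):
        return self

    def __next__(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line

    def close(self):
        self.buffered.close()
```

With this stacking, TextReader(BufferedReader(RawFile(path))) can be
iterated line by line, while the two lower levels stay purely binary.
Writing, flush() and changing the buffer size are left out to keep the
sketch small.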

Does this make more sense?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)