[issue5323] document expected/required behavior of 3.x io subsystem with respect to buffering

R. David Murray report at bugs.python.org
Sat Feb 21 06:01:48 CET 2009


R. David Murray <rdmurray at bitdance.com> added the comment:

Heh, I was reading the code instead of the documentation.  Silly me :).

The 'buffering' argument of open and the 'line_buffering' argument of
TextIOWrapper do address my documentation concern about whether
readline can be used to read lines from a source without blocking as
long as at least one line is available.  However, I do not see any
documentation of the relationship between readline and next, apart from
the absence of any documented restriction on mixing them.  Since
there used to be such a restriction, noting that there isn't one is
probably worthwhile.  Or, perhaps better, specify that '__next__'
calls 'readline'.  (And the behavior difference with respect to tell,
which I don't fully understand, should also be documented.)
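
To make the readline/next/tell point concrete, here is a minimal sketch,
assuming a current CPython 3 release and using an in-memory BytesIO as a
stand-in for a real file.  It mixes next() and readline() on one stream
and then checks whether tell() is refused while iteration is in progress.
The variable names are mine, not anything from the io docs.

```python
import io

# In-memory stand-in for a file: TextIOWrapper over a seekable bytes buffer.
stream = io.TextIOWrapper(io.BytesIO(b"one\ntwo\nthree\n"), encoding="utf-8")

first = next(stream)         # iterator protocol
second = stream.readline()   # explicit readline, mixed in freely
third = next(stream)         # back to the iterator protocol

# Iteration has started but has not yet hit StopIteration, so tell()
# is expected to be refused here.
try:
    stream.tell()
    tell_disabled = False
except OSError:
    tell_disabled = True
```

The lines come back in order whichever call is used, which is the
"no restriction on mixing" behavior; the OSError from tell() is the
difference that I think deserves a sentence in the docs.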

TextIOWrapper says it wraps a BufferedIOBase object.  The doc for that
class talks about read possibly doing readahead for a 'non-interactive'
file.  What about pipes?  They are not mentioned, and it would be good
to have short reads on pipes for the same reason that it is good to have
short reads on ttys.  IOBase specifies only an 'isatty' method.
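
A small sketch of the pipe case, assuming a current CPython 3 release:
one complete line is written to a pipe whose write end stays open, and
readline on the read end is expected to return that line rather than
wait for a full buffer or for EOF.

```python
import os

read_fd, write_fd = os.pipe()
os.write(write_fd, b"first line\n")   # writer stays open, so no EOF yet

# open() wraps the descriptor in FileIO -> BufferedReader -> TextIOWrapper.
with open(read_fd, "r", encoding="utf-8") as reader:
    line = reader.readline()          # returns once a full line is available
    is_tty = reader.isatty()          # a pipe is not a tty

os.close(write_fd)
```

Note that isatty() is false here, so whatever makes this work is not
keyed off isatty.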

It took me a while to understand what was going on well enough to write
the above paragraphs, partly because I read the code first :).  I'm
finding the documentation confusing, so I might as well talk about the
issues I ran into.

The implementation of TextIOWrapper calls the buffer's 'read1' method in
_read_chunk.  But BufferedIOBase does not define read1, nor does its
base class, IOBase.  open does pass TextIOWrapper a BufferedReader
object which does define read1, which is why it works, but I don't see
any documentation saying that the buffer argument to TextIOWrapper must
provide a 'read1' method with certain semantics.  The docs indicate
BufferedIOBase is the minimum requirement.  Is the definition of 'read1'
missing from BufferedIOBase?
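
The dependence on read1 can be observed without reading the
implementation.  The sketch below assumes a current CPython 3 release;
RecordingBuffer is a hypothetical helper of mine that subclasses BytesIO
(which does provide read1) and records every read1 call made on it.

```python
import io

class RecordingBuffer(io.BytesIO):
    """BytesIO that records the size argument of each read1 call (illustrative)."""

    def __init__(self, data):
        super().__init__(data)
        self.read1_sizes = []

    def read1(self, size=-1):
        self.read1_sizes.append(size)
        return super().read1(size)

buffer = RecordingBuffer(b"hello\nworld\n")
text = io.TextIOWrapper(buffer, encoding="utf-8")
line = text.readline()

# If TextIOWrapper reads through read1, at least one call was recorded.
used_read1 = len(buffer.read1_sizes) > 0
```

So any object handed to TextIOWrapper as its buffer is, in practice,
being asked for read1, whatever the docs say the minimum requirement is.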

I also notice that while open passes TextIOWrapper an appropriate value
for line_buffering (assuming we ignore the pipe issue), TextIOWrapper
ignores it for reading.  I suppose that is only an implementation
detail.
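
For comparison, line_buffering does have a visible effect on the write
side.  A minimal sketch, assuming a current CPython 3 release; the helper
name is hypothetical.  It writes one line through a TextIOWrapper over a
BufferedWriter and looks at what has reached the underlying BytesIO
before any explicit flush.

```python
import io

def bytes_before_flush(line_buffering):
    """Return what has reached the raw BytesIO right after one write()."""
    raw = io.BytesIO()
    buffered = io.BufferedWriter(raw)
    text = io.TextIOWrapper(buffered, encoding="utf-8", newline="\n",
                            line_buffering=line_buffering)
    text.write("hello\n")
    # No explicit flush: only line_buffering can have pushed the data down.
    return raw.getvalue()

eager = bytes_before_flush(True)    # newline triggers a flush
lazy = bytes_before_flush(False)    # data still sits in a buffer
```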

After staring at it for a while, I finally came to an understanding of
the statement in BufferedIOBase that "A typical implementation should
not inherit from a RawIOBase implementation, but wrap one like
BufferedWriter and BufferedReader."  I understand this to mean that
BufferedIOBase is really an ABC and should be concretized as a wrapper
rather than doing multiple inheritance with RawIOBase.  I don't think I
would have figured that out without reading the code, though.  I believe
that adding the word 'do' at the end of that sentence would make the
meaning clear.
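
A sketch of the "wrap, don't inherit" reading, assuming a current
CPython 3 release.  ChunkedRaw is a hypothetical raw stream of mine that
returns at most four bytes per read; BufferedReader wraps it rather than
inheriting from it.

```python
import io

class ChunkedRaw(io.RawIOBase):
    """Hypothetical raw stream that hands out at most 4 bytes per read."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Copy the next short chunk into the caller's buffer; 0 means EOF.
        chunk = self._data[self._pos:self._pos + min(4, len(b))]
        b[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)

raw = ChunkedRaw(b"spam and eggs\n")
buffered = io.BufferedReader(raw)   # a wrapper, not a subclass of the raw class
data = buffered.read()              # reads until the raw stream reports EOF
```

The buffered object is a BufferedIOBase but not a RawIOBase, and it
keeps the raw stream reachable as its 'raw' attribute.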

A similarly confusing bit is the first sentence of the documentation of
TextIOWrapper.  Currently it says "A buffered text stream over a
BufferedIOBase raw stream..."  I think it would be clearer if it said
something like "A buffered text stream wrapper around a
BufferedIOBase-derived wrapper around a raw stream..."  That's a bit
unwieldy, but it is nonetheless (IMO) more comprehensible.

I will look at the tests more carefully and make sure that all my use
cases are in fact covered.

--RDM

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5323>
_______________________________________

