[Python-3000] On PEP 3116: new I/O base classes

Thu Jun 21 02:54:17 CEST 2007

On 6/20/07, Bill Janssen <janssen at parc.com> wrote:
> Yes, of course, Daniel, but I was speaking of the contents of files,
> and files are inherently sequences of bytes.  If we are talking about
> some layer which interprets the contents of a file, just saying "give
> me N characters" isn't enough.  We need to say, "N characters assuming
> a text encoding of M, with a normalization policy of Q, and a newline
> policy of R".  If we don't, we can't just "read" N characters safely.
> So I think it's broken to put this in the TextIOBase class; instead,
> there should be some wrapper class that does buffering and can be
> configured as to (M, Q, R).

The PEP specifies that TextIOWrapper objects (the primary
implementation of the TextIOBase interface) are created via the
following signature:

    .__init__(self, buffer, encoding=None, newline=None)

In other words, TextIOBase *is* the wrapper type that does the
buffering and allows the user to configure M and R.
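
A minimal sketch of what that configuration looks like in practice,
assuming the io module as it eventually shipped in Python 3 (its
TextIOWrapper accepts more keyword arguments than the PEP signature
above; only encoding and newline are used here).  The sample bytes are
made up for illustration:

```python
import io

# Bytes as they might sit in a file: UTF-8 text with "\r\n" line endings.
raw = io.BytesIO("caf\u00e9\r\nna\u00efve\r\n".encode("utf-8"))

# M = encoding, R = newline policy.  newline=None selects universal
# newlines on input, so "\r\n" is translated to "\n" when reading.
text = io.TextIOWrapper(raw, encoding="utf-8", newline=None)

first = text.read(4)   # four characters, which is five bytes in UTF-8
rest = text.read()     # remainder, with line endings already translated
```

Here read(4) counts characters under the configured (M, R), not bytes.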

Are you suggesting that TextIOBase should be split into two classes,
one of which provides the (M, R) functionality and one of which does
not?  If so, how would the latter be different from the RawIOBase and
BufferedIOBase classes, already described in the PEP?

I'm not sure I 100% understand what you mean by "normalization policy"
(Q).  Could you give an example?

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC