[Python-Dev] Unicode input issues

Guido van Rossum guido@python.org
Mon, 10 Apr 2000 14:11:29 -0400


> > Since you're calling methods on the underlying file object anyway,
> > can't you avoid buffering by calling the *corresponding* underlying
> > method and doing the conversion on that?
> 
> The problem here is that Unicode has far more line
> break characters than plain ASCII. The underlying API would
> break on ASCII lines (or even worse on those CRLF sequences
> defined by the C lib), not the ones I need for Unicode.

Hm, can't we just use \n for now?
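
To make the trade-off concrete, here is a small sketch in present-day
Python (not code from this thread; the sample string is made up). It
compares splitting on "\n" alone with str.splitlines(), which also
treats the Unicode separators U+2028 and U+2029 as line boundaries:

```python
# Hypothetical sample text mixing several kinds of line breaks:
# "\n", LINE SEPARATOR (U+2028), PARAGRAPH SEPARATOR (U+2029), and CRLF.
text = "one\ntwo\u2028three\u2029four\r\nfive"

# Splitting on "\n" only: the Unicode separators stay embedded in the
# second chunk, and the "\r" of the CRLF pair is left dangling.
newline_only = text.split("\n")

# str.splitlines() breaks on all of the line boundaries above.
unicode_lines = text.splitlines()

print(newline_only)
print(unicode_lines)
```

Using only "\n" is simpler and matches what the underlying file object
already does, but text that relies on the other separators comes back
as over-long "lines".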

> BTW, I think that we may need a new Codec class layer
> here: .readline() et al. are all text based methods,
> while the Codec base classes clearly work on all kinds of
> binary and text data.

Not sure what you mean here.  Can you explain through an example?

--Guido van Rossum (home page: http://www.python.org/~guido/)