[Python-Dev] Better text processing support in py2k?

M.-A. Lemburg mal@lemburg.com
Wed, 29 Dec 1999 17:55:21 +0100


Andy Robinson wrote:
> 
> --- Skip Montanaro <skip@mojam.com> wrote:
> >     fast/memory-intensive/clear
> >     slow/memory-conserving/not-as-clear
> >     fast/memory-conserving/fairly-muddy
> >
> > Any particular reason that the readline method can't
> > return an iterator that
> > supports __getitem__ and buffers input?  (Again,
> > remember this is for py2k,
> > so the potential breakage such a change might cause
> > is a consideration, but
> > not a showstopper.)
> 
> Why not generalize fileinput to do buffering instead?
> 
> More generally, Java has the notion of 'stackable
> streams' - e.g. construct a 'BufferedFile' around a
> 'File', maybe construct a 'Line-oriented file' around
> that etc.  Each one takes a file-like object as an
> argument to the constructor.  Things you might want to
> do:
> - buffering
> - international encoding conversions
> - line delimiters other than CR/LF/CRLF
> - read/write Python objects (i.e. use pickle/marshal)
> - easy interfaces to parsers

If all goes well we'll have something like this
in Python 1.6, at least for the encoding/decoding
part of file reading and writing. You basically take
a file object and then wrap some StreamCodecs around
it to get the functionality you need. Very simple
and very intuitive.
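
A minimal sketch of that wrapping idea, written against the codecs
module as it exists in current Python. This is an assumption: the
plan above only promises "something like this", so the function names
used here (codecs.getwriter, codecs.getreader) are not taken from this
message. An in-memory io.BytesIO stands in for a real file object.

```python
import codecs
import io

# A byte-oriented file-like object; a real file opened in binary
# mode could be used in the same way.
raw = io.BytesIO()

# Wrap a stream writer around it: text written here is encoded to UTF-8
# before it reaches the underlying object.
writer = codecs.getwriter("utf-8")(raw)
writer.write("Gr\u00fc\u00dfe\nsecond line\n")

# Rewind, then wrap a stream reader around the same object so that
# bytes are decoded back to text on the way out.
raw.seek(0)
reader = codecs.getreader("utf-8")(raw)
lines = reader.readlines()  # decoded lines, line endings kept
```

The underlying object never needs to know about encodings; swapping
"utf-8" for another codec name changes the behaviour without touching
the code that reads or writes.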

> This took me a couple of hours to get used to (and at
> the time I thought 'Yuk!' when I first saw four
> nested constructors), but gives you very precise
> control and a lot of versatility when handling files.
> It's an idiom Python does not use much but maybe it
> should.
> 
> I'd argue that maybe some enhancements to fileinput.py
> - adding some streams to provide building blocks for
> these operations - would get us the power you want and
> a lot more versatility besides.
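
To illustrate the stacking idiom from the quoted text, here is a rough
sketch in Python. Both class names (UpperCaseReader, DelimitedReader)
and their parameters are hypothetical, invented for this example; they
are not part of fileinput or any other standard module. Each wrapper
takes a file-like object in its constructor, so they can be nested.

```python
import io


class UpperCaseReader:
    """Hypothetical wrapper: transforms text as it is read."""

    def __init__(self, stream):
        self.stream = stream

    def read(self, size=-1):
        return self.stream.read(size).upper()


class DelimitedReader:
    """Hypothetical wrapper: iterates over records separated by an
    arbitrary delimiter, reading the source in fixed-size chunks."""

    def __init__(self, stream, delimiter=";", chunk_size=4096):
        self.stream = stream
        self.delimiter = delimiter
        self.chunk_size = chunk_size

    def __iter__(self):
        pending = ""
        while True:
            chunk = self.stream.read(self.chunk_size)
            if not chunk:
                break
            pending += chunk
            # Everything before the last delimiter is complete; the
            # tail may be a partial record, so keep it for next time.
            *records, pending = pending.split(self.delimiter)
            yield from records
        if pending:
            yield pending


# Nest the constructors: record splitting around case conversion
# around the underlying (here in-memory) file object.
source = io.StringIO("alpha;beta;gamma")
records = list(DelimitedReader(UpperCaseReader(source), chunk_size=4))
```

The small chunk_size is only there to show that records spanning
several reads are reassembled correctly.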

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Get ready to party !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/