streams (was: Re: Itertools)

Beni Cherniavsky cben at techunix.technion.ac.il
Thu Jul 31 09:12:59 EDT 2003


Mike Rovner wrote on 2003-07-30:

> Beni Cherniavsky wrote:
> > Any feedback welcome.  I'd like to make it as Pythonic as possible.
>
> > And perhaps make it standard.  Particular open questions:
>
> >   - The linked lists are called "streams" by me because they are lazy
> >     and that's what such beasts are called in functional languages.
>
> Stream is used differently in Tcl and Unix.  Minded separation
> Python from functional approach (deprecation of filter, map;
> doubtfulness of implementing tail recursion optimization) probably
> it is not so inspiring for regular python programmer.
>
Point taken.  But what do you call it then?  Lazy linked lists have
only one single-word name that I'm aware of, and that is "streams";
there is no name for them in Tcl and Unix because they are almost never
used there, as far as I know.

> Lazy "streams" are a good thing provided they can be easily used for
> organizing data paths and connections.
>
Python already provides the minimal construct for connecting producers
and consumers of data: the iterator protocol.  It is minimalistic and
becomes inconvenient when you want to use the same value more than one
time, for lookahead, backtracking or simply connecting the same source
to many consumers.  All this is quite easy with the head/tail access
to streams; it's a very powerful abstraction.  Since streams are
implemented as linked lists, they don't leak memory for infinite
iterators, unless you keep a reference to a fixed position in the
stream.  Over streams I've built an iterator which also provides the
same powers but more in line with Python's iterator protocol.
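
To make the head/tail idea concrete, here is a minimal sketch of a lazy
linked list over an iterator.  It is not the actual module: the class
layout and the names `is_empty` and `take` are assumptions made for
illustration only.

```python
import itertools

_UNFORCED = object()  # sentinel: this node has not been evaluated yet
_END = object()       # sentinel: the underlying iterator was exhausted


class LazyStream:
    """Hypothetical lazy linked list; each node pulls at most one item."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._head = _UNFORCED
        self._tail = None

    def _force(self):
        # Evaluate this node once; later accesses reuse the cached value.
        if self._head is _UNFORCED:
            try:
                self._head = next(self._source)
            except StopIteration:
                self._head = _END
            else:
                # The tail shares the same iterator and is forced on demand.
                self._tail = LazyStream(self._source)
            self._source = None

    def is_empty(self):
        self._force()
        return self._head is _END

    @property
    def head(self):
        if self.is_empty():
            raise IndexError("head of empty stream")
        return self._head

    @property
    def tail(self):
        if self.is_empty():
            raise IndexError("tail of empty stream")
        return self._tail


def take(stream, n):
    """Collect up to n items without disturbing other users of the stream."""
    out = []
    while n > 0 and not stream.is_empty():
        out.append(stream.head)
        stream = stream.tail
        n -= 1
    return out


s = LazyStream(itertools.count())  # an infinite source
first = take(s, 3)   # one consumer
again = take(s, 5)   # a second consumer starting from the same position
```

Here `s` is exactly the kind of "reference to a fixed position" mentioned
above: as long as it is kept, every node forced after it stays alive.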

> >     - `Cons` still requires lisp background but I don't know any name
> >       that would be recognizable to non-lispers anyway.  And it's not
> >       a bad name.
>
> Construct, Glue, Merge?
> Taken readability any whole word is better.
>
`Construct` is too long.  Compare with `str`, `int`, `dict` rather
than `string`, `integer` and `dictionary`.

How would `Glue` and `Merge` be meaningful here?  The only
associations of "glue" are what TeX puts between boxes and "glue
layers" between different languages or libraries.  "Merge" sounds like
the process of taking several sequences of values and interleaving
them in some way (like in merge sort; BTW, it needs a lookahead of 1
for each stream, so it should be convenient to implement with
streams).  But this has nothing to do with what `Cons` does: creating
a new stream by prepending a given head to a given tail.  `Cons` is the
only term I know of that unambiguously refers to this operation, again
coming from functional languages <wink>.
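
The difference between the two operations can be sketched in a few
lines.  This is an illustration only, not the module's code: the cell
below is eager, `None` is assumed to stand for the empty stream, and
`to_list` and `merge` are hypothetical helpers.

```python
class Cons:
    """A cell with a known head and an already-built tail."""
    __slots__ = ("head", "tail")

    def __init__(self, head, tail):
        self.head = head
        self.tail = tail


def to_list(stream):
    """Walk the cells until the empty stream (None, by assumption)."""
    out = []
    while stream is not None:
        out.append(stream.head)
        stream = stream.tail
    return out


def merge(a, b):
    """Interleave two sorted streams; only the two heads are inspected."""
    if a is None:
        return b
    if b is None:
        return a
    if a.head <= b.head:
        return Cons(a.head, merge(a.tail, b))
    return Cons(b.head, merge(a, b.tail))


rest = Cons(2, Cons(3, None))
full = Cons(1, rest)            # cons: prepend one head, share the tail
merged = merge(Cons(1, Cons(4, None)), rest)   # merge: interleave by value
```

Prepending leaves `rest` untouched and shared (`full.tail is rest`),
whereas merging builds a new sequence by comparing heads.  The recursive
`merge` is eager and only suitable for short inputs.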

There is another possibility: rename `Cons` to `Stream` and `Stream`
to `LazyStream`; `Stream` would then be a factory function creating
either depending on whether it got one or two arguments.  Would this be
better?
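
A rough sketch of how such a factory could dispatch, assuming stub
classes in place of the real ones (the `_MISSING` sentinel is an
assumption, used so that any object, including `None`, can be passed as
a tail):

```python
class Cons:
    """Stub: a cell with an explicit head and tail."""

    def __init__(self, head, tail):
        self.head = head
        self.tail = tail


class LazyStream:
    """Stub: a stream that would lazily draw items from an iterable."""

    def __init__(self, iterable):
        self._source = iter(iterable)


_MISSING = object()  # distinguishes "no tail given" from any real tail


def Stream(first, tail=_MISSING):
    """One argument: wrap an iterable.  Two arguments: prepend a head."""
    if tail is _MISSING:
        return LazyStream(first)
    return Cons(first, tail)


lazy = Stream([1, 2, 3])     # one argument  -> LazyStream
cell = Stream(0, lazy)       # two arguments -> Cons
```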

> >     - I called the operation for creating new iterator from same place
> >       "forking".  Alternative names: `split`, `copy`, `clone`, `dup`,
> >       "lookahead", etc.  What's best?
>
> copy is already using construct in python. Why not use it?
>
Copy was advised against by Erik Max Francis, see:

http://groups.google.com/groups?selm=3EEF7CA0.623C201C%40alcyone.com

However, it really does copy the iterator; the copy is just very fast.
So I think I will indeed name this `.copy()` (aliased as `.__copy__`).
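
Why the copy is cheap can be seen in a small sketch.  This is a
hypothetical reimplementation, not the module itself: the iterator only
holds a reference to its current position in a lazily filled linked
list, so copying it copies one reference.  The `_Node` class and the
`_node` parameter are assumptions.

```python
import copy


class _Node:
    """One lazily filled cell of the shared linked list."""
    __slots__ = ("value", "next", "source")

    def __init__(self, source):
        self.source = source  # shared underlying iterator
        self.value = None
        self.next = None

    def force(self):
        if self.next is None:
            value = next(self.source)  # StopIteration propagates at the end
            self.value = value
            self.next = _Node(self.source)
            self.source = None


class StreamIter:
    """Iterator whose position can be copied in constant time."""

    def __init__(self, iterable, _node=None):
        self._node = _node if _node is not None else _Node(iter(iterable))

    def __iter__(self):
        return self

    def __next__(self):
        node = self._node
        node.force()
        self._node = node.next
        return node.value

    def copy(self):
        # Only the current position is duplicated; the cells are shared.
        return StreamIter((), _node=self._node)

    __copy__ = copy


it = StreamIter("abcde")
first = next(it)            # consume one item
fork = it.copy()            # constant time: shares the rest of the list
fork_items = list(fork)     # running the copy ahead...
second = next(it)           # ...does not disturb the original
rest = list(copy.copy(it))  # copy.copy() goes through __copy__
```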

> >   - Anything else anybody needs?
>
> If I got the intent of Stream correctly, I wish to use them as Tcl streams,
> so I'd need combining and filtering and i/o.
>
You can use a `Stream` or `StreamIter` over a file object to get
lookahead and multiple passes over it, without requiring the file to
be seekable!  That's one of the main uses.  It also gives you the
ability to "unread" any amount of data.  Note that if you simply apply
it to a file, the stream operates on a line at a time; you will need to
wrap the file with a custom generator if you wish to operate at
character level; working at token level might be the best compromise.
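
The character-level wrapper could look something like the sketch below.
The generator name `chars` is an assumption, `io.StringIO` stands in for
a real file, and `itertools.tee` is used here only as a stand-in for the
stream-based lookahead, not as the module's own mechanism.

```python
import io
import itertools


def chars(fileobj):
    """Yield the file's contents one character at a time."""
    for line in fileobj:   # files iterate line by line by default
        yield from line


# Iterating the file directly gives lines.
lines = list(io.StringIO("ab\ncd\n"))

# Wrapping it gives characters, which can then be looked ahead into
# without seeking in the file.
main, lookahead = itertools.tee(chars(io.StringIO("ab\ncd\n")))
peeked = [next(lookahead), next(lookahead)]  # peek two characters ahead
consumed = list(main)                        # main still sees everything
```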

I'm not familiar with Tcl streams, so I don't understand the
"combining and filtering" reference - can you elaborate?

-- 
Beni Cherniavsky <cben at tx.technion.ac.il>

Put a backslash at the evening to continue hacking onto the next day.




