streams (was: Re: Itertools)

Mike Rovner mike at nospam.com
Thu Aug 7 22:28:31 EDT 2003


Beni Cherniavsky wrote:
> Mike Rovner wrote on 2003-07-30:

I'm sorry for taking so long to answer (I had a project deadline to meet
;) - done).

>> Beni Cherniavsky wrote:
> Point taken.  But what do you call it then?  Lazy linked lists have
> only one single-word name that I'm aware of and that is "streams";
> there is no name for it in Tcl and Unix because it is almost never used
> in them, as far as I know.

I searched the internet extensively and am now convinced that the term
'streams' is globally associated with i/o. I still think using this term
for a universal data structure will be misleading. However, I can't come
up with a good word (my best is LazyList).

>>>     - `Cons` still requires lisp background but I don't know any
>>>       name that would be recognizable to non-lispers anyway.  And
>>>       it's not a bad name.
>>
>> Construct, Glue, Merge?
>> For readability, any whole word is better.
>>
> `Construct` is too long.  Compare with `str`, `int`, `dict` rather
> than `string`, `integer` and `dictionary`.

But they are all (long-lived) built-ins. AFAIK there is now a tendency
to make the language more readable (isinstance and enumerate, to name a few
hmm, recent additions).

> How would `Glue` and `Merge` be meaningful here?  The only
> associations of "glue" are what TeX puts between boxes and "glue
> layers" between different languages or libraries.  "Merge" sounds like
> the process of taking several sequences of values and interleaving
> them in some way (like in merge sort; BTW, it needs a lookahead of 1
> for each stream, so it should be convenient to implement with
> streams).  But this has nothing to do with what `Cons` does: creating
> a new stream by prepending given head to given tail.  `Cons` is the
> only term I know of that unambiguously refers to this operation, again
> coming from functional languages <wink>.
>
> There is another possibility: rename `Cons` to `Stream` and `Stream`
> to `LazyStream`; `Stream` would then be a factory function creating
> either depending on whether it got 1 or two arguments.  Would this be
> better?

No. In the discussion about list.insert() and/or range(), behavior that
differs depending on the arguments was pronounced A Bad Thing (IIRC).
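To make the two-constructor alternative concrete, here is a minimal sketch in
which prepending and lazy construction stay separate names. It assumes nothing
about Beni's actual module: `Cons`, `lazy_stream` and the `EMPTY` sentinel are
names made up for this illustration.

```python
EMPTY = None  # hypothetical sentinel marking the end of a stream


class Cons:
    """One cell of a lazy linked list: a head plus a tail computed on demand."""

    def __init__(self, head, tail):
        self.head = head
        # The tail is either another cell, EMPTY, or a zero-argument
        # callable that produces one of those when first needed.
        self._tail = tail

    @property
    def tail(self):
        # Force the tail on first access and cache the result, so that
        # later passes over the same cell see the same values.
        if callable(self._tail):
            self._tail = self._tail()
        return self._tail

    def __iter__(self):
        cell = self
        while cell is not EMPTY:
            yield cell.head
            cell = cell.tail


def lazy_stream(iterable):
    """Build a lazy linked list that pulls from `iterable` only as needed."""
    it = iter(iterable)

    def build():
        try:
            head = next(it)
        except StopIteration:
            return EMPTY
        return Cons(head, build)

    return build()
```

With this split, `Cons(0, lazy_stream(f))` always prepends and
`lazy_stream(f)` always wraps an iterable, so neither call changes meaning
with the number of arguments.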

>>>   - Anything else anybody needs?
>>
>> If I got the intent of Stream correctly, I wish to use them as Tcl
>> streams, so I'd need combining and filtering and i/o.
>>
> You can use a `Stream` or `StreamIter` over a file object to get
> lookahead and multiple passes over it, without requiring the file to
> be seekable!  That's one of the main uses.  It also gives you ability
> to "unread" any amount of data.  Note that if you simply apply it to a
> file the stream operates on a line at a time; you will need to wrap
> the file with a custom generator if you wish to operate at character
> level; working at token level might be the best compromise.
>
> I'm not familiar with Tcl streams, so I don't understand the
> "combining and filtering" reference - can you elaborate?
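The lookahead and "unread" behaviour described above can be sketched with a
small buffering wrapper. This is only an illustration, not the `Stream` or
`StreamIter` API: the class name `Lookahead`, its methods, and the `chars`
helper are all hypothetical, and the sketch does not cover multiple passes.

```python
from collections import deque


class Lookahead:
    """Iterator wrapper that supports peeking ahead and pushing items back."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buf = deque()  # items already pulled but not yet consumed

    def __iter__(self):
        return self

    def __next__(self):
        if self._buf:
            return self._buf.popleft()
        return next(self._it)

    def peek(self, n=1):
        """Return up to n upcoming items without consuming them."""
        while len(self._buf) < n:
            try:
                self._buf.append(next(self._it))
            except StopIteration:
                break
        return list(self._buf)[:n]

    def unread(self, item):
        """Push an item back so that it is the next one returned."""
        self._buf.appendleft(item)


def chars(fileobj):
    """Custom generator that turns a line-oriented file into characters."""
    for line in fileobj:
        yield from line
```

Wrapping a file directly, as in `Lookahead(f)`, works a line at a time, while
`Lookahead(chars(f))` works at character level, and neither requires the file
to be seekable.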

I didn't touch tcl for several years ;)
But the idea is pretty simple: to pass a file object through a 'filter' to get
'another' file(-like) object.

import tarfile
import compress as Z  # stand-in name for some decompression module
for line in tarfile.open('-','r',Z.open(filename)).extractiter():  # wished-for method
  ...

instead of

stream = tarfile.open('-','r',Z.open(filename))
for elem in stream.getmembers():
  for line in stream.extractfile(elem):
    ...

or something like that. I admit that the latter example, based on the new 2.3
syntax, fulfills my long-held expectations quite well.
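For what it's worth, the filter idea can be approximated with the standard
gzip and tarfile modules. The sketch below is only an approximation of what I
have in mind: the generator name `iter_tar_lines` is made up, gzip stands in
for whatever compression is actually used, and lines come back as bytes.

```python
import gzip
import tarfile


def iter_tar_lines(compressed):
    """Yield (member name, line) pairs from a gzip-compressed tar file object."""
    # First filter: gzip turns the compressed file object into a plain one.
    with gzip.GzipFile(fileobj=compressed, mode="rb") as plain:
        # Second filter: tarfile reads the uncompressed archive from it.
        with tarfile.open(fileobj=plain, mode="r:") as archive:
            for member in archive:
                if not member.isfile():
                    continue  # skip directories, links and so on
                for line in archive.extractfile(member):
                    yield member.name, line
```

so that the caller is left with a single loop:
`for name, line in iter_tar_lines(open(filename, 'rb')): ...`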

Thanks,
Mike







