Generators.

Jorge Cardona jorgeecardona at gmail.com
Tue Dec 8 11:52:45 EST 2009


2009/12/8 Lie Ryan <lie.1296 at gmail.com>:
> First, I apologize for rearranging your message out of order.
>
> On 12/8/2009 5:29 AM, Jorge Cardona wrote:
>>>>
>>>> islice executes the function for every element of the generator and
>>>> drops the elements that aren't in the slice. I find that pretty
>>>> weird. The way I see generators is as an association between an
>>>> indexing set (an iterator or another generator) and a computation
>>>> indexed by that set. islice is a "transformation" on the indexing
>>>> set: the result of the function doesn't matter, so the slice should
>>>> act only on the indexing set. Other "transformations", such as
>>>> takewhile, act on the result, so there the function has to be
>>>> executed; but for islice, or any other transformation that acts only
>>>> on the indexing set, the function shouldn't be executed for every
>>>> element -- only for the set that results from applying the
>>>> transformation to the original one.
>>>
>>> That seems like extremely lazy evaluation; I don't know if even a
>>> truly lazy language does that. Python is a strict language, with a
>>> little laziness provided by generators, but in the end it is still a
>>> strict language.
>>>
>>
>> Yes, it looks like lazy evaluation, but I don't see why there couldn't
>> be better control over the iterable associated with a generator, even
>> though Python is a strict language; it would increase both its
>> functionality and its performance. Imagine passing a function that
>> takes 1 s to run, in a situation where for various reasons you can't
>> slice before building the generator (as in the smp function I want to
>> create): with the current islice the final performance is badly
>> reduced.
>> Just by formalising the separation between the transformations that act
>> on the index (islice, tee) and those that act on the result (dropwhile,
>> takewhile, etc.), the control would be fine-grained enough to improve
>> usability (that's how I see it right now), and you could combine
>> generators without losing performance.
>
> Theoretically yes, but the semantics of generators in Python is that
> they work on an Iterable (i.e. objects that have __iter__), not on a
> Sequence (i.e. objects that have __getitem__). That means that,
> semantically, a generator calls obj.__iter__(), then calls the
> iterator's __next__(), and performs its operation on each item the
> iterator returns.
>
> The lazy semantics would be hard to fit into the current generator model
> without changing it to require a Sequence that supports indexing.
>

Why?

The goal is to add a formal way to separate a generator's
transformations into those that act on the indexing set and those that
act on the result set.
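For concreteness, the behaviour I'm describing can be reproduced with a
plain generator expression; this sketch just records every call to f
(the names here are illustrative):

```python
from itertools import islice

calls = []

def f(x):
    calls.append(x)  # record every evaluation
    return x

g = (f(x) for x in range(10))         # function tied to an indexing set
result = list(islice(g, 0, None, 2))  # keep every second element

print(result)  # [0, 2, 4, 6, 8]
print(calls)   # [0, 1, ..., 9] -- f also ran for the dropped elements
```

islice has to advance the underlying iterator past the skipped indices,
so f is evaluated for all ten elements even though half are discarded.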

Well, a little (and not very elaborate) example could be:

from itertools import islice


class MyGenerator:
    def __init__(self, function, indexing_set):
        self._indexing_set = iter(indexing_set)
        self._function = function

    def indexing_set(self):
        # expose the indexing set without evaluating the function
        return iter(self._indexing_set)

    def result_set(self):
        # evaluation happens only here, when results are consumed
        return (self._function(x) for x in self.indexing_set())

    def function(self):
        return self._function


def f(x):
    print("eval: %d" % x)
    return x


def myslice(iterable, *args):
    # slice the indexing set only; the function is carried over untouched
    return MyGenerator(iterable.function(),
                       islice(iterable.indexing_set(), *args))


g = MyGenerator(f, range(10))
print(list(g.result_set()))

g = MyGenerator(f, range(10))
new_g = myslice(g, 0, None, 2)
print(list(new_g.result_set()))

which prints:

eval: 0
eval: 1
eval: 2
eval: 3
eval: 4
eval: 5
eval: 6
eval: 7
eval: 8
eval: 9
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
eval: 0
eval: 2
eval: 4
eval: 6
eval: 8
[0, 2, 4, 6, 8]

I don't see why a Sequence is needed to support the indexing; what is
needed is a way to separate the base components of the generator
(function, indexing_set).
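To make the separation concrete on top of that class, a result-set
transformation can be layered next to myslice; mytakewhile below is a
hypothetical counterpart that has to evaluate the function, while
myslice never does (a self-contained sketch, restating the class):

```python
from itertools import islice, takewhile

class MyGenerator:
    # minimal restatement of the class above, so this sketch runs on its own
    def __init__(self, function, indexing_set):
        self._indexing_set = iter(indexing_set)
        self._function = function

    def indexing_set(self):
        return iter(self._indexing_set)

    def result_set(self):
        return (self._function(x) for x in self.indexing_set())

    def function(self):
        return self._function

def myslice(g, *args):
    # index transformation: slices the indexing set, never calls the function
    return MyGenerator(g.function(), islice(g.indexing_set(), *args))

def mytakewhile(pred, g):
    # result transformation: the function must run so the predicate can be
    # tested, hence it wraps result_set() rather than the indexing set
    return takewhile(pred, g.result_set())

evals = []
def f(x):
    evals.append(x)
    return x * x

g = MyGenerator(f, range(10))
out = list(mytakewhile(lambda y: y < 25, myslice(g, 0, None, 2)))
print(out)    # [0, 4, 16] -- squares of 0, 2, 4
print(evals)  # [0, 2, 4, 6] -- only sliced indices, up to the first failure
```

The slice drops indices 1, 3, 5, 7, 9 without ever evaluating f, and the
takewhile stops evaluating as soon as 6**2 fails the predicate.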

>> Yes, it looks like lazy evaluation, but I don't see why there couldn't
>> be better control over the iterable associated with a generator, even
>> though Python is a strict language
>
> You can control the laziness by making it explicitly lazy:
>
> from functools import partial
> from itertools import islice
>
> def f(x):
>    print("eval: %d" % x)
>    return x
>
> X = range(10)
> g = (partial(f, x) for x in X)
>
> print(list(x() for x in islice(g, 0, None, 2)))
> # # or without partial (binding x early to avoid late-binding surprises):
> # g = ((lambda x=x: f(x)) for x in X)
> # print(list(f() for f in islice(g, 0, None, 2)))
>

The problem remains that I don't get to define the original generator:
my function receives a generator that has already been defined, so I
can't wrap its elements like that at that point.
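The thunk trick from the quoted snippet does work, but only at the point
where the generator is defined; this sketch shows the lazy variant, with
a comment marking where an already-built generator leaves no handle to
re-wrap (names are illustrative):

```python
from functools import partial
from itertools import islice

calls = []

def f(x):
    calls.append(x)
    return x

# Lazy variant: each element is a thunk, so islice drops thunks without
# ever calling f -- but this works only because *we* build the generator.
lazy = (partial(f, x) for x in range(10))
vals = [thunk() for thunk in islice(lazy, 0, None, 2)]
print(vals)   # [0, 2, 4, 6, 8]
print(calls)  # [0, 2, 4, 6, 8] -- only the kept indices were evaluated

# A function that is handed an already-built (f(x) for x in X) cannot
# re-wrap its elements like this: f runs as soon as it iterates.
```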

> In a default-strict language, you have to explicitly say if you want lazy
> execution.
>
>> What I want to do is a function that receives any kind of generator,
>> executes it on several cores (after a fork), and returns the data; so
>> I can't slice the set X before creating the generator.
>
> beware that a generator's contract is to return a valid iterator *once*
> only. You can use itertools.tee() to create more generators, but tee
> builds a list of the results internally.

Oh yes, I used tee at first, but then I noticed that I wasn't using the
same iterator in the same process; once the fork is made, each process
can use the initial generator without that problem, so tee is not
necessary in this case.
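For the smp case, the point of slicing the indexing set is that after
the fork each worker can take every n-th index at its own offset, so f
runs exactly once per element overall. A rough sketch with
multiprocessing (the worker/offset scheme here is my own illustration,
not a real API):

```python
from itertools import islice
from multiprocessing import Pool

def f(x):
    return x * x

def worker(args):
    # each worker slices the indexing set with its own offset, so no two
    # workers ever evaluate f on the same element
    offset, step, n = args
    return [f(x) for x in islice(range(n), offset, None, step)]

if __name__ == "__main__":
    step = 2  # number of workers
    with Pool(step) as pool:
        chunks = pool.map(worker, [(i, step, 10) for i in range(step)])
    print(chunks)  # [[0, 4, 16, 36, 64], [1, 9, 25, 49, 81]]
```

Each process rebuilds its own slice of the indexing set from plain
parameters, which is why tee (and its internal buffering) isn't needed.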




-- 
Jorge Eduardo Cardona
jorgeecardona at gmail.com
jorgeecardona.blogspot.com
------------------------------------------------
Linux registered user  #391186
Registered machine    #291871
------------------------------------------------


