proposed PEP: iterator splicing

Anton Vredegoor anton.vredegoor at gmail.com
Sun Apr 15 04:23:33 EDT 2007


Paul Rubin wrote:

>     def some_gen():
>        ...
>        yield *some_other_gen()
> 
> comes to mind.  Less clutter, and avoids yet another temp variable
> polluting the namespace.
> 
> Thoughts?

Well, not directly related to your question, but maybe these are some 
ideas that would help determine what we think generators are and what we 
would like them to become.

I'm currently also fascinated by the new generator possibilities, for 
example sending a value back into the generator by making yield return 
a value. What I would like to use it for is the case where I have a very 
long generator and need just a slice of the values. Currently that means 
running through a loop, discarding all the values until the generator is 
in the desired state, and only then starting to do something with the 
output. Instead I would like to set or 'wind' a generator directly into 
some specific state, the way one can seek in a file. That would require 
some convention for generators to signal their length (like xrange):

 >>> it = xrange(100)
 >>> len(it)
100
 >>>

Note: xrange didn't create a list, but it has a length!

Also we would need some convention for a generator to signal that it can 
jump to a certain state without computing all previous values. That 
means the computations are independent and could for example be 
distributed across different processors or threads.
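
To make the 'winding' idea concrete, here is a minimal sketch of an 
iterator that knows its length and can be set to any position. It 
assumes every value depends only on its index; the names WindableGen 
and seek are made up for illustration and are not an existing API.

```python
class WindableGen(object):
    """Iterator over func(0) .. func(length - 1) that can be wound to any
    position, like a file. Hypothetical helper: it assumes each value
    can be computed from its index alone."""

    def __init__(self, func, length):
        self.func = func
        self.length = length
        self.pos = 0

    def __len__(self):
        # Total length of the sequence, not the number of items left.
        return self.length

    def seek(self, pos):
        # Jump straight to a state without computing earlier values.
        if not 0 <= pos <= self.length:
            raise ValueError("position out of range")
        self.pos = pos

    def __iter__(self):
        return self

    def __next__(self):
        if self.pos >= self.length:
            raise StopIteration
        value = self.func(self.pos)
        self.pos += 1
        return value

    next = __next__  # Python 2 spelling of the same method


squares = WindableGen(lambda i: i * i, 100)
squares.seek(97)            # skip 97 values without computing them
last_three = list(squares)  # only the values at indices 97, 98, 99
```

Because seek only stores an index, separate instances could each be 
wound to a different position and run independently.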

 >>> it = range(100)
 >>> it[50:]
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 
86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
 >>>

But currently this doesn't work for xrange:

 >>> it = xrange(100)
 >>> it[50:]
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: sequence index must be integer
 >>>

Even though xrange *could* know somehow what its slice would look like.
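
As a rough sketch of how such a slice could be computed without building 
a list: the arguments of the sliced range follow from the original 
arguments and the slice alone. The function name slice_range_args is 
hypothetical, and a nonzero step is assumed.

```python
def slice_range_args(start, stop, step, s):
    """Return the (start, stop, step) arguments describing the slice s of
    the arithmetic sequence range(start, stop, step). Assumes step != 0."""
    # Number of items in the original sequence.
    if step > 0:
        n = max(0, (stop - start + step - 1) // step)
    else:
        n = max(0, (start - stop - step - 1) // -step)
    # Let the slice object normalise itself against that length ...
    i0, i1, di = s.indices(n)
    # ... then map index positions back to sequence values.
    return start + i0 * step, start + i1 * step, step * di


args = slice_range_args(0, 100, 1, slice(50, None))   # (50, 100, 1)
```

Passing args to xrange would then give the lazy equivalent of 
range(100)[50:].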

Another problem I have is with the itertools module:

 >>> itertools.islice(g(),10000000000000000)
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
ValueError: Stop argument must be a non-negative integer or None.
 >>>

I want islice and related functions to accept long integers for 
indexing. But of course this only makes sense once generator slices are 
possible. Otherwise the only way to reach such big numbers is to loop 
silently through the sequence until one reaches the point one is 
interested in, which is not practical.
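
For illustration, a counting version of islice that takes indices of any 
size is easy to write, but it shows the problem: it still has to step 
through every skipped item. This is only a sketch; long_islice is a 
made-up name, and the next() builtin is assumed to be available.

```python
def long_islice(iterable, start, stop=None):
    """Yield items start .. stop-1 of iterable, counting with an ordinary
    integer so the indices may be arbitrarily large. Hypothetical helper,
    not part of itertools."""
    it = iter(iterable)
    i = 0
    while stop is None or i < stop:
        try:
            item = next(it)
        except StopIteration:
            return
        if i >= start:
            yield item
        # Skipped items are still computed one by one, which is the cost
        # that real generator slices would avoid.
        i += 1


tail = list(long_islice(range(5), 2, 10 ** 30))   # a huge stop is accepted
```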

I have also thought about the thing you are proposing. There is 
itertools.chain of course, but that only works when one can call it from 
'outside' the generator.
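
To spell out the difference, here is a small sketch with made-up 
generators inner and outer: chain joins whole iterators from the 
outside, while splicing in the middle of a generator body currently 
needs an explicit loop.

```python
from itertools import chain


def inner():
    yield 1
    yield 2


def outer():
    yield 0
    # The splice that the proposed `yield *inner()` would abbreviate.
    for x in inner():
        yield x
    yield 3


spliced = list(outer())                 # inner's values appear mid-stream
joined = list(chain(inner(), inner()))  # chain only concatenates from outside
```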

Suppose one wants to generate all unique permutations of something. One 
idea would be to sort the sequence and then generate successors until 
one reaches the point where the sequence is completely reversed. But 
what if one wants to start with the actual state the sequence is in? One 
could generate successors until one reaches the 'end' and then continue 
by generating successors from the 'beginning' until one reaches the 
original state. Note that by changing the cmp function this generator 
could also iterate in reverse from any point. There would only need to 
be a way to change the cmp function of a running generator instance.

from operator import ge,le
from itertools import chain

def mutate(R,i,j):
     # Swap R[i] and R[j], then reverse everything after position i.
     a,b,c,d,e = R[:i],R[i:i+1],R[i+1:j],R[j:j+1],R[j+1:]
     return a+d+(c+b+e)[::-1]

def _pgen(L, cmpf = ge):
     R = L[:]
     yield R
     n = len(R)
     if n >= 2:
         while True:
             i,j = n-2,n-1
             while cmpf(R[i],R[i+1]):
                 i -= 1
                 if i == -1:
                     return
             while cmpf(R[i],R[j]):
                 j -= 1
             R = mutate(R,i,j)
             yield R

def perm(L):
     F = _pgen(L)        # L itself, then its successors
     B = _pgen(L,le)     # L itself, then its predecessors
     B.next()            # skip L, which F already yields
     return chain(F,B)

def test():
     P = '12124'
     g = perm(P)
     for i,x in enumerate(g):
         print '%4i) %s' %(i, x)

if __name__ == '__main__':
     test()
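
As for changing the cmp function of a running generator, the new send() 
method might be one way to do it. The following is only a sketch under 
that assumption: successor and steerable are made-up names, and the 
stepping logic is restated so that the block stands on its own.

```python
from operator import ge, le


def successor(R, cmpf):
    """Return the arrangement that follows R under cmpf, or None when R
    is the last one. With ge this steps forward, with le backward."""
    n = len(R)
    i = n - 2
    while i >= 0 and cmpf(R[i], R[i + 1]):
        i -= 1
    if i < 0:
        return None
    j = n - 1
    while cmpf(R[i], R[j]):
        j -= 1
    # Swap R[i] and R[j], then reverse everything after position i.
    return R[:i] + R[j:j + 1] + (R[i + 1:j] + R[i:i + 1] + R[j + 1:])[::-1]


def steerable(L, cmpf=ge):
    """Yield arrangements of L; a comparison function passed in with
    send() replaces cmpf for all following steps."""
    R = L[:]
    while R is not None:
        new_cmpf = yield R
        if new_cmpf is not None:
            cmpf = new_cmpf
        R = successor(R, cmpf)


g = steerable('123')
first = next(g)       # the starting arrangement
second = next(g)      # one step forward
back = g.send(le)     # reverse direction: one step backward again
```

Each send() both switches the direction and returns the next value, so 
the caller can turn the generator around at any point.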

A.






