Generator question

Pierre Reinbold preinbold at gmx.net
Thu Mar 14 05:04:04 EDT 2019


Wow, thank you Ian for this very detailed answer, and thank you for taking the
time for that! Much appreciated!

If I understand correctly, I have to somehow fix the value of a_list at each
iteration of the loop, as with the classic default-argument trick for lambdas
(damn, I should have thought of that!). So if I "unfold" the generator
expression into a generator function, using default values for both iterables,
I get this:

def flat_gen_cat_prod(lists):
    solutions = [[]]
    for a_list in lists:
        def new_solutions(l=a_list, s=solutions):
            for part_sol in s:
                for el in l:
                    yield part_sol+[el]
        solutions = new_solutions()
    return solutions

With this I get the right behavior! Thanks again!
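
As a quick sanity check (a sketch only; the comparison against
itertools.product is my own addition), the function can be compared with the
library version:

```python
from itertools import product

def flat_gen_cat_prod(lists):
    solutions = [[]]
    for a_list in lists:
        # Default arguments are evaluated when the def statement runs,
        # so each generator keeps its own a_list and its own solutions.
        def new_solutions(l=a_list, s=solutions):
            for part_sol in s:
                for el in l:
                    yield part_sol + [el]
        solutions = new_solutions()
    return solutions

lists = [[1, 2], [3, 4], [5, 6]]
result = list(flat_gen_cat_prod(lists))
# itertools.product yields tuples, so convert them to lists before comparing.
expected = [list(t) for t in product(*lists)]
print(result == expected)
```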

Does that mean there is no way to use a generator expression in this case?
(Generator expressions are sooo much more elegant :-))
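
One possible way to keep a generator expression, offered as a sketch rather
than a definitive answer: move it into a small helper function, so that a_list
is a local of that helper and is never rebound by the loop. The helper name
extend is made up for illustration.

```python
def extend(solutions, a_list):
    # a_list is local to this call, so the variable the generator
    # expression closes over is never rebound by the outer loop.
    return (part_sol + [el] for part_sol in solutions for el in a_list)

def flat_genexp_cat_prod(lists):
    solutions = [[]]
    for a_list in lists:
        solutions = extend(solutions, a_list)
    return solutions

result = list(flat_genexp_cat_prod([[1, 2], [3, 4], [5, 6]]))
print(result)
```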

Thanks again for the answer and helpful comments!


πr

Le 14/03/19 à 02:44, Ian Kelly a écrit :
> You're basically running into this:
> https://docs.python.org/3/faq/programming.html#why-do-lambdas-defined-in-a-loop-with-different-values-all-return-the-same-result
> 
> To see why, let's try disassembling your function. I'm using Python 3.5
> here, but it shouldn't make much of a difference.
> 
> py> import dis
> py> dis.dis(flat_genexp_cat_prod)
>   2           0 BUILD_LIST               0
>               3 BUILD_LIST               1
>               6 STORE_FAST               1 (solutions)
> 
>   3           9 SETUP_LOOP              39 (to 51)
>              12 LOAD_FAST                0 (lists)
>              15 GET_ITER
>         >>   16 FOR_ITER                31 (to 50)
>              19 STORE_DEREF              0 (a_list)
> 
>   4          22 LOAD_CLOSURE             0 (a_list)
>              25 BUILD_TUPLE              1
>              28 LOAD_CONST               1 (<code object <genexpr> at 0x73f31571ac90, file "<stdin>", line 4>)
>              31 LOAD_CONST               2 ('flat_genexp_cat_prod.<locals>.<genexpr>')
>              34 MAKE_CLOSURE             0
>              37 LOAD_FAST                1 (solutions)
>              40 GET_ITER
>              41 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
>              44 STORE_FAST               1 (solutions)
>              47 JUMP_ABSOLUTE           16
>         >>   50 POP_BLOCK
> 
>   5     >>   51 LOAD_FAST                1 (solutions)
>              54 RETURN_VALUE
> 
> Now, take a look at the difference between the instruction at address 22
> and the one at address 37:
> 
>   4          22 LOAD_CLOSURE             0 (a_list)
>              37 LOAD_FAST                1 (solutions)
> 
> The value of solutions is passed directly to the generator as an argument,
> which is the reason why building the generator up iteratively like this
> works at all: although the nested generators are evaluated lazily, each new
> generator that is constructed contains as its input a reference to the
> previous generator.
> 
> By contrast, the value of a_list is a closure. The contents of the closure
> are just whatever the value of a_list is when the generator gets evaluated,
> not when the generator was created. Since the entire nested generator
> structure is evaluated lazily, it doesn't get evaluated until list() is
> called after the function has returned. The value of the a_list closure at
> that point is the last value that was assigned to it: the list [5, 6] from
> the last iteration of the for loop. This same list value then gets used for
> all three nested generators.
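
A minimal sketch of that late-binding effect, with made-up names (make_gens,
factor), to check that I understand it:

```python
def make_gens():
    gens = []
    for factor in [1, 2, 3]:
        # [10] is the leftmost iterable and is evaluated right away;
        # factor is a closure variable, looked up only when the generator runs.
        gens.append(factor * x for x in [10])
    return gens

# By the time the generators are consumed, the loop has finished
# and factor holds its last value, 3.
results = [list(g) for g in make_gens()]
print(results)  # [[30], [30], [30]]
```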
> 
> So now why do solutions and a_list get treated differently like this? To
> answer this, look at this paragraph about generator expressions from the
> language reference:
> 
> """
> Variables used in the generator expression are evaluated lazily when the
> __next__() method is called for the generator object (in the same fashion
> as normal generators). However, the iterable expression in the leftmost for
> clause is immediately evaluated, so that an error produced by it will be
> emitted at the point where the generator expression is defined, rather than
> at the point where the first value is retrieved. Subsequent for clauses and
> any filter condition in the leftmost for clause cannot be evaluated in the
> enclosing scope as they may depend on the values obtained from the leftmost
> iterable. For example: (x*y for x in range(10) for y in range(x, x+10)).
> """
> 
> So, it's simply because the iterable expression in the leftmost for clause
> is treated differently from every other value in the generator expression.
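
A small sketch of that difference, using hypothetical names (demo, data,
factor): rebinding the leftmost iterable after the generator expression is
created has no effect, while rebinding any other variable does.

```python
def demo():
    data = [1, 2, 3]
    factor = 2
    gen = (x * factor for x in data)
    data = [10, 20]  # no effect: the original list was already handed to gen
    factor = 100     # takes effect: factor is looked up when gen runs
    return gen

result = list(demo())
print(result)  # [100, 200, 300]
```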
> 
> On Wed, Mar 13, 2019 at 3:49 PM Pierre Reinbold <preinbold at gmx.net> wrote:
> 
>> Dear all,
>>
>> I want to implement a function computing the Cartesian product of the
>> elements of a list of lists, but using generator expressions. I know that
>> it is already available in itertools, but it is for the sake of
>> understanding how things work.
>>
>> I already have a working recursive version, and I'm quite sure that this
>> iterative version used to work (at least in some Python 2.x):
>>
>> def flat_genexp_cat_prod(lists):
>>     solutions = [[]]
>>     for a_list in lists:
>>         solutions = (part_sol+[el] for part_sol in solutions
>>                      for el in a_list)
>>     return solutions
>>
>> But with Python 3.7.2, all I get is this:
>>
>> >>> list(flat_genexp_cat_prod([[1, 2], [3, 4], [5, 6]]))
>> [[5, 5, 5], [5, 5, 6], [5, 6, 5], [5, 6, 6], [6, 5, 5], [6, 5, 6],
>> [6, 6, 5], [6, 6, 6]]
>>
>> instead of:
>>
>> >>> list(flat_genexp_cat_prod([[1, 2], [3, 4], [5, 6]]))
>> [[1, 3, 5], [1, 3, 6], [1, 4, 5], [1, 4, 6], [2, 3, 5], [2, 3, 6],
>> [2, 4, 5], [2, 4, 6]]
>>
>> Using a list comprehension instead of a generator expression solves the
>> problem, but I can't understand why the version above fails.
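
For comparison, a sketch of the list-comprehension variant (the name
flat_listcomp_cat_prod is mine). The comprehension runs to completion inside
each loop iteration, while a_list still has that iteration's value, so the
late lookup never gets a chance to see a later list:

```python
def flat_listcomp_cat_prod(lists):
    solutions = [[]]
    for a_list in lists:
        # Evaluated eagerly: a_list is read here, during this iteration.
        solutions = [part_sol + [el] for part_sol in solutions for el in a_list]
    return solutions

result = flat_listcomp_cat_prod([[1, 2], [3, 4], [5, 6]])
print(result)
```

The price is that every intermediate list of partial solutions is built in
memory, which is what the generator version was meant to avoid.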
>>
>> Even stranger, when debugging I tried to use itertools.tee to duplicate
>> the solutions generators and have a look at them:
>>
>> from itertools import tee
>>
>> def flat_genexp_cat_prod(lists):
>>     solutions = [[]]
>>     for a_list in lists:
>>         solutions, debug = tee(
>>                 part_sol+[el] for part_sol in solutions for el in a_list)
>>         print("DEBUG", list(debug))
>>     return solutions
>>
>> And that version seems to work!
>>
>> >>> list(flat_genexp_cat_prod([[1, 2], [3, 4], [5, 6]]))
>> DEBUG [[1], [2]]
>> DEBUG [[1, 3], [1, 4], [2, 3], [2, 4]]
>> DEBUG [[1, 3, 5], [1, 3, 6], [1, 4, 5], [1, 4, 6], [2, 3, 5], [2, 3, 6],
>> [2, 4, 5], [2, 4, 6]]
>> [[1, 3, 5], [1, 3, 6], [1, 4, 5], [1, 4, 6], [2, 3, 5], [2, 3, 6],
>> [2, 4, 5], [2, 4, 6]]
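
A likely explanation, offered as a sketch: list(debug) drains the generator
expression inside the loop iteration, while a_list still has the current
value, and tee buffers the produced items for solutions. The same effect
without the print (flat_tee_cat_prod and probe are names I made up):

```python
from itertools import tee

def flat_tee_cat_prod(lists):
    solutions = [[]]
    for a_list in lists:
        solutions, probe = tee(
            part_sol + [el] for part_sol in solutions for el in a_list)
        # Consuming probe forces the generator expression to run now;
        # tee keeps the produced items buffered for solutions.
        list(probe)
    return solutions

result = list(flat_tee_cat_prod([[1, 2], [3, 4], [5, 6]]))
print(result)
```

So this version is effectively eager, much like the list comprehension.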
>>
>> Can you help me understand what I'm doing wrong?
>>
>> Thank you in advance,
>>
>>
>> πr
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>>



