itertools: problem with nested groupby, list()

Nico Schlömer nico.schloemer at gmail.com
Tue May 4 07:36:06 EDT 2010


> Does this example help at all?

Thanks, that clarified things a lot!

To make it easier, let's just look at 'a' and 'b':


> my_list.sort( key=itemgetter('a','b','c') )
> for a, a_iter in groupby(my_list, itemgetter('a')):
>    print 'New A', a
>    for b, b_iter in groupby(a_iter, itemgetter('b')):
>        print '\t', 'New B', b
>        for b_data in b_iter:
>            print '\t'*3, a, b, b_data
>        print '\t', 'End B', b
>    print 'End A', a

That works well, and I can wrap the outer loop in another loop without
problems. What's *not* working, though, is having more than one pass
on the inner loop, as in

=============================== *snip* ===============================
my_list.sort( key=itemgetter('a','b','c') )
for a, a_iter in groupby(my_list, itemgetter('a')):
   print 'New A', a
   for pass_name in ['first pass', 'second pass']:
       for b, b_iter in groupby(a_iter, itemgetter('b')):
           print '\t', 'New B', b
           for b_data in b_iter:
               print '\t'*3, a, b, b_data
           print '\t', 'End B', b
       print 'End A', a
=============================== *snap* ===============================
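
A likely explanation: a_iter is a one-shot iterator, so the first pass consumes it and the second pass finds nothing left to group. A minimal sketch of one way around this, assuming the rows of each 'a' group fit in memory, is to copy a_iter into a list before the passes. It is written for Python 3 (print as a function), unlike the Python 2 snippets in this thread; the sample data and the names a_rows, pass_name and seen are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data, not taken from the original post.
my_list = [
    {'a': 1, 'b': 1, 'c': 3},
    {'a': 1, 'b': 2, 'c': 4},
    {'a': 2, 'b': 1, 'c': 5},
]
my_list.sort(key=itemgetter('a', 'b', 'c'))

seen = []  # (pass label, a, b, c) for every row visited
for a, a_iter in groupby(my_list, itemgetter('a')):
    # Materialize the group once; unlike a_iter, a list can be re-iterated.
    a_rows = list(a_iter)
    for pass_name in ['first pass', 'second pass']:
        # A fresh groupby over the list is built for each pass.
        for b, b_iter in groupby(a_rows, itemgetter('b')):
            for b_data in b_iter:
                seen.append((pass_name, a, b, b_data['c']))
                print(pass_name, a, b, b_data)
```

With this change both passes visit every row of the current 'a' group.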

I tried working around this by

=============================== *snip* ===============================
my_list.sort( key=itemgetter('a','b','c') )
for a, a_iter in groupby(my_list, itemgetter('a')):
   print 'New A', a
   inner_list =  list( groupby(a_iter, itemgetter('b')) )
   for pass_name in ['first pass', 'second pass']:
       for b, b_iter in inner_list:
           print '\t', 'New B', b
           for b_data in b_iter:
               print '\t'*3, a, b, b_data
           print '\t', 'End B', b
       print 'End A', a
=============================== *snap* ===============================

which doesn't work either, and I don't understand why. -- I'll look at
Uli's comments.
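
A probable cause, going by the itertools documentation: each group returned by groupby shares the underlying iterator with the groupby object, so a group is no longer visible once groupby has advanced past it. Calling list() on the groupby object advances it all the way to the end before any group has been read. A hedged sketch of one alternative is to copy each group into a list while it is still the current one. It is written for Python 3; the sample data and the names rows and stale are assumptions for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data, already sorted by 'b'.
rows = [
    {'a': 1, 'b': 1, 'c': 3},
    {'a': 1, 'b': 1, 'c': 4},
    {'a': 1, 'b': 2, 'c': 5},
]

# list() runs groupby to the end first, so by the time the groups are
# read, earlier ones have already been invalidated.
stale = [(b, list(g)) for b, g in list(groupby(rows, itemgetter('b')))]

# Copying each group while it is current keeps its rows.
inner_list = [(b, list(g)) for b, g in groupby(rows, itemgetter('b'))]

for pass_name in ['first pass', 'second pass']:
    for b, b_rows in inner_list:
        for b_data in b_rows:
            print(pass_name, b, b_data)
```

Here inner_list holds plain lists, so it can be iterated as many times as needed.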

Cheers,
Nico



On Tue, May 4, 2010 at 1:08 PM, Jon Clements <joncle at googlemail.com> wrote:
> On 4 May, 11:10, Nico Schlömer <nico.schloe... at gmail.com> wrote:
>> Hi,
>>
>> I ran into a bit of an unexpected issue here with itertools, and I
>> need to say that I discovered itertools only recently, so maybe my way
>> of approaching the problem is "not what I want to do".
>>
>> Anyway, the problem is the following:
>> I have a list of dictionaries, something like
>>
>> [ { "a": 1, "b": 1, "c": 3 },
>>   { "a": 1, "b": 1, "c": 4 },
>>   ...
>> ]
>>
>> and I'd like to iterate through all items with, e.g., "a":1. What I do
>> is sort and then groupby,
>>
>> my_list.sort( key=operator.itemgetter('a') )
>> my_list_grouped = itertools.groupby( my_list, operator.itemgetter('a') )
>>
>> and then just very simply iterate over my_list_grouped,
>>
>> for my_item in my_list_grouped:
>>     # do something with my_item[0], my_item[1]
>>
>> Now, inside this loop I'd like to again iterate over all items with
>> the same 'b'-value -- no problem, just do the above inside the loop:
>>
>> for my_item in my_list_grouped:
>>         # group by keyword "b"
>>         my_list2 = list( my_item[1] )
>>         my_list2.sort( key=operator.itemgetter('b') )
>>         my_list_grouped = itertools.groupby( my_list2, operator.itemgetter('b') )
>>         for e in my_list_grouped:
>>             # do something with e[0], e[1]
>>
>> That seems to work all right.
>>
>> Now, the problem occurs when this all is wrapped into an outer loop, such as
>>
>> for k in [ 'first pass', 'second pass' ]:
>>     for my_item in my_list_grouped:
>>         # bla, the above
>>
>> To be able to iterate more than once through my_list_grouped, I have
>> to convert it into a list first, so outside all loops, I go like
>>
>> my_list.sort( key=operator.itemgetter('a') )
>> my_list_grouped = itertools.groupby( my_list, operator.itemgetter('a') )
>> my_list_grouped = list( my_list_grouped )
>>
>> This, however, makes it impossible to do the inner sort and
>> groupby-operation; you just get the very first element, and that's it.
>>
>> An example file is attached.
>>
>> Hints, anyone?
>>
>> Cheers,
>> Nico
>
> Does this example help at all?
>
> my_list.sort( key=itemgetter('a','b','c') )
> for a, a_iter in groupby(my_list, itemgetter('a')):
>    print 'New A', a
>    for b, b_iter in groupby(a_iter, itemgetter('b')):
>        print '\t', 'New B', b
>        for c, c_iter in groupby(b_iter, itemgetter('c')):
>            print '\t'*2, 'New C', c
>            for c_data in c_iter:
>                print '\t'*3, a, b, c, c_data
>            print '\t'*2, 'End C', c
>        print '\t', 'End B', b
>    print 'End A', a
>
> Jon.
> --
> http://mail.python.org/mailman/listinfo/python-list
>


