itertools: problem with nested groupby, list()
Nico Schlömer
nico.schloemer at gmail.com
Tue May 4 07:36:06 EDT 2010
> Does this example help at all?
Thanks, that clarified things a lot!
To make it easier, let's just look at 'a' and 'b':
> my_list.sort( key=itemgetter('a','b','c') )
> for a, a_iter in groupby(my_list, itemgetter('a')):
>     print 'New A', a
>     for b, b_iter in groupby(a_iter, itemgetter('b')):
>         print '\t', 'New B', b
>         for b_data in b_iter:
>             print '\t'*3, a, b, b_data
>         print '\t', 'End B', b
>     print 'End A', a
That works well, and I can wrap the outer loop in another loop without
problems. What's *not* working, though, is having more than one pass
on the inner loop, as in
=============================== *snip* ===============================
my_list.sort( key=itemgetter('a','b','c') )
for a, a_iter in groupby(my_list, itemgetter('a')):
    print 'New A', a
    # 'pass' is a reserved word in Python, so the loop variable is named k
    for k in ['first pass', 'second pass']:
        for b, b_iter in groupby(a_iter, itemgetter('b')):
            print '\t', 'New B', b
            for b_data in b_iter:
                print '\t'*3, a, b, b_data
            print '\t', 'End B', b
    print 'End A', a
=============================== *snap* ===============================
I tried working around this by
=============================== *snip* ===============================
my_list.sort( key=itemgetter('a','b','c') )
for a, a_iter in groupby(my_list, itemgetter('a')):
    print 'New A', a
    inner_list = list( groupby(a_iter, itemgetter('b')) )
    # 'pass' is a reserved word in Python, so the loop variable is named k
    for k in ['first pass', 'second pass']:
        for b, b_iter in inner_list:
            print '\t', 'New B', b
            for b_data in b_iter:
                print '\t'*3, a, b, b_data
            print '\t', 'End B', b
    print 'End A', a
=============================== *snap* ===============================
which doesn't work either, and I don't understand why. -- I'll look at
Uli's comments.
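
A possible explanation, going by the itertools documentation: each group
returned by groupby() shares the underlying iterator with the groupby object
itself, so a sub-iterator is only usable until groupby advances to the next
group. list( groupby(...) ) advances through all groups before any b_iter is
read, which leaves the stored sub-iterators stale. Below is a minimal sketch
of one way around that: it copies each group into a list as soon as it is
produced. It is written in Python 3 syntax; the sample data and the helper
two_pass_rows() are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def two_pass_rows(records):
    """Visit every (a, b) group twice and return what was visited.

    Hypothetical helper for illustration; not part of the original post.
    """
    records = sorted(records, key=itemgetter("a", "b", "c"))
    visited = []
    for a, a_iter in groupby(records, itemgetter("a")):
        # Copy each 'b' group into a list immediately, while its
        # sub-iterator is still the current one.
        inner_list = [(b, list(b_iter))
                      for b, b_iter in groupby(a_iter, itemgetter("b"))]
        for k in ["first pass", "second pass"]:
            for b, b_items in inner_list:
                for b_data in b_items:
                    visited.append((k, a, b, b_data["c"]))
    return visited


# Made-up sample data in the shape described in the thread.
sample = [
    {"a": 1, "b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 4},
    {"a": 1, "b": 2, "c": 5},
    {"a": 2, "b": 1, "c": 6},
]

for row in two_pass_rows(sample):
    print(row)
```

Since inner_list now holds plain lists, it can be traversed as often as
needed, at the cost of keeping each 'a' group in memory.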
Cheers,
Nico
On Tue, May 4, 2010 at 1:08 PM, Jon Clements <joncle at googlemail.com> wrote:
> On 4 May, 11:10, Nico Schlömer <nico.schloe... at gmail.com> wrote:
>> Hi,
>>
>> I ran into a bit of an unexpected issue here with itertools, and I
>> need to say that I discovered itertools only recently, so maybe my way
>> of approaching the problem is "not what I want to do".
>>
>> Anyway, the problem is the following:
>> I have a list of dictionaries, something like
>>
>> [ { "a": 1, "b": 1, "c": 3 },
>> { "a": 1, "b": 1, "c": 4 },
>> ...
>> ]
>>
>> and I'd like to iterate through all items with, e.g., "a":1. What I do
>> is sort and then groupby,
>>
>> my_list.sort( key=operator.itemgetter('a') )
>> my_list_grouped = itertools.groupby( my_list, operator.itemgetter('a') )
>>
>> and then just very simply iterate over my_list_grouped,
>>
>> for my_item in my_list_grouped:
>>     # do something with my_item[0], my_item[1]
>>
>> Now, inside this loop I'd like to again iterate over all items with
>> the same 'b'-value -- no problem, just do the above inside the loop:
>>
>> for my_item in my_list_grouped:
>>     # group by keyword "b"
>>     my_list2 = list( my_item[1] )
>>     my_list2.sort( key=operator.itemgetter('b') )
>>     my_list_grouped = itertools.groupby( my_list2,
>>                                          operator.itemgetter('b') )
>>     for e in my_list_grouped:
>>         # do something with e[0], e[1]
>>
>> That seems to work all right.
>>
>> Now, the problem occurs when this all is wrapped into an outer loop, such as
>>
>> for k in [ 'first pass', 'second pass' ]:
>>     for my_item in my_list_grouped:
>>         # bla, the above
>>
>> To be able to iterate more than once through my_list_grouped, I have
>> to convert it into a list first, so outside all loops, I go like
>>
>> my_list.sort( key=operator.itemgetter('a') )
>> my_list_grouped = itertools.groupby( my_list, operator.itemgetter('a') )
>> my_list_grouped = list( my_list_grouped )
>>
>> This, however, makes it impossible to do the inner sort and
>> groupby-operation; you just get the very first element, and that's it.
>>
>> An example file is attached.
>>
>> Hints, anyone?
>>
>> Cheers,
>> Nico
>
> Does this example help at all?
>
> my_list.sort( key=itemgetter('a','b','c') )
> for a, a_iter in groupby(my_list, itemgetter('a')):
>     print 'New A', a
>     for b, b_iter in groupby(a_iter, itemgetter('b')):
>         print '\t', 'New B', b
>         for c, c_iter in groupby(b_iter, itemgetter('c')):
>             print '\t'*2, 'New C', c
>             for c_data in c_iter:
>                 print '\t'*3, a, b, c, c_data
>             print '\t'*2, 'End C', c
>         print '\t', 'End B', b
>     print 'End A', a
>
> Jon.
> --
> http://mail.python.org/mailman/listinfo/python-list
>