Sequence splitting

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Jul 3 05:50:09 EDT 2009


On Fri, 03 Jul 2009 01:39:27 -0700, Paul Rubin wrote:

> Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> writes:
>> groupby() works on lists.
> 
> >>> a = [1,3,4,6,7]
> >>> from itertools import groupby
> >>> b = groupby(a, lambda x: x%2==1)  # split into even and odd
> >>> c = list(b)
> >>> print len(c)
> 3
> >>> d = list(c[1][1])    # should be [4,6]
> >>> print d  # oops.
> []

I didn't say it worked properly *wink*

Seriously, this behaviour caught me out too. The problem isn't that the 
input data is a list; the same problem occurs for arbitrary iterators. 
From the docs:

[quote]
The operation of groupby() is similar to the uniq filter in Unix. It 
generates a break or new group every time the value of the key function 
changes (which is why it is usually necessary to have sorted the data 
using the same key function). That behavior differs from SQL’s GROUP BY 
which aggregates common elements regardless of their input order.

The returned group is itself an iterator that shares the underlying 
iterable with groupby(). Because the source is shared, when the groupby() 
object is advanced, the previous group is no longer visible. So, if that 
data is needed later, it should be stored as a list
[end quote]

http://www.python.org/doc/2.6/library/itertools.html#itertools.groupby
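
As a minimal sketch of the workaround the docs describe, each group can be 
copied into a list while it is still the current group, before groupby() 
moves on. The names is_odd, groups and split are only illustrative and do 
not come from the thread; the second half assumes that a full odd/even 
split is what was wanted, which needs the data sorted by the same key 
function first.

```python
from itertools import groupby

a = [1, 3, 4, 6, 7]

def is_odd(x):
    return x % 2 == 1

# Copy each group into a list while it is still the current group,
# i.e. before groupby() is advanced to the next one.
groups = [(key, list(group)) for key, group in groupby(a, is_odd)]
print(groups)
# [(True, [1, 3]), (False, [4, 6]), (True, [7])]

# groupby() only merges *consecutive* items with the same key, so a
# full odd/even split needs the input sorted by that key first.
split = dict(
    (key, list(group))
    for key, group in groupby(sorted(a, key=is_odd), is_odd)
)
print(split)
# {False: [4, 6], True: [1, 3, 7]}
```

With the groups stored this way, groups[1][1] gives the [4, 6] that the 
quoted session was expecting.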




-- 
Steven


