how to group by function if one of the group has relationship with another one in the group?

Tue Jul 25 04:59:43 EDT 2017

Ho Yeung Lee wrote:

> from itertools import groupby
> 
> testing1 = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]
> def isneighborlocation(lo1, lo2):
>     if abs(lo1[0] - lo2[0]) == 1  or lo1[1] == lo2[1]:
>         return 1
>     elif abs(lo1[1] - lo2[1]) == 1  or lo1[0] == lo2[0]:
>         return 1
>     else:
>         return 0
> 
> groupda = groupby(testing1, isneighborlocation)
> for key, group1 in groupda:
>     print key
>     for thing in group1:
>         print thing
> 
> expected output: 3 groups
> group1 [(1,1)]
> group2 [(2,3),(2,4)]
> group3 [(3,5),(3,6),(4,6)]

groupby() calculates the key value from the current item only, so there's no 
"natural" way to apply it to your problem.

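To make that concrete, here is a small sketch in Python 3 syntax. The key function first_coordinate is made up for this illustration and is not part of your code. groupby() hands it one item per call, so it has no way to notice that (3, 6) and (4, 6) are neighbours:

```python
from itertools import groupby

items = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]

def first_coordinate(item):
    # Hypothetical key for illustration: it only ever sees a single item.
    return item[0]

# Consecutive items with an equal key end up in the same group.
groups = [(key, list(group)) for key, group in groupby(items, key=first_coordinate)]

for key, group in groups:
    print(key, group)
```

With this key (4, 6) lands in a group of its own, because its first coordinate differs from that of (3, 6).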
Possible workarounds are to feed it pairs of neighbouring items (think 
zip()) or a stateful key function. Below is an example of the latter:

$ cat sequential_group_class.py
from itertools import groupby

missing = object()

class PairKey:
    def __init__(self, continued):
        self.prev = missing
        self.continued = continued
        self.key = False

    def __call__(self, item):
        if self.prev is not missing and not self.continued(self.prev, item):
            self.key = not self.key
        self.prev = item
        return self.key

def isneighborlocation(lo1, lo2):
    x1, y1 = lo1
    x2, y2 = lo2
    dx = x1 - x2
    dy = y1 - y2
    return dx*dx + dy*dy <= 1

items = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]

for key, group in groupby(items, key=PairKey(isneighborlocation)):
    print key, list(group)

$ python sequential_group_class.py 
False [(1, 1)]
True [(2, 3), (2, 4)]
False [(3, 5), (3, 6), (4, 6)]
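The other workaround, pairing neighbouring items with zip(), might look roughly like the sketch below. It is one possible reading of that idea, not the only one. It assumes Python 3 (the script above uses the Python 2 print statement), and the name group_by_pairs is made up for this illustration. The pairs from zip() decide where a new group starts, a running total turns those flags into group numbers, and groupby() then groups on the number:

```python
from itertools import accumulate, groupby

def isneighborlocation(lo1, lo2):
    # Same test as in the script above: squared distance of at most 1.
    x1, y1 = lo1
    x2, y2 = lo2
    dx = x1 - x2
    dy = y1 - y2
    return dx*dx + dy*dy <= 1

def group_by_pairs(items, continued):
    # Hypothetical helper: yields lists of consecutive items, starting a
    # new list whenever continued(previous, current) is false.
    items = list(items)
    # 1 where a new group starts, 0 where the previous group continues.
    starts = [0] + [int(not continued(a, b)) for a, b in zip(items, items[1:])]
    # The running total of the start flags gives every item a group number.
    numbers = accumulate(starts)
    for _, pairs in groupby(zip(numbers, items), key=lambda pair: pair[0]):
        yield [item for _, item in pairs]

items = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]

for group in group_by_pairs(items, isneighborlocation):
    print(group)
```

Unlike the stateful key, this version needs the whole input in a list before it can start, so it is not suitable for arbitrary iterators that should be consumed lazily.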