How to group by a function when items in a group are related to one another?

Sun Jul 30 15:24:29 EDT 2017

Which function should be used for this problem?

On Saturday, July 29, 2017 at 11:02:30 PM UTC+8, Piet van Oostrum wrote:
> Peter Otten <__peter__ at web.de> writes:
> 
> > Ho Yeung Lee wrote:
> >
> >> from itertools import groupby
> >> 
> >> testing1 = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]
> >> def isneighborlocation(lo1, lo2):
> >>     if abs(lo1[0] - lo2[0]) == 1  or lo1[1] == lo2[1]:
> >>         return 1
> >>     elif abs(lo1[1] - lo2[1]) == 1  or lo1[0] == lo2[0]:
> >>         return 1
> >>     else:
> >>         return 0
> >> 
> >> groupda = groupby(testing1, isneighborlocation)
> >> for key, group1 in groupda:
> >>     print key
> >>     for thing in group1:
> >>         print thing
> >> 
> >> expected output: 3 groups
> >> group1 [(1,1)]
> >> group2 [(2,3),(2,4)]
> >> group3 [(3,5),(3,6),(4,6)]
> >
> > groupby() calculates the key value from the current item only, so there's no 
> > "natural" way to apply it to your problem.
> >
> > Possible workarounds are to feed it pairs of neighbouring items (think 
> > zip()) or a stateful key function. Below is an example of the latter:
> >
> > $ cat sequential_group_class.py
> > from itertools import groupby
> >
> > missing = object()
> >
> > class PairKey:
> >     def __init__(self, continued):
> >         self.prev = missing
> >         self.continued = continued
> >         self.key = False
> >
> >     def __call__(self, item):
> >         if self.prev is not missing and not self.continued(self.prev, item):
> >             self.key = not self.key
> >         self.prev = item
> >         return self.key
> >
> > def isneighborlocation(lo1, lo2):
> >     x1, y1 = lo1
> >     x2, y2 = lo2
> >     dx = x1 - x2
> >     dy = y1 - y2
> >     return dx*dx + dy*dy <= 1
> >
> > items = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]
> >
> > for key, group in groupby(items, key=PairKey(isneighborlocation)):
> >     print key, list(group)
> >
> > $ python sequential_group_class.py 
> > False [(1, 1)]
> > True [(2, 3), (2, 4)]
> > False [(3, 5), (3, 6), (4, 6)]
> 
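The other workaround mentioned above, feeding in pairs of neighbouring items via zip(), might look roughly like the sketch below. It is written for Python 3, and the helper name split_on_breaks is made up for illustration; it is not part of itertools. Like the stateful key function, it only ever compares adjacent items, so it relies on the order of the input list.

```python
def isneighborlocation(lo1, lo2):
    # Same distance test as in the quoted example: squared distance <= 1.
    (x1, y1), (x2, y2) = lo1, lo2
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 <= 1


def split_on_breaks(items, continued):
    """Hypothetical helper: yield lists of consecutive items, starting a
    new list whenever continued(prev, item) is false for an adjacent pair."""
    items = list(items)
    if not items:
        return
    group = [items[0]]
    # zip(items, items[1:]) walks over each pair of neighbouring items.
    for prev, item in zip(items, items[1:]):
        if continued(prev, item):
            group.append(item)
        else:
            yield group
            group = [item]
    yield group


items = [(1, 1), (2, 3), (2, 4), (3, 5), (3, 6), (4, 6)]
groups = list(split_on_breaks(items, isneighborlocation))
for group in groups:
    print(group)
```

For the sample list this should give the same three groups as the class-based version, without the alternating True/False keys.
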
> That only works if
> (a) The elements in the list are already clustered by group (i.e. all
> elements of a group are adjacent)
> (b) In a group the order is such that adjacent elements are direct
> neighbours, i.e. their distance is at most 1.
> 
> So 'groupby' is not a natural solution for this problem.
> -- 
> Piet van Oostrum <piet-l at vanoostrum.org>
> WWW: http://piet.vanoostrum.org/
> PGP key: [8DAE142BE17999C4]
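
If the list cannot be assumed to satisfy (a) and (b), one possible approach is to treat the problem as finding connected components: two items belong to the same group if a chain of neighbours links them. Below is a minimal sketch in Python 3 using a small union-find. The name connected_groups is made up for illustration, the neighbour test is the distance-based one from the quoted example, and every pair of items is compared, so the sketch is only meant for short lists.

```python
def isneighborlocation(lo1, lo2):
    # Same distance test as in the quoted example: squared distance <= 1.
    (x1, y1), (x2, y2) = lo1, lo2
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 <= 1


def connected_groups(items, related):
    """Hypothetical helper: group items so that any two items linked by a
    chain of related(a, b) pairs end up in the same list."""
    items = list(items)
    parent = list(range(len(items)))  # union-find forest over item indices

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair once and merge the sets of related items.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if related(items[i], items[j]):
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smaller index as root, so each root is the
                    # first item of its group in input order.
                    parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i, item in enumerate(items):
        groups.setdefault(find(i), []).append(item)
    # Groups are returned in order of their first item in the input.
    return [groups[root] for root in sorted(groups)]


items = [(1, 1), (2, 3), (2, 4), (3, 5), (3, 6), (4, 6)]
shuffled = [(3, 6), (1, 1), (2, 4), (4, 6), (2, 3), (3, 5)]
print(connected_groups(items, isneighborlocation))
print(connected_groups(shuffled, isneighborlocation))
```

The second call shows that the same three groups come out when the input is not pre-sorted, although the order of items inside each group follows the input order.
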