how to sort a list of tuples with custom function

Wed Aug 2 03:05:56 EDT 2017

Glenn Linderman wrote:

> On 8/1/2017 2:10 PM, Piet van Oostrum wrote:
>> Ho Yeung Lee <jobmattcon at gmail.com> writes:
>>
>>> def isneighborlocation(lo1, lo2):
>>>      if abs(lo1[0] - lo2[0]) < 7  and abs(lo1[1] - lo2[1]) < 7:
>>>          return 1
>>>      elif abs(lo1[0] - lo2[0]) == 1  and lo1[1] == lo2[1]:
>>>          return 1
>>>      elif abs(lo1[1] - lo2[1]) == 1  and lo1[0] == lo2[0]:
>>>          return 1
>>>      else:
>>>          return 0
>>>
>>>
>>> sorted(testing1, key=lambda x: (isneighborlocation.get(x[0]), x[1]))
>>>
>>> return something like
>>> [(1,2),(3,3),(2,5)]

>> I think you are trying to sort a list of two-dimensional points into a
>> one-dimensional list in such a way that points that are close together
>> in the two-dimensional sense will also be close together in the
>> one-dimensional list. But that is impossible.

> It's not impossible, it just requires an appropriate distance function
> used in the sort.

That's a grossly misleading addition. 

Once you have an appropriate clustering algorithm

clusters = split_into_clusters(items) # needs access to all items

you can devise a key function

def get_cluster(item, clusters=split_into_clusters(items)):
    return next(
        index for index, cluster in enumerate(clusters) if item in cluster
    )

such that

grouped_items = sorted(items, key=get_cluster)

but that's a roundabout way to write

grouped_items = sum(split_into_clusters(items), [])

In other words: sorting is useless here. What you really need is a suitable
approach to split the data into groups.
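
To make the point concrete, here is a minimal sketch of what
split_into_clusters() might look like. It is an assumption, not something
from the original question: it treats two points as neighbours when both
coordinates differ by less than 7 (the first test in isneighborlocation();
the two elif branches there are already covered by it) and collects the
connected groups. The names is_neighbor and limit are made up for the
example.

```python
from collections import deque


def is_neighbor(p, q, limit=7):
    # Assumption: same rule as the first branch of isneighborlocation().
    return abs(p[0] - q[0]) < limit and abs(p[1] - q[1]) < limit


def split_into_clusters(items):
    """Group items into connected components of the neighbour relation."""
    remaining = list(items)
    clusters = []
    while remaining:
        # Start a new cluster from the first unassigned item.
        cluster = [remaining.pop(0)]
        queue = deque(cluster)
        while queue:
            current = queue.popleft()
            # Pull every still-unassigned neighbour into the cluster.
            near = [p for p in remaining if is_neighbor(current, p)]
            for p in near:
                remaining.remove(p)
            cluster.extend(near)
            queue.extend(near)
        clusters.append(cluster)
    return clusters


items = [(1, 2), (40, 40), (3, 3), (42, 45), (2, 5), (100, 1)]
clusters = split_into_clusters(items)


def get_cluster(item, clusters=clusters):
    # Index of the cluster that contains the item.
    return next(
        index for index, cluster in enumerate(clusters) if item in cluster
    )


grouped_by_sort = sorted(items, key=get_cluster)
grouped_by_concat = sum(clusters, [])
```

For this data both variants give the same list, with (1, 2), (3, 3), (2, 5)
first. In general only the grouping is guaranteed to match; the order
within a cluster may differ, because sorted() keeps the input order while
the concatenation keeps the order in which the clustering found the points.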

One well-known algorithm is k-means clustering:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.vq.kmeans.html

Here is an example with pictures:

https://dzone.com/articles/k-means-clustering-scipy
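
To show the idea without any dependencies, here is a rough pure-Python
sketch of k-means (Lloyd's algorithm). It is not the SciPy implementation
linked above, which is what you would use in practice. The function names
and the choice of the first k points as initial centroids are assumptions
made for the example; that initialisation is naive and can give poor
results on other data.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Return k clusters (lists of points) found by Lloyd's algorithm."""
    # Assumption: the first k points serve as initial centroids.
    centroids = [tuple(float(c) for c in p) for p in points[:k]]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k), key=lambda i: squared_distance(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return clusters


items = [(1, 2), (40, 40), (3, 3), (42, 45), (2, 5)]
clusters = kmeans(items, k=2)
grouped_items = sum(clusters, [])
```

Note that k-means needs the number of clusters up front, unlike the
neighbour-based test in the original question.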