How to group a list of lists by a condition in Python

Gundala Viswanath gundalav at gmail.com
Tue Nov 18 02:37:21 EST 2014


I have the following list of lists that contains 6 entries:

lol = [['a', 3, 1.01], ['x', 5, 1.00], ['k', 7, 2.02],
       ['p', 8, 3.00], ['b', 10, 1.09], ['f', 12, 2.03]]

Each list in lol contains 3 elements:

['a', 3, 1.01]
  e1  e2   e3

The list above is already sorted by e2 (i.e., the 2nd element).

I'd like to 'cluster' the above list following roughly these steps:

1. Pick the lowest entry (wrt. e2) in lol as the key of the first cluster.
2. Assign that entry as the first member of the cluster (a dictionary of lists).
3. Calculate the difference between e3 of the next list and e3 of the
first member of each existing cluster.
4. If the difference is less than the threshold, assign that list as a
member of the corresponding cluster. Else, create a new cluster with
the current list as the new key.
5. Repeat for the remaining lists until finished.

The final result will look like this, with threshold <= 0.1.

dol = {'a':['a','x','b'], 'k':['k','f'], 'p':['p']}
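
The steps above can be sketched roughly as follows. This is only one
reading of them, not a definitive solution: the names
cluster_by_first_member and first_e3 are made up for illustration, the
comparison uses <= thres, and when several clusters are close enough the
earliest-created one wins (the steps do not say what should happen then).

```python
from collections import OrderedDict


def cluster_by_first_member(rows, thres=0.1):
    """Group [e1, e2, e3] rows by how close e3 is to each cluster's first member."""
    clusters = OrderedDict()  # cluster key (e1 of first member) -> member e1 values
    first_e3 = {}             # cluster key -> e3 of that cluster's first member

    # Sort by e2 so the lowest entry seeds the first cluster
    # (this changes nothing if the input is already sorted).
    for e1, e2, e3 in sorted(rows, key=lambda row: row[1]):
        for key in clusters:
            # Assumption: "within threshold" means <= thres, first match wins.
            if abs(e3 - first_e3[key]) <= thres:
                clusters[key].append(e1)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters[e1] = [e1]
            first_e3[e1] = e3
    return clusters


lol = [['a', 3, 1.01], ['x', 5, 1.00], ['k', 7, 2.02],
       ['p', 8, 3.00], ['b', 10, 1.09], ['f', 12, 2.03]]

dol = cluster_by_first_member(lol)
print(dict(dol))
```

For the sample data this groups 'x' and 'b' under 'a', 'f' under 'k',
and leaves 'p' on its own, which matches the expected dol. The second
dictionary stores each cluster's first e3 so that every new row is
compared with the cluster's first member rather than with the row just
before it.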

I'm stuck with the attempt below. What's the right way to do it?

__BEGIN__
import json
from collections import defaultdict

thres = 0.1
tmp_e3 = 0
tmp_e1 = "-"

lol = [['a', 3, 1.01], ['x', 5, 1.00], ['k', 7, 2.02],
       ['p', 8, 3.00], ['b', 10, 1.09], ['f', 12, 2.03]]

dol = defaultdict(list)
for thelist in lol:
    e1, e2, e3 = thelist

    if tmp_e1 == "-":
        tmp_e1 = e1
    else:
        diff = abs(tmp_e3 - e3)
        if diff > thres:
            tmp_e1 = e1

    dol[tmp_e1].append(e1)
    tmp_e1 = e1
    tmp_e3 = e3

print(json.dumps(dol, indent=4))
__END__
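
Tracing the loop above by hand suggests where it goes wrong: each row's
e3 is compared only with the previous row's e3, and tmp_e1 is overwritten
with the current row on every pass, so a row such as 'b' can never rejoin
the 'a' cluster once other rows have come in between. The sketch below
wraps the same logic in a function (posted_attempt is just an illustrative
name) so the result can be inspected.

```python
from collections import defaultdict


def posted_attempt(rows, thres=0.1):
    """Same logic as the loop in the post, wrapped in a function for inspection."""
    dol = defaultdict(list)
    tmp_e1, tmp_e3 = "-", 0
    for e1, e2, e3 in rows:
        if tmp_e1 == "-":
            tmp_e1 = e1              # very first row starts the first cluster
        elif abs(tmp_e3 - e3) > thres:
            tmp_e1 = e1              # too far from the PREVIOUS row: new key
        dol[tmp_e1].append(e1)
        tmp_e1 = e1                  # key is overwritten with the current row
        tmp_e3 = e3                  # only the previous row's e3 is remembered
    return dict(dol)


lol = [['a', 3, 1.01], ['x', 5, 1.00], ['k', 7, 2.02],
       ['p', 8, 3.00], ['b', 10, 1.09], ['f', 12, 2.03]]

result = posted_attempt(lol)
print(result)
```

With the sample data only 'x' ends up grouped with 'a'; 'k', 'p', 'b' and
'f' each get their own key, instead of the three clusters wanted.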

Best,
G.v.


