Clustering the keys of a dict according to its values

Florian Brucker torf at torfbold.com
Fri Nov 14 08:16:11 EST 2008


Hi everybody!

Given a dictionary, I want to create a clustered version of it, 
collecting keys that have the same value:

 >>> d = {'a':1, 'b':2, 'c':1, 'd':1, 'e':2, 'f':3}
 >>> cluster(d)
{1:['a', 'c', 'd'], 2:['b', 'e'], 3:['f']}

That is, generate a new dict which holds for each value of the old dict 
a list of the keys of the old dict that have that very value.

Another requirement is that it should also work on lists, in that case 
with indices instead of keys. We may assume that all values in the 
original dict/list can be used as dict keys.

Right now I'm doing it like this:

   def cluster(d):
     try:
       # is d a dict?
       values = unique(d.values())
       keys = d.keys()
     except AttributeError:
       # assume d is a list
       values = unique(d)
       keys = range(len(d))

     clusters = {}
     for value in values:
       clusters[value] = filter(lambda v: d[v] == value, keys)

     return clusters

where unique() is from O'Reilly's Python Cookbook and returns a list 
containing each item of the given list exactly once.

Now I'm pretty new to Python and chances are rather high that this could 
be done prettier and/or more efficient. Therefore any comment on the 
above code is greatly appreciated.


Regards,
Florian



More information about the Python-list mailing list