How to populate all possible hierarchical clusterings from a set of elements?

Peter Otten __peter__ at web.de
Wed Jan 12 15:43:22 EST 2011


justin wrote:

> The title sounds too complex, but my question is actually simple.
> 
> Suppose I have [1,2,3,4,5], then there are many ways of making
> clustering.
> Among them, I want to pair up terminals until there is only one left
> at the end.
> For example, ((((1,2),3),4),5), (1,(2,(3,(4,5)))), or (((1,2),(3,4)),
> 5) would be legitimate ones.
> 
> How do you think can I, using the modules of Python such as itertools
> as much as possible, make all possible such clusterings?

Here's my first attempt:

def cluster(items):
    if len(items) == 2:
        yield items
        return

    for i in range(len(items)-1):
        for c in cluster(items[:i] + (items[i:i+2],) + items[i+2:]):
            yield c

def unique(items):
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item

if __name__ == "__main__":
    for item in unique(cluster(tuple("abcd"))):
        print item

Unfortunately I get a lot of duplicates :(

You could define a kind of operator precedence using 
itertools.combinations(), but I think it suffers from the same problem as

a 3 b 1 c 2 d

and

a 2 b 1 c 3 d

would both result in ((a, b), (c, d)).






More information about the Python-list mailing list