How to populate all possible hierarchical clusterings from a set of elements?

Arnaud Delobelle arnodel at gmail.com
Thu Jan 13 14:54:03 EST 2011


Peter Otten <__peter__ at web.de> writes:

> justin wrote:
>
>> The title sounds too complex, but my question is actually simple.
>> 
>> Suppose I have [1,2,3,4,5], then there are many ways of making
>> clustering.
>> Among them, I want to pair up terminals until there is only one left
>> at the end.
>> For example, ((((1,2),3),4),5), (1,(2,(3,(4,5)))), or (((1,2),(3,4)),
>> 5) would be legitimate ones.
>> 
>> How do you think can I, using the modules of Python such as itertools
>> as much as possible, make all possible such clusterings?
>
> Here's my first attempt:
>
> def cluster(items):
>     if len(items) == 2:
>         yield items
>         return
>
>     for i in range(len(items)-1):
>         for c in cluster(items[:i] + (items[i:i+2],) + items[i+2:]):
>             yield c
>
> def unique(items):
>     seen = set()
>     for item in items:
>         if item not in seen:
>             seen.add(item)
>             yield item
>
> if __name__ == "__main__":
>     for item in unique(cluster(tuple("abcd"))):
>         print item

more simply:

def clusters(l):
    if len(l) == 1:
        yield l[0]
        return
    for i in range(1, len(l)):
        for left in clusters(l[:i]):
            for right in clusters(l[i:]):
                yield (left, right)

That would give all solutions without duplicates.  In fact, this is
simply finding all full binary trees which order l.

However, somewhere else in the thread someone claims that no ordering is
implied on the initial list of items (which is not at all obvious from
the OP).

-- 
Arnaud



More information about the Python-list mailing list