Combinations of lists

Oscar Benjamin oscar.j.benjamin at gmail.com
Wed Oct 3 15:20:40 EDT 2012


On 3 October 2012 15:26, Steen Lysgaard <boxeakasteen at gmail.com> wrote:
> Hi,
>
> I am looking for a clever way to compute all combinations of two lists. Look
> at this example:
>
> h = ['A','A','B','B']
> m = ['a','b']
>
> the resulting combinations should be of the same length as h and each
> element in m can be used twice. The sought after result using h and m from
> above is:
>
> [['aA', 'aA', 'bB', 'bB'],
>  ['aA', 'aB', 'bA', 'bB'],
>  ['aB', 'aB', 'bA', 'bA']]
>
> (the order of the results does not matter i.e. ['aA', 'aA', 'bB', 'bB'] and
> ['aA', 'bB', 'aA', 'bB'] are considered the same)
>
> This is achieved by the code below, this however needs to go through all
> possible combinations (faculty of len(h)) and rule out duplicates as they
> occur and this is too much if for example len(h) is 16.

I'm assuming that len(m) is always 2. Then if len(m) is 16 each
element of h can be used 8 times. If this is not as you intended you
will need to clarify how this problem generalises to other cases.

The elements that go with the 'b's are implicitly determined once you
have chosen the elements that go with the 'a's. The problem then is
solved if you choose the elements that go with the 'a's. If we need to
choose say k elements to go with the 'a's the basic problem becomes:
"enumerate over all multisets of size k that are subsets of the
multiset h."

'''
def submultisets(multiset, subsetsize, stack=None):
    # Enter recursion
    if stack is None:
        multiset = dict((c, multiset.count(c)) for c in set(multiset))
        stack = []

    c = next(iter(multiset))

    # End recursion
    if len(multiset) == 1:
        missing = subsetsize - len(stack)
        if multiset[c] >= missing:
            yield stack + missing  * [c]
        return

    # Continue recursion
    count = multiset.pop(c)
    for n in range(count + 1):
        stack.extend(n * c)
        for result in submultisets(multiset, subsetsize, stack):
            yield result
        del stack[-n:]
    multiset[c] = count

def uniquecombinations(h, m):
    for ha in submultisets(h, len(h)//2):
        hb = list(h)
        for c in ha:
            hb.remove(c)
        yield [m[0] + a for a in ha] + [m[1] + b for b in hb]

h = ['A', 'A', 'B', 'B']
m = ['a', 'b']

for x in uniquecombinations(h, m):
    print(x)
'''

Output:
['aB', 'aB', 'bA', 'bA']
['aA', 'aB', 'bA', 'bB']
['aA', 'aA', 'bB', 'bB']


Oscar



More information about the Python-list mailing list