Combinations of lists

Steen Lysgaard boxeakasteen at gmail.com
Wed Oct 3 16:15:05 EDT 2012


Hi,

thanks for your interest. Sorry for not being completely clear, yes
the length of m will always be half of the length of h.

/Steen

2012/10/3 Joshua Landau <joshua.landau.ws at gmail.com>:
> On 3 October 2012 20:20, Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
>>
>> On 3 October 2012 15:26, Steen Lysgaard <boxeakasteen at gmail.com> wrote:
>> > Hi,
>> >
>> > I am looking for a clever way to compute all combinations of two lists.
>> > Look
>> > at this example:
>> >
>> > h = ['A','A','B','B']
>> > m = ['a','b']
>> >
>> > the resulting combinations should be of the same length as h and each
>> > element in m can be used twice. The sought after result using h and m
>> > from
>> > above is:
>> >
>> > [['aA', 'aA', 'bB', 'bB'],
>> >  ['aA', 'aB', 'bA', 'bB'],
>> >  ['aB', 'aB', 'bA', 'bA']]
>> >
>> > (the order of the results does not matter i.e. ['aA', 'aA', 'bB', 'bB']
>> > and
>> > ['aA', 'bB', 'aA', 'bB'] are considered the same)
>> >
>> > This is achieved by the code below, this however needs to go through all
>> > possible combinations (faculty of len(h)) and rule out duplicates as
>> > they
>> > occur and this is too much if for example len(h) is 16.
>>
>> I'm assuming that len(m) is always 2. Then if len(m) is 16 each
>> element of h can be used 8 times. If this is not as you intended you
>> will need to clarify how this problem generalises to other cases.
>
>
> His code only works when len(m) is half of len(h), so this may not be the
> right assumption.
>
>>
>> The elements that go with the 'b's are implicitly determined once you
>> have chosen the elements that go with the 'a's. The problem then is
>> solved if you choose the elements that go with the 'a's. If we need to
>> choose say k elements to go with the 'a's the basic problem becomes:
>> "enumerate over all multisets of size k that are subsets of the
>> multiset h."
>>
>> '''
>> def submultisets(multiset, subsetsize, stack=None):
>>     # Enter recursion
>>     if stack is None:
>>         multiset = dict((c, multiset.count(c)) for c in set(multiset))
>>         stack = []
>>
>>     c = next(iter(multiset))
>>
>>     # End recursion
>>     if len(multiset) == 1:
>>         missing = subsetsize - len(stack)
>>         if multiset[c] >= missing:
>>             yield stack + missing  * [c]
>>         return
>>
>>     # Continue recursion
>>     count = multiset.pop(c)
>>     for n in range(count + 1):
>>         stack.extend(n * c)
>>         for result in submultisets(multiset, subsetsize, stack):
>>             yield result
>>         del stack[-n:]
>>     multiset[c] = count
>>
>> def uniquecombinations(h, m):
>>     for ha in submultisets(h, len(h)//2):
>>         hb = list(h)
>>         for c in ha:
>>             hb.remove(c)
>>         yield [m[0] + a for a in ha] + [m[1] + b for b in hb]
>>
>> h = ['A', 'A', 'B', 'B']
>> m = ['a', 'b']
>>
>> for x in uniquecombinations(h, m):
>>     print(x)
>> '''
>>
>> Output:
>> ['aB', 'aB', 'bA', 'bA']
>> ['aA', 'aB', 'bA', 'bB']
>> ['aA', 'aA', 'bB', 'bB']



More information about the Python-list mailing list