Combinations of lists

Joshua Landau joshua.landau.ws at gmail.com
Sat Oct 6 11:31:16 EDT 2012


On 4 October 2012 16:12, Steen Lysgaard <boxeakasteen at gmail.com> wrote:

> 2012/10/4 Joshua Landau <joshua.landau.ws at gmail.com>:
> > On 3 October 2012 21:15, Steen Lysgaard <boxeakasteen at gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> thanks for your interest. Sorry for not being completely clear, yes
> >> the length of m will always be half of the length of h.
> >
> >
> > (Please don't top post)
> >
> > I have a solution to this, then.
> > It's not short or fast, but it's a lot faster than yours.
>

<snip>

 > This is quite naive, because I don't know how to properly implement
> > force_unique_combinations, but it runs. I hope this is right. If you need
> > significantly more speed your best chance is probably Cython or C,
> although
> > I don't doubt 10x more speed may well be possible from within Python.
> >
> >
> > Also, 88888 Dihedral is a bot, or at least pretending like crazy to be
> one.
>
> Great, I've now got a solution much faster than what I could come up with.
> Thanks to the both of you.
>

Don't use it though :P. I've something better, now I've used a few
sanity-points up [it's much more interesting to solve *other* people's
problems].

Please note that my implementations (old and new) return duplicates when
the second list contains duplicates. It's fixable, but I'll only bother if
you need it fixed.

It runs in a very consistent 55% of the time, but it is longer. Here y'are.

"""Super algorithm."""
>
> from itertools import combinations
> from collections import Counter
>
> def multiples(counter):
> """Counter -> set.
>
> Returns the set of values that occur multiple times.
>  """
> multiples = set()
>
> for item, number in counter.items():
>  if number > 1:
> multiples.add(item)
>
> return multiples
>
> #@profile
> def pairwise_combinations(counter, countermultiples, counterset, length,
> charmap):
> # length is a LIE!
> """Get the permutations of two lists.
>
> Do not call this directly unless you want to hassle yourself.
> Use the wrapper provided, list_permute, instead.
>  """
>
> # We will do the combinations in pairs, as each pair will not have order
> and so
>  # [1, 2, 3, 4] is equivilant to [2, 1, 4, 3] but not [1, 3, 2, 4].
>
> # This means that we get the full permutations without ever filtering.
>
> # Each pair along is a combination.
> # We are combination-ing a set to prevent dulicates.
>  # As the combinations are of length 2, the only ones this will
> # miss are of the type [x, x] (two of the same).
>  # This is accounted for by countermultiples.
> pairs = combinations(counterset, 2)
>
> # Prepend with this
>  length -= 1
> prefix_char = charmap[length]
>
> # There's not reason to recurse, so don't bother with a lot of stuff
>  if not length:
> for p_a, p_b in pairs:
> yield [prefix_char+p_a, prefix_char+p_b]
>  for p in countermultiples:
> yield [prefix_char+p, prefix_char+p]
> return
>
> for p_a, p_b in pairs:
> # Edit scope
> # The recursion wont be able to use items we've already used
>  counter[p_a] -= 1
> counter_p_a = counter[p_a] # Quickref
> if   counter_p_a == 0: counterset.remove(p_a) # None left
>  elif counter_p_a == 1: countermultiples.remove(p_a) # Not plural
>
> counter[p_b] -= 1
> counter_p_b = counter[p_b] # Quickref
>  if   counter_p_b == 0: counterset.remove(p_b) # None left
> elif counter_p_b == 1: countermultiples.remove(p_b) # Not plural
>
> # Recurse
> # Do the same, but for the next pair along
> own_yield = [prefix_char+p_a, prefix_char+p_b]
>  for delegated_yield in pairwise_combinations(counter, countermultiples,
> counterset, length, charmap):
> yield own_yield + delegated_yield
>
> # Reset scope
> counter[p_a] += 1
> if   counter_p_a == 0: counterset.add(p_a)
>  elif counter_p_a == 1: countermultiples.add(p_a)
>
> counter[p_b] += 1
> if   counter_p_b == 0: counterset.add(p_b)
>  elif counter_p_b == 1: countermultiples.add(p_b)
>
>
> # Now do the same for countermultiples
>  # This is not itertools.chain'd because this way
> # is faster and I get to micro-optomize inside
>  for p in countermultiples:
> # Edit scope
> # The recursion wont be able to use items we've already used
>  counter[p] -= 2
> counter_p = counter[p] # Quickref
>
> if   counter_p == 0:
>  counterset.remove(p) # None left
> countermultiples.remove(p) # Must have been in countermultiples, none left
>
> elif counter_p == 1:
> countermultiples.remove(p) # Not plural
>
> # Recurse
>  # Do the same, but for the next pair along
> own_yield = [prefix_char+p, prefix_char+p]
> for delegated_yield in pairwise_combinations(counter, countermultiples,
> counterset, length, charmap):
>  yield own_yield + delegated_yield
>
> # Reset scope
> counter[p] += 2
>
> if   counter_p == 0:
> counterset.add(p)
> countermultiples.add(p)
>
> elif counter_p == 1:
> countermultiples.add(p)
>
> def list_permute(first, second):
> """Get the permutations of two lists as according to what you want, which
> isn't really the
>  permutations of two lists but something close to it. It does what it
> needs to, I think.
>
> This DOES NOT work when <second> contains duplicates, as the result has
> duplicates. The other
>  of mine does not work either. If this is a problem, it should be
> fixable: sort <second>
> and when you encounter the duplicates generate in groups bigger than 2.
> You cannot do as above
>  for pairs, so an intermittent filtering method will work best. I won't
> implement this if it's
> unneeded, though.
>
> This runs in 55% of the time of my previous one, with over twice the
> number of lines.
> W007! Lines!
>  """
> count = Counter(first)
> return pairwise_combinations(count, multiples(count), set(count),
> len(first)//2, list(reversed(second)))
>
> # TEST #
>
> second = "abcde"
> first = sorted((second+second).upper())
>
> n = 0
> for _ in list_permute(first, second): n += 1
> print(n)


I release this under whatever permissive licence you want.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20121006/7023af13/attachment.html>


More information about the Python-list mailing list