[Python-ideas] Fwd: Extremely weird itertools.permutations

Sun Oct 13 21:02:56 CEST 2013

[MRAB, posts a beautiful solution]

I don't really have a use for this, but it was a lovely programming
puzzle, so I'll include an elaborate elaboration of MRAB's algorithm
below.  And that's the end of my interest in this ;-)

It doesn't require that elements be orderable or even hashable.  It
does require that they can be compared for equality, but it's pretty
clear that if we _do_ include something like this, "equality" has to
be pluggable.  By default, this uses `operator.__eq__`, but any
2-argument function can be used.  E.g., use `operator.is_` to make it
believe that only identical objects are equal.  Or pass a lambda to
distinguish by type too (e.g., if you don't want 3 and 3.0 to be
considered equal).  Etc.

The code is much lower-level, to make it closer to an efficient C
implementation.  No dicts, no sets, no slicing or concatenation of
lists, etc.  It sticks to using little integers (indices) as far as
possible, which can be native in C (avoiding mounds of increfs and
decrefs).

Also, because "equality" is pluggable, it may be a slow operation.
The `equal()` function is only called here during initial setup, to
partition the elements into equivalence classes.  Where N is
len(iterables), at best `equal()` is called N-1 times (if all elements
happen to be equal), and at worst N*(N-1)/2 times (if no elements
happen to be equal), all independent of `count`.  It assumes `equal()`
is transitive.

It doesn't always return permutations in the same order as MRAB's
function, because - to avoid any searching - it iterates over
equivalence classes instead of over the original iterables.  This is
the simplest differing example I can think of:

>>> list(unique_permutations("aba", 2))
[('a', 'b'), ('a', 'a'), ('b', 'a')]

For the first result, MRAB's function first picks the first 'a', then
removes it from the iterables and recurses on ("ba", 1).  So it finds
'b' next, and yields ('a', 'b') (note:  this is the modified
unique_permutations() below - MRAB's original actually yielded lists,
not tuples).

But:

>>> list(up("aba", 2))
[('a', 'a'), ('a', 'b'), ('b', 'a')]

Different order!  That's because "up" is iterating over (conceptually)

    [EquivClass(first 'a', second 'a'), EquivClass('b')]

It first picks the first `a`, then adjusts list pointers (always a
fast, constant-time operation) so that it recurses on

    [EquivClass(second 'a'), EquivClass('b')]

So it next finds the second 'a', and yields (first 'a', second 'a') as
its first result.  Maybe this will make it clearer:

>>> list(up(["a1", "b", "a2"], 2, lambda x, y: x[0]==y[0]))
[('a1', 'a2'), ('a1', 'b'), ('b', 'a1')]

No, I guess that didn't make it clearer - LOL ;-)  Do I care?  No.

Anyway, here's the code.  Have fun :-)

# MRAB's beautiful solution, modified in two ways to be
# more like itertools.permutations:
# 1. Yield tuples instead of lists.
# 2. When count > len(iterable), don't yield anything.

def unique_permutations(iterable, count=None):
    def perm(items, count):
        if count:
            seen = set()
            for i, item in enumerate(items):
                if item not in seen:
                    for p in perm(items[:i] + items[i+1:], count - 1):
                        yield [item] + p
                    seen.add(item)
        else:
            yield []

    items = list(iterable)
    if count is None:
        count = len(items)
    if count > len(items):
        return
    for p in perm(items, count):
        yield tuple(p)

# New code, ending in generator `up()`.
import operator

# In C, this would be a struct of native C types,
# and the brief methods would be coded inline.
class ENode:
    def __init__(self, initial_index=None):
        self.indices = [initial_index] # list of equivalent indices
        self.current = 0
        self.prev = self.next = self

    def index(self):
        "Return current index."
        return self.indices[self.current]

    def unlink(self):
        "Remove self from list."
        self.prev.next = self.next
        self.next.prev = self.prev

    def insert_after(self, x):
        "Insert node x after self."
        x.prev = self
        x.next = self.next
        self.next.prev = x
        self.next = x

    def advance(self):
        """Advance the current index.

        If we're already at the end, remove self from list.

        .restore() undoes everything .advance() did."""

        assert self.current < len(self.indices)
        self.current += 1
        if self.current == len(self.indices):
            self.unlink()

    def restore(self):
        "Undo what .advance() did."
        assert self.current <= len(self.indices)
        if self.current == len(self.indices):
            self.prev.insert_after(self)
        self.current -= 1

def build_equivalence_classes(items, equal):
    ehead = ENode() # headed, doubly-linked circular list of equiv classes
    for i, elt in enumerate(items):
        e = ehead.next
        while e is not ehead:
            if equal(elt, items[e.indices[0]]):
                # Add (index of) elt to this equivalence class.
                e.indices.append(i)
                break
            e = e.next
        else:
            # elt not equal to anything seen so far:  append
            # new equivalence class.
            e = ENode(i)
            ehead.prev.insert_after(e)
    return ehead

def up(iterable, count=None, equal=operator.__eq__):
    def perm(i):
        if i:
            e = ehead.next
            assert e is not ehead
            while e is not ehead:
                result[count - i] = e.index()
                e.advance()
                yield from perm(i-1)
                e.restore()
                e = e.next
        else:
            yield tuple(items[j] for j in result)

    items = tuple(iterable)
    if count is None:
        count = len(items)
    if count > len(items):
        return

    ehead = build_equivalence_classes(items, equal)
    result = [None] * count
    yield from perm(count)