[Python-ideas] Extremely weird itertools.permutations

Sat Oct 12 09:35:31 CEST 2013

On Fri, Oct 11, 2013 at 10:37:02PM -0400, Neil Girdhar wrote:
> I think it's pretty indisputable that permutations are formally defined
> this way (and I challenge you to find a source that doesn't agree with
> that).

If by "this way" you mean "unique permutations only", then yes it 
*completely* disputable, and I am doing so right now. I'm not arguing 
one way or the other for a separate "unique_permutations" generator, 
just that the existing permutations generator does the right thing. If 
you're satisfied with that answer, you can stop reading now, because the 
rest of my post is going to be rather long:

TL;DR:

If you want a unique_permutations generator, that's a reasonable 
request. If you insist on changing permutations, that's unreasonable, 
firstly because the current behaviour is correct, and secondly because 
backwards compatibility would constrain it to keep the existing 
behaviour even if it were wrong.

.
.
.

Still here? Okay then, let me justify why I say the current behaviour is 
correct.

Speaking as a math tutor who has taught High School level combinatorics 
for 20+ years, I've never come across any text book or source that 
defines permutations in terms of unique permutations only. In every case 
that I can remember, or that I still have access to, unique permutations 
is considered a different kind of operation ("permutations ignoring 
duplicates", if you like) rather than the default. E.g. "Modern 
Mathematics 6" by Fitzpatrick and Galbraith has a separate section for 
permutations with repetition, gives the example of taking permutations 
from the word "MAMMAL", and explicitly contrasts situations where you 
consider the three letters M as "different" from when you consider them 
"the same". But in all such cases, such a situation is discussed as a 
restriction on permutations, not an expansion, that is:

* there are permutations;
* sometimes you want to only consider unique permutations;

rather than:

* there are permutations, which are always unique;
* sometimes you want to consider things which are like permutations 
  except they're not necessarily unique.

I'd even turn this around and challenge you to find a source that *does* 
define them as always unique. Here's a typical example, from the Collins 
Dictionary of Mathematics:

[quote]
**permutation** or **ordered arrangement** n. 1 an ordered arrangement 
of a specified number of objects selected from a set. The number of 
distinct permutations of r objects from n is

    n!/(n-r)!

usually written <subscript>n P <subscript>r or <superscript>n P 
<subscript>r. For example there are six distinct permutations of two 
objects selected out of three: <1,2>, <1,3>, <2,1>, <2,3>, <3,1>, <3,2>. 
Compare COMBINATION.

2. any rearrangement of all the elements of a finite sequence, such as 
(1,3,2) and (3,1,2). It is *odd* or *even* according as the number of 
exchanges of position yielding it from the original order is odd or 
even. It is a *cyclic permutation* if it merely advances all the 
elements a fixed number of places; that is, if it is a CYCLE of maximal 
LENGTH. A *transposition* is a cycle of degree two, and all permutations 
factor as products of transpositions. See also SIGNATURE.

3. any BIJECTION of a set to itself, where the set may be finite or 
infinite.
[end quote]

The definition makes no comment about how to handle duplicate elements, 
but we can derive an answer for that:

1) We're told how many permutations there are. Picking r elements out of 
n gives us n!/(n-r)!. If you throw away duplicate permutations, you will 
fall short.

2) The number of permutations shouldn't depend on the specific 
entities being permuted. Permutations of (1, 2, 3, 4) and (A, B, C, D) 
should be identical. If your set of elements contains duplicates, such 
as (Red ball, Red ball, Red ball, Black ball, Black ball), we can put 
the balls into 1:1 correspondence with integers (1, 2, 3, 4, 5), permute 
the integers, then reverse the mapping to get balls again. If we do 
this, we ought to get the same result as just permuting the balls 
directly.

(That's not to say that there are never cases where we don't care to 
distinguish betweem one red ball and another. But in general we do 
distinguish between them.)

I think this argument may hinge on what you consider *distinct*. In this 
context, if I permute the string "RRRBB", I consider all three 
characters to be distinct. Object identity is an implementation detail 
(not all programming languages have "objects"); even equality is an 
irrelevant detail. If I'm choosing to permute "RRRBB" rather than 
"RB", then clearly *to me* there must be some distinguishing factor 
between the three Rs and two Bs.

Another source is Wolfram Mathworld:

http://mathworld.wolfram.com/Permutation.html

which likewise says nothing about discarding repeated permutations when 
there are repeated elements. See also their page on "Ball Picking":

http://mathworld.wolfram.com/BallPicking.html

Last but not least, here's a source which clearly distinguishes 
permutations from "permutations with duplicates":

http://mathcentral.uregina.ca/QQ/database/QQ.09.07/h/beth3.html

and even gives a distinct formula for calculating the number of 
permutations. Neither Wolfram Mathworld nor the Collins Dictionary of 
Maths consider this formula important enough to mention, which suggests 
strongly that it should be considered separate from the default 
permutations.

(A little like cyclic permutations, which are different again.)

-- 
Steven