[Python-ideas] Extremely weird itertools.permutations
Steven D'Aprano
steve at pearwood.info
Sat Oct 12 09:35:31 CEST 2013
On Fri, Oct 11, 2013 at 10:37:02PM -0400, Neil Girdhar wrote:
> I think it's pretty indisputable that permutations are formally defined
> this way (and I challenge you to find a source that doesn't agree with
> that).
If by "this way" you mean "unique permutations only", then yes it
*completely* disputable, and I am doing so right now. I'm not arguing
one way or the other for a separate "unique_permutations" generator,
just that the existing permutations generator does the right thing. If
you're satisfied with that answer, you can stop reading now, because the
rest of my post is going to be rather long:
TL;DR:
If you want a unique_permutations generator, that's a reasonable
request. If you insist on changing permutations, that's unreasonable,
firstly because the current behaviour is correct, and secondly because
backwards compatibility would constrain it to keep the existing
behaviour even if it were wrong.
.
.
.
Still here? Okay then, let me justify why I say the current behaviour is
correct.
Speaking as a math tutor who has taught High School level combinatorics
for 20+ years, I've never come across any text book or source that
defines permutations in terms of unique permutations only. In every case
that I can remember, or that I still have access to, unique permutations
is considered a different kind of operation ("permutations ignoring
duplicates", if you like) rather than the default. E.g. "Modern
Mathematics 6" by Fitzpatrick and Galbraith has a separate section for
permutations with repetition, gives the example of taking permutations
from the word "MAMMAL", and explicitly contrasts situations where you
consider the three letters M as "different" from when you consider them
"the same". But in all such cases, such a situation is discussed as a
restriction on permutations, not an expansion, that is:
* there are permutations;
* sometimes you want to only consider unique permutations;
rather than:
* there are permutations, which are always unique;
* sometimes you want to consider things which are like permutations
except they're not necessarily unique.
I'd even turn this around and challenge you to find a source that *does*
define them as always unique. Here's a typical example, from the Collins
Dictionary of Mathematics:
[quote]
**permutation** or **ordered arrangement** n. 1 an ordered arrangement
of a specified number of objects selected from a set. The number of
distinct permutations of r objects from n is
n!/(n-r)!
usually written <subscript>n P <subscript>r or <superscript>n P
<subscript>r. For example there are six distinct permutations of two
objects selected out of three: <1,2>, <1,3>, <2,1>, <2,3>, <3,1>, <3,2>.
Compare COMBINATION.
2. any rearrangement of all the elements of a finite sequence, such as
(1,3,2) and (3,1,2). It is *odd* or *even* according as the number of
exchanges of position yielding it from the original order is odd or
even. It is a *cyclic permutation* if it merely advances all the
elements a fixed number of places; that is, if it is a CYCLE of maximal
LENGTH. A *transposition* is a cycle of degree two, and all permutations
factor as products of transpositions. See also SIGNATURE.
3. any BIJECTION of a set to itself, where the set may be finite or
infinite.
[end quote]
The definition makes no comment about how to handle duplicate elements,
but we can derive an answer for that:
1) We're told how many permutations there are. Picking r elements out of
n gives us n!/(n-r)!. If you throw away duplicate permutations, you will
fall short.
2) The number of permutations shouldn't depend on the specific
entities being permuted. Permutations of (1, 2, 3, 4) and (A, B, C, D)
should be identical. If your set of elements contains duplicates, such
as (Red ball, Red ball, Red ball, Black ball, Black ball), we can put
the balls into 1:1 correspondence with integers (1, 2, 3, 4, 5), permute
the integers, then reverse the mapping to get balls again. If we do
this, we ought to get the same result as just permuting the balls
directly.
(That's not to say that there are never cases where we don't care to
distinguish betweem one red ball and another. But in general we do
distinguish between them.)
I think this argument may hinge on what you consider *distinct*. In this
context, if I permute the string "RRRBB", I consider all three
characters to be distinct. Object identity is an implementation detail
(not all programming languages have "objects"); even equality is an
irrelevant detail. If I'm choosing to permute "RRRBB" rather than
"RB", then clearly *to me* there must be some distinguishing factor
between the three Rs and two Bs.
Another source is Wolfram Mathworld:
http://mathworld.wolfram.com/Permutation.html
which likewise says nothing about discarding repeated permutations when
there are repeated elements. See also their page on "Ball Picking":
http://mathworld.wolfram.com/BallPicking.html
Last but not least, here's a source which clearly distinguishes
permutations from "permutations with duplicates":
http://mathcentral.uregina.ca/QQ/database/QQ.09.07/h/beth3.html
and even gives a distinct formula for calculating the number of
permutations. Neither Wolfram Mathworld nor the Collins Dictionary of
Maths consider this formula important enough to mention, which suggests
strongly that it should be considered separate from the default
permutations.
(A little like cyclic permutations, which are different again.)
--
Steven
More information about the Python-ideas
mailing list