[Python-ideas] About adding a new iteratormethodcalled "shuffled"

Stephen J. Turnbull stephen at xemacs.org
Thu Mar 26 11:31:16 CET 2009


Adam Olsen writes:

 > If random.shuffle() is broken for lists more than 2080 then it should
 > raise an error.

Not really.  Assuming the initial state is drawn from a uniform
distribution on all possible states, if all 2080-shuffles are
equiprobable, then any statistic you care to calculate based on that
will come out the same as if you had 2080 statistically independent
draws without replacement.  Another way to put it is "if you need a
random shuffle, this one is good enough for *any* such purpose".

However, once you exceed that limit you have to ask whether it's good
enough for the purpose at hand.  For some purposes, the distribution
of (2**19937-1)-shuffles might be good enough, even though they make
up only 1/(2**19937-2) of the possible shuffles.  (Yeah, I know, you
can wait....)

 > Claiming it "might" be broken in the docs for moderately sized
 > lists, without researching such a claim, is pointless fear
 > mongering.

How about if it's phrased the way I did above?  Ie, this is good
enough for any N-shuffle for *any purpose whatsoever*, for N < 2081.



More information about the Python-ideas mailing list