Implementaion of random.shuffle
Josiah Carlson
josiah.carlson at sbcglobal.net
Tue Jul 17 12:13:25 EDT 2007
shabda raaj wrote:
> The code for shuffle is
>
> if random is None:
> random = self.random
> for i in reversed(xrange(1, len(x))):
> # pick an element in x[:i+1] with which to exchange x[i]
> j = int(random() * (i+1))
> x[i], x[j] = x[j], x[i]
>
> With
> j = int(random() * (i+1))
> being the only call to random().
> As long as (i + 1) is not large enough so that j is generated with
> equal probability in range [0, i] we would get all permutations with
> equal probability. With MT with period of 2**19937 (i+1) would have to
> be really large before this the case.
Be careful of assuming too much about permutations and randomness. In
particular, you are assuming too much about what mersenne twister
guarantees, what that implies regarding random permutations, and what a
random permutation really is.
In particular, if we merely enumerate all possible permutations of n
elements, there are n! total permutations. Now, because the generation
of a permutation *is fundamentally indistinguishable from choosing a
number in the range [0...n!)*, then we can rely on the knowledge and
research regarding the generation of such numbers. In particular,
thanks to birthday collisions, you only need to generate O(sqrt(n!))
random permutations to have a 50% chance of having come upon a duplicate
permutation. For example:
>>> a = range(10)
>>> b = {}
>>> import random
>>> while 1:
... x = a[:]
... random.shuffle(x)
... x = tuple(x)
... if x not in b:
... b[x] = None
... else:
... break
...
>>> len(b)
3605
>>> x
(4, 3, 9, 8, 0, 2, 1, 7, 6, 5)
>>> x in b
True
>>>
I only generated 3605 permutations of 10 items before I got a collision.
According to your claims, I should have gone through all 3628800
possible permutations before that happened. Note that I could repeat
this exercise with any decent pseudo random number generator, or actual
random information generated from any good random source. As long as
the PRNG doesn't suck, it should offer some variant of the above.
> Anyway, is there some test harness we can run to test the robustness
> of shuffle? We can run that test harness for large values and see at
> what point all permutations are not possible or come with unequal
> probability.
Shuffle works as intended. Your assumptions about it's functionality
are incorrect. Testing will only prove that your assumptions are wrong.
- Josiah
More information about the Python-list
mailing list