Simple algorithm question - how to reorder a sequence economically

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri May 24 09:57:21 EDT 2013


On Fri, 24 May 2013 06:23:14 -0700, Peter Brooks wrote:


> Thanks for the warnings about random numbers too - I hope my lists will
> be short enough for the permutations of the function to be irrelevant. I
> don't need every single sequence to be unique, only that the same
> sequence only occurs occasionally. My lists might get up to the ~2k
> length one day, and I'll keep in mind the need, at that stage, to use a
> different pseudo-random generator. Actually, thinking about it, there is
> probably a source of non-algorithmically-derived 'random' numbers
> somewhere on the net that would do the job nicely.

That's massive overkill.

Take a closer look at what Ned wrote:

"The default random number generator has a period of 2**19937-1"

and consider the numbers.

For a list with 3000 items, there are 3000! possible permutations. That 
is approximately 10**21024. That is, a number with 21024 digits, or 
somewhere around a trillion trillion trillion ... trillion trillion, 
where there are 1752 "trillions".

If you could generate a million permutations a second, it would take you 
on average 10**210988 centuries before getting the original permutation 
again. That's the expected result you would get with a source of true 
randomness.

Instead, with Python's default pseudo-random number generator, you cannot 
get the full 3000! permutations, but only 2**19937-1 of them. Using the 
same calculation as above, that means that you will have to generate a 
million permutations per second for "only" 10**13783 centuries before 
getting the original permutation again.

There are uses where being able to generate any possible permutation is 
important, and the default PRNG is not sufficient. But merely shuffling 
your data around to avoid spurious correlations is not one of them. Save 
yourself a lot of development effort, and debugging, and just use 
random.shuffle.


-- 
Steven



More information about the Python-list mailing list