random shuffles

danielx danielwong at berkeley.edu
Sat Jul 22 04:43:30 EDT 2006


Boris Borcic wrote:
> does
>
> x.sort(cmp = lambda x,y : cmp(random.random(),0.5))
>
> pick a random shuffle of x with uniform distribution ?
>
> Intuitively, assuming list.sort() does a minimal number of comparisons to
> achieve the sort, I'd say the answer is yes. But I don't feel quite confortable
> with the intuition... can anyone think of a more solid argumentation ?
>
> - BB
> --
> "On naît tous les mètres du même monde"

Someone has already said this, but I'm not sure we can ignore exactly
what algorithm we are using. With that in mind, I'll just arbitrarily
choose an algorithm to use for analysis. I know you want something that
works in N*log(N) where N is the length of the list, but I'm going to
ignore that and consider selection sort for the sake of a "more solid
argument".

In that case, you would NOT achieve a "uniform" distribution. I quote
that because I'm going to make up a definition, which I hope
corresponds to the official one ;). To me, uniform will mean if we look
at any position in the list, element e has 1/N chance of being there.

Let e be the element which was in the first position to begin with.
What are its chances of being there after being "sorted"? Well, in
order for it to still be there, it would have to survive N rounds of
selection. In each selection, it has 1/2 chance of surviving. That
means its total chance of survival is 1/(2**N), which is much less than
1/N. QED!

***

After writting that down, I thought of an argument for an N*log(N)
algorithm, which would cause the resulting list to be uniformly random:
tournament sort (I think this is also called binary tree sort). How
many rounds does an element have to win in order to come out on top? A
number which approaches log2(N). This is like before, except this
element doesn't have to survive as many rounds; therefore, it's total
chance of coming out on top is 1/(2**log2(N)) ==  1/N. Hoorah!

***

After considering that, I realized that even if your idea to shuffle a
list did work (can't tell because I don't know how Python's sort
works), it would not be an optimal way to shuffle a list even though
Python uses an N*log(N) sort (at least I hope so :P). This is because
you can shuffle in time proportional the the length of the list.

You can accomplish this by doing what I will call "random selection".
It would be like linear selection, except you wouldn't bother checking
the head against every other element. When you want to figure out what
to put at the head, just pick at random! I suspect this is what Python
does in random.shuffle, since it is rather an easy to see it would
result in something uniform (I swear I haven't looked at the source
code for random.shuffle :P).




More information about the Python-list mailing list