shuffling elements of a list

David C. Ullrich ullrich at math.okstate.edu
Wed May 31 06:17:13 EDT 2006


On 30 May 2006 21:53:32 -0700, "greenflame" <alikakakhel at yahoo.com>
wrote:

>Thank you all for all of your help. Also I got the shuffle function to
>work. Do not worry I will be back soon with more shuffling! However, I
>do not quite understand this DSU that you mention, although it looks
>useful.

I didn't see any DSU in his post, although I didn't read
it very carefully. DSU is "Decorate Sort Undecorate" -
it's an idiom for efficient sorting. Say you have a list
and you want to sort it using some custom compare function.
That can be very slow. Sorting a list with the builtin
comparison is much faster.

DSU is a trick that lets you use the fast builtin comparison
to sort according to a custom compare. Say you have a list 
[x,y,z], these objects have an attribute "a", and you want
to sort on the "a" field. You "decorate" the list, 
constructing a new list of tuples 

   [(x.a, x), (y.a, y), (z.a, z)]

You sort the decorated list using the standard
sort(). Tuples get compared lexicographically,
so this sorts on the "a" field. Then you
"undecorate" the sorted list, stripping
off the first element of each tuple.

******************

That's DSU for _sorting_ a list. I read about this, thought
it was pretty neat. I thought that the fact that you
could use the same trick for _shuffling_ a list was
my idea, gonna make me rich and famous. I guess I'm
not the only one who thought of it. Anyway, you can
use DSU to _shuffle_ a list by decorating the list
with random numbers.

In fairly old-fashioned Python:

from random import random

def shuffle(data):
  decorated = map(lambda x: (random(), x), data)
  decorated.sort()
  return map(lambda x: x[1], decorated)

print shuffle(range(10))

This prints

[4, 2, 7, 8, 9, 3, 5, 1, 6, 0]

. Seems kinda neat - I have no idea how the efficiency
compares with the standard sort of "bubble shuffle"
you were trying to use in your OP, but the point is
that various off-by-one errors simply don't come up.

(The same thing in extremely awful Python, in case
you're mystified by map and lambda:

from random import random

def shuffle(data):
  decorated = []
  for x in data:
    decorated.append((random(), x))
  decorated.sort()
  res = []
  for x in decorated:
    res.append(x[1])
  return res

.)

************************

David C. Ullrich



More information about the Python-list mailing list