[Tutor] Weighted Random Choice - Anyone have an efficient algorithm?

Thu Dec 23 08:49:04 CET 2010

Does anyone know of an efficient way of doing a weighted random
choice? (I don't even know what algorithms like this would be called.)
Preferably, something that doesn't grow exponentially with the number
of elements in the list, or the size of their respective values.

For example: Assume I have a list of integer elements. I want to pick
an index from that list, using the value of said elements to control
the probability of selecting a given index:

w = [5, 20, 75]
wrand(w)

Where wrand() will return '2', (the index of 75),  about 75% of the time.
It would return '1', (the index of 20) about 20% of the time, and 0,
about 5% of the time. It could return the actual value, but returning
the index seems more flexible.

The implementation doesn't have to work exactly like this, even if the
weight values don't add up to 100, or are arbitrary, that's fine.
Hopefully you get the idea. Here's what I tried (it works, but is
slow):

### Begin Code Example ###
import random
random.seed(2) #<-- So we can reproduce the sequence for testing.

def wrandom(weights):
    '''
    Return a randomly selected index of the 'weights' list, using the
    values of that list to affect probability of said index being returned.
    '''
    debug = False
    flattened = []
    for i, w in enumerate(weights):
        if debug: print "i, w: ", i, w
        for k in range(w):
            flattened.append(i)
    if debug: print "flattened: ", flattened
    rnd = random.randint(0, (len(flattened) - 1))
    return flattened[rnd]

# Not test it:
print wrandom([5, 20, 75])
print wrandom([5, 20, 75])
print wrandom([5, 20, 75])

### End Code Example ###

It works and is easy enough to understand, but when the number of list
items gets large, or the weights get heavy, things get ugly.
-Modulok-