[Tutor] Weighted Random Choice - Anyone have an efficient algorithm?

Fri Dec 24 10:40:06 CET 2010

Hi Steven,

Doesn't this qualify as 'monkeying with the loop index'? [*]

>>> import random
>>> weights = [5, 20, 75]
>>> counts = {0:0, 1:0, 2:0}
>>> for i in xrange(1000000):
...     i = weighted_choice(weights) # <--- monkeying right here (?)
...     counts[i] += 1

[*] http://stackoverflow.com/questions/457036/dont-monkey-with-the-loop-index

 Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have the 
Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

________________________________
From: Steven D'Aprano <steve at pearwood.info>
To: tutor at python.org
Sent: Thu, December 23, 2010 1:30:50 PM
Subject: Re: [Tutor] Weighted Random Choice - Anyone have an efficient 
algorithm?

Modulok wrote:
> Does anyone know of an efficient way of doing a weighted random
> choice? (I don't even know what algorithms like this would be called.)

If you google for "python weighted random choice" you will find a number of 
hits.

> Preferably, something that doesn't grow exponentially with the number
> of elements in the list, or the size of their respective values.

Here's one method that is linear on the number of elements:

def weighted_choice(weights):
    total = sum(weights)
    p = random.uniform(0, total)  # random float between 0 and total.
    assert 0.0 <= p <= total
    # Do a linear search for the right index value. If you have a
    # huge number of weights, a binary search may be faster.
    running_total = 0
    for i, weight in enumerate(weights):
        running_total += weight
        if p <= running_total:
            return i

And tested:

>>> import random
>>> weights = [5, 20, 75]
>>> counts = {0:0, 1:0, 2:0}
>>> for i in xrange(1000000):
...     i = weighted_choice(weights)
...     counts[i] += 1
...
>>> counts
{0: 50252, 1: 199997, 2: 749751}
>>> [n*1e6/100 for n in weights]  # expected values
[50000.0, 200000.0, 750000.0]

As you can see, the results are very close to what should be expected, and of 
course the difference can be chalked up to random chance.

-- Steven

_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/tutor/attachments/20101224/4c87868d/attachment-0001.html>