Probability selection algorithm

Bengt Richter bokr at oz.net
Sat Feb 1 21:04:23 EST 2003


On 01 Feb 2003 15:12:30 -0800, Paul Rubin <phr-n2003b at NOSPAMnightsong.com> wrote:

>noah at noah.org (Noah) writes:
>> I can generate a random number between 0.0 and 1.0,
>> but how do I turn that into an index into a given list?
>> I was thinking that I could convert the lists of probabilities
>> into lists of probability ranges. Then I would choose a random number
>> between 0 and 1 and then select the item that falls in that range.
>> For example the last list [0.17, 0.62, 0.21] would be converted to
>> [0.17,  0.79,  1.0]
>>     if the random number is in the range 0.0-0.17 then select the first item.
>>     if the random number is in the range 0.17-0.79 then select the second item.
>>     if the random number is in the range 0.79-1.0 then select the third item.
>> 
>> That seems like a crufty algorithm. There must be a simpler way.
>
>Try something like:
>
>    remaining = 1.0
>    for i range(len(problist)):
>       if urand() < remaining:
>         selected = i
>       remaining -= problist[i]
>
>where urand returns a random number between 0.0 and 1.0 and
>problist is your list of probabilities.  "selected" is set to
>the appropriate randomly chosen index.
>
>Do I have that right?
Seems like it's wasting a lot of darts compared to throwing one
at a board scribed with the right lines, which is what the OP was
describing. Maybe it's a good use for the bisect module, e.g.,

     i = bisect.bisect_left(problevels, urand())

====< t_prob.py >===============================
import bisect, sys
from random import random as urand
# probs = [0.17, 0.62, 0.21]  # maybe not exactly 1.0 -- assume sum << 1.0 ok
probs = map(float, sys.argv[1:])
levels = []
for prob in probs:
    levels.append((levels and levels[-1] or 0) + prob)

test = [0]*(len(probs)+1)
for x in xrange(10000):
    i = bisect.bisect_left(levels, urand())
    test[i] += 1

for i in xrange(len(probs)):
    print '%5.3f -> %s' % (probs[i], test[i])
================================================

[18:09] C:\pywk\clp>t_prob.py 0.17 0.62 0.21
0.170 -> 1694
0.620 -> 6173
0.210 -> 2133

[18:10] C:\pywk\clp>t_prob.py 0.17 0.62 0.21
0.170 -> 1681
0.620 -> 6228
0.210 -> 2091

[18:10] C:\pywk\clp>t_prob.py 0.17 0.62 0.21
0.170 -> 1747
0.620 -> 6105
0.210 -> 2148

[18:10] C:\pywk\clp>t_prob.py .5 .5
0.500 -> 5025
0.500 -> 4975

Regards,
Bengt Richter




More information about the Python-list mailing list