Probability selection algorithm

Manuel M. Garcia mail at manuelmgarcia.com
Sat Feb 1 19:41:24 EST 2003


On 1 Feb 2003 14:43:29 -0800, noah at noah.org (Noah) wrote:

>    [0.2, 0.2, 0.2, 0.2, 0.2]
>        items all have equal chance of getting selected (20%)
>    [0.33, 0.33, 0.33]
>        items all have equal chance of getting selected (33%)
>    [0.33, 0.66]
>        item 1 is twice as likely to get selected as item 0.
>    [0.46, 0.27, 0.27]
>        item 0 has slightly less than 50% chance of selection...
>    [0.17, 0.62, 0.21]
>        items have all unequal chances of selection.

I would recommend using the bisect module.  I used generators, I
didn't bother to profile this code, but I am pretty sure it works.

# ################################ #

from __future__ import generators

import random
import bisect
import operator

# will generalize problem
# [1, 2, 3]
# means 1 is selected twice as often as 0
#       2 is selected 3 times as often as 0

def normalize(list0):
    t = reduce(operator.add, list0) * 1.0
    return [ x / t for x in list0 ]

def running_sum(list0):
    return [
        reduce(
            operator.add,
            list0[:i+1] )
        for i in range(len(list0)) ]

def my_random_numbers(list0, n):
    list0 = running_sum(normalize(list0))
    for _ in xrange(n):
        yield bisect.bisect_left(
            list0,
            random.random() )

def do_count(list0):
    d = {}
    for x in list0:
        d[x] = (
            d.get(x, 0) +
            1 )
    d = d.items()
    d.sort()
    return d

print do_count(
    my_random_numbers(
        [1,2,3],
        6000000) )

--------------------------------

[(0, 1000701), (1, 2000057), (2, 2999242)]

--------------------------------

Manuel





More information about the Python-list mailing list