Smarter way of doing this?

Max M maxm at mxm.dk
Tue Feb 3 03:18:19 EST 2004


Raymond Hettinger wrote:

> Here is a simplified version of the approach you took:
> 
> import random
> 
> def select(tab, random=random.random):
>     cp = 0.0
>     r = random()
>     for i, p in enumerate(tab):
>         cp += p
>         if cp > r:
>             return i
>     raise Exception('probabilities do not add up to 1.0')

> * re-arrange the tables to make sure the largest probabilities come
> first. Intentionally or not, your sample data is already arranged this
> way (though the probabilities add up to more than 1.0 and should be
> scaled back to 1.0).


The probabilities don't really have to be scaled to sum to 1.0, as long as 
each outcome gets selected with a frequency that corresponds to its 
relative weight.
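That point can be made concrete by scaling the random draw by the total weight instead of normalizing the table. A minimal sketch (the function name select_unnormalized is mine):

```python
import random

def select_unnormalized(weights):
    """Pick an index with probability proportional to its weight.

    The weights need not sum to 1.0: the random draw is scaled by the
    total, so only the relative magnitudes matter.
    """
    r = random.random() * sum(weights)
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if cumulative > r:
            return i
    return len(weights) - 1  # guard against floating-point round-off
```

So select_unnormalized([3, 1]) returns index 0 about three times as often as index 1, without any normalization pass.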


> IOW, if speed is what is important, then be sure to exploit any known
> structure in the data (ordering, cumulativeness, integer ratios, etc)
> and don't compute anything twice (pre-compute the summation).
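Precomputing the summation pays off most when combined with binary search: build the cumulative table once, then each draw is a single bisect rather than a linear scan. A sketch using the standard bisect module (itertools.accumulate postdates this thread, which is from the Python 2.3 era, but the same table can be built with a plain loop):

```python
import random
from bisect import bisect_right
from itertools import accumulate

def make_selector(weights):
    """Precompute cumulative sums once; each draw is then O(log n).

    bisect_right finds the first cumulative sum exceeding the scaled
    random draw, which is exactly the index the linear scan would return.
    """
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    def select():
        return bisect_right(cumulative, random.random() * total)
    return select
```

With weights [5, 3, 2] the cumulative table is [5, 8, 10], and a draw r in [0, 10) maps to index 0, 1, or 2 according to which interval it falls in.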


Lots of good stuff. Thanks.

Max M
