How do I sample randomly based on some probability(wightage)?

Dave Angel davea at ieee.org
Tue May 26 16:41:35 EDT 2009


Sumitava Mukherjee wrote:
> Hi all,
> I need to randomly sample from a list where all choices have weights
> attached to them. The probability of them being choosen is dependent
> on the weights.
> If say Sample list of choices are [A,B,C,D,E] and weights of the same
> are [0.895,0.567,0.765,0.890,0.60] when I draw (say 2) samples then I
> want the likeliness of them being chosen be in the order : D>A>C>E>B
>
> In short I mean if prob of a H is .9 and probability of T be 0.1 then
> if I draw 10 samples, 9 should be H and 1 should be T.
>
> I coudn't find a function in the module random that does so.
> Please can someone guide me how the above could be implemented [either
> through some function which exists and I don't know or pointers to
> some code snippets which does so]?
>
>   
Emile gave you a pretty elegant solution, assuming two things:  The 
choices can be represented by a single character each, and a roundoff 
error of one part in a thousand is acceptable.

Of course, you can represent 256 choices in a 2.x character, and you 
could switch to Unicode if that's not enough.  And if 1000 isn't enough, 
you could make it bigger, at the cost of needing a bigger table.

Here's my more-straightforward algorithm, not reduced to code.

Start with a list of probabilities.  Replace each item with the sum of 
all the items up to and including itself.  (If you do it in-place, you'd 
need to do it in reverse order, but if you use a copy, you could simply 
replace each item with a sum() of the appropriate slice of the copy.


Anyway, now you've got a list of cumulative probabilites, with the last 
item equalling 1.0 (pretty close).  You might need to fudge that to 
exactly 1.0, just in case.

I think I'd zip that list with the original list of items, so that you 
have a single list of tuples.

Now to generate a random sample, call  random.random() to get a value 
between 0 and 1. Search the list for the first item greater than or 
equal to your value, and return the other half of the tuple.





More information about the Python-list mailing list