Random Drawing Simulation -- performance issue

David J. Braden dbraden at invalid.add
Wed Sep 13 12:12:13 EDT 2006


Travis E. Oliphant wrote:
> Brendon Towle wrote:
>> I need to simulate scenarios like the following: "You have a deck of  
>> 3 orange cards, 5 yellow cards, and 2 blue cards. You draw a card,  
>> replace it, and repeat N times."
>>
> 
> Thinking about the problem as drawing sample froms a discrete 
> distribution defined by the population might help.
> 
> For example, in SciPy you can define your own discrete random variable:
> 
> var = scipy.stats.rv_discrete(name='sample', 
> values=([0,1,2],[3/10.,5/10.,2/10.]))
> 
> Then
> 
> var.rvs(size=10000)
> 
> would quickly return an array of "draws"
> 
> If you didn't want to install SciPy, but were willing to install NumPy, 
> then the crux of the algorithm is to generate an entire array of uniform 
> random variables:  numpy.random.rand(count) and then map them through 
> the inverse cumulative distribution function to generate your samples. 
>  The inverse cumulative distribution function is just a series of steps 
> whose width depends on the probablity distribution.
> 
> Thus, the population defines the draw boundaries.  To make it easy, just 
> multiply the uniform random number by the total number of cards and then 
> the boundaries are on the integers of the number of each kind of card.
> 
> Here is an implementation.  I find this version to be 2x - 5x faster 
> depending on how many draws are used.
> 
> 
> import numpy as N
> 
> def randomDrawing_N(count, population):
>     probs = [x[0] for x in population]
>     res = [[0, item[1]] for item in population]
>     nums = N.random.rand(count)
>     cprobs = N.cumsum([0]+probs)
>     nums *= cprobs[-1]
>     for k, val in enumerate(probs):
>         res[k][0] += ((nums > cprobs[k]) & (nums < cprobs[k+1])).sum()
>     return res
> 
> 
> -Travis Oliphant
> 

In response to what Travis and Simon wrote -
(1) Where the heck can I find a description of scipy's stat functions? 
Documentation on these seems sparse.

(2) How does one set up a good timer for Python as implemented for 
Windows? (i.e., how can I make API calls to Windows from Python?)

Please bear with me, or not; I am /just/ starting off with Python.

Thanks,
Dave Braden



More information about the Python-list mailing list