Random Drawing Simulation -- performance issue

Robert Kern robert.kern at gmail.com
Tue Sep 12 23:46:04 EDT 2006


Paul Rubin wrote:
> "Travis E. Oliphant" <oliphant.travis at ieee.org> writes:
>>> I need to simulate scenarios like the following: "You have a deck of
>>> 3 orange cards, 5 yellow cards, and 2 blue cards. You draw a card,
>>> replace it, and repeat N times."
>>>
>> Thinking about the problem as drawing samples from a discrete
>> distribution defined by the population might help.
> 
> Is there some important reason you want to do this as a simulation?
> And is the real problem more complicated?  If you draw from the
> distribution 100,000 times with replacement and sum the results, per
> the Central Limit Theorem you'll get something very close to a normal
> distribution whose parameters you can determine analytically.  There
> is probably also some statistics formula to find the precise error.
> So you can replace the 100,000 draws with a single draw.

Along the lines of what you're trying to get at, the problem that the OP is 
describing is one of sampling from a multinomial distribution.

   http://en.wikipedia.org/wiki/Multinomial_distribution

numpy has a function that will do the sampling for you:


In [4]: numpy.random.multinomial?
Type:           builtin_function_or_method
Base Class:     <type 'builtin_function_or_method'>
String Form:    <built-in method multinomial of mtrand.RandomState object at 0x3e140>
Namespace:      Interactive
Docstring:
     Multinomial distribution.

     multinomial(n, pvals, size=None) -> random values

     pvals is a sequence of probabilities that should sum to 1 (however, the
     last element is always assumed to account for the remaining probability
     as long as sum(pvals[:-1]) <= 1).
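
For the deck in the original question (3 orange, 5 yellow, 2 blue), a call
might look like the sketch below. The variable names and the choice of
N = 100000 are only for illustration; the probabilities are the card counts
divided by the deck size of 10.

```python
import numpy

# Illustrative values: N draws with replacement from a 10-card deck.
N = 100000
pvals = [0.3, 0.5, 0.2]  # orange, yellow, blue

# One call returns how many of the N draws landed on each color.
counts = numpy.random.multinomial(N, pvals)
orange, yellow, blue = counts
print(counts)
```

Passing size=k should give k independent repetitions of the whole experiment,
one row of counts per repetition.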


Sampling from the multinomial distribution is simple to implement given a 
binomial sampler: draw each category's count in turn from a binomial over the 
draws that remain. Unfortunately, the standard library's random module does not 
have one. If the number of samples is high enough, then one might be able to 
approximate the binomial distribution with a normal one, but you'd be better off 
just installing numpy.
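
To make the binomial-to-multinomial step concrete, here is a rough pure-Python
sketch. The function names are made up for this example, and the binomial
sampler is a deliberately naive stand-in that just counts Bernoulli trials, so
it costs O(n) and will not be faster than drawing the cards one at a time. It
only shows the decomposition; a real speedup needs a fast binomial sampler
such as the one numpy provides.

```python
import random

def naive_binomial(n, p):
    """Hypothetical helper: count successes in n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if random.random() < p)

def multinomial_via_binomial(n, pvals):
    """Sketch: draw each category's count from a conditional binomial."""
    counts = []
    remaining_n = n    # draws not yet assigned to a category
    remaining_p = 1.0  # probability mass not yet consumed
    for p in pvals[:-1]:
        if remaining_n > 0 and remaining_p > 0:
            # Chance of this category given that earlier ones were not hit.
            q = min(1.0, p / remaining_p)
            k = naive_binomial(remaining_n, q)
        else:
            k = 0
        counts.append(k)
        remaining_n -= k
        remaining_p -= p
    # The last category takes whatever draws are left over.
    counts.append(remaining_n)
    return counts

deck_counts = multinomial_via_binomial(1000, [0.3, 0.5, 0.2])
print(deck_counts)
```

Giving the last category the remainder mirrors the note in the docstring above
about the last element of pvals accounting for the remaining probability.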

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



