[Numpy-discussion] np.random.multinomial weird results
josef.pktd at gmail.com
josef.pktd at gmail.com
Sat Mar 7 19:41:08 EST 2009
On Sat, Mar 7, 2009 at 6:57 PM, Robert Kern <robert.kern at gmail.com> wrote:
> On Sat, Mar 7, 2009 at 17:29, <josef.pktd at gmail.com> wrote:
>> np.random.multinomial looks weird. Are these bugs, or is there
>> something not correct with the explanation.
>
> I would like to know how you are interpreting the documentation.
>
>> Josef
>>
>> from the help/ docstring:
>>
>>>>> np.random.multinomial(20, [1/6.]*6, size=2)
>> array([[3, 4, 3, 3, 4, 3],
>> [2, 4, 3, 4, 0, 7]])
>> For the first run, we threw 3 times 1, 4 times 2, etc. For the second,
>> we threw 2 times 1, 4 times 2, etc.
>>
>>
>> Note: we also get a 7 in a six sided dice
>
> No you don't. That value means that in the second trial of 20 tosses,
> you rolled a 6-spot seven times. The result of drawing from a
> multinomial distribution is the number of times a particular result
> came up, *not* the results themselves.
>
>> some more examples with a funny shaped six sided dice:
>>
>>>>> rvsmn=np.random.multinomial(20, [1/6.]*6, size=2000)
>>>>> for i in range(rvsmn.min(),rvsmn.max()+1):print i, (rvsmn==i).sum(0)/20.0
>>
>> 0 [ 2.9 2.25 2.45 2.55 2.65 2.85]
>> 1 [ 9.15 9.75 10.8 11.4 11.1 10.7 ]
>> 2 [ 20.8 20. 20.25 19.65 18.9 19.2 ]
>> 3 [ 23.75 24.4 23.3 22.75 23.5 23.15]
>> 4 [ 20.85 20.8 20.4 20.95 20.15 19.25]
>> 5 [ 12.6 12.55 12.6 12.55 13.3 14.75]
>> 6 [ 6.4 6.65 6.95 6.55 6.8 6.35]
>> 7 [ 2.8 2.25 2.45 2.8 2.55 2.75]
>> 8 [ 0.5 0.85 0.55 0.55 0.85 0.85]
>> 9 [ 0.2 0.4 0.15 0.1 0.15 0.05]
>> 10 [ 0.05 0.1 0.1 0.1 0.05 0.1 ]
>> 11 [ 0. 0. 0. 0.05 0. 0. ]
>
> And? What do you think you are testing here? A more appropriate test would be:
>
> rvsmn = np.random.multinomial(N, np.ones(M)/M, size=L)
> assert is_kinda_close(rvsmn.mean(axis=0) / N, np.ones(M)/M)
> - Show quoted text -
>>>>> rvsmn=np.random.multinomial(1, [1/6.]*6, size=2000)
>>>>> for i in range(rvsmn.min(),rvsmn.max()+1):print i, (rvsmn==i).sum(0)/20.0
>>
>> 0 [ 81.9 83.35 84.85 84.25 83.7 81.95]
>> 1 [ 18.1 16.65 15.15 15.75 16.3 18.05]
>>>>> rvsmn=np.random.multinomial(2, [1/6.]*6, size=2000)
>>>>> for i in range(rvsmn.min(),rvsmn.max()+1):print i, (rvsmn==i).sum(0)/20.0
>>
>> 0 [ 70.45 71.6 68.9 68.1 68. 69.75]
>> 1 [ 26.45 26.1 28.35 28.75 29.6 27.15]
>> 2 [ 3.1 2.3 2.75 3.15 2.4 3.1 ]
>>
>>>>> rvsmn=np.random.multinomial(2000, [1/6.]*6, size=1)
>>>>> rvsmn.shape
>> (1, 6)
>>>>> rvsmn
>> array([[330, 348, 332, 326, 337, 327]])
>>>>> rvsmn=np.random.multinomial(2000, [1/6.]*6)
>>>>> rvsmn.shape
>> (6,)
>>>>> rvsmn
>> array([334, 322, 323, 348, 322, 351])
>>
>>
>> Note: this are the tests for multinomial
>> class TestMultinomial(TestCase):
>> def test_basic(self):
>> random.multinomial(100, [0.2, 0.8])
>>
>> def test_zero_probability(self):
>> random.multinomial(100, [0.2, 0.8, 0.0, 0.0, 0.0])
>
> These are testing that the call doesn't fail.
>
> --
> Robert Kern
>
Sorry, I was working on a multinomial logit distribution, and even
though I read the docstring np.random.multinomial, I didn't pay enough
attention. So I misinterpreted what the random variable is supposed to
mean and that didn't make any sense.
Now it looks clearer,
Thanks,
Josef
More information about the NumPy-Discussion
mailing list