[Numpy-discussion] np.random.multinomial weird results

Sat Mar 7 19:41:08 EST 2009

On Sat, Mar 7, 2009 at 6:57 PM, Robert Kern <robert.kern at gmail.com> wrote:
> On Sat, Mar 7, 2009 at 17:29,  <josef.pktd at gmail.com> wrote:
>> np.random.multinomial looks weird. Are these bugs, or is
>> something wrong with the explanation?
>
> I would like to know how you are interpreting the documentation.
>
>> Josef
>>
>> from the help/ docstring:
>>
>>>>> np.random.multinomial(20, [1/6.]*6, size=2)
>> array([[3, 4, 3, 3, 4, 3],
>>       [2, 4, 3, 4, 0, 7]])
>> For the first run, we threw 3 times 1, 4 times 2, etc. For the second,
>> we threw 2 times 1, 4 times 2, etc.
>>
>>
>> Note: we also get a 7 on a six-sided die
>
> No you don't. That value means that in the second trial of 20 tosses,
> you rolled a 6-spot seven times. The result of drawing from a
> multinomial distribution is the number of times a particular result
> came up, *not* the results themselves.
>
>> some more examples with a funny shaped six sided dice:
>>
>>>>> rvsmn=np.random.multinomial(20, [1/6.]*6, size=2000)
>>>>> for i in range(rvsmn.min(),rvsmn.max()+1):print i, (rvsmn==i).sum(0)/20.0
>>
>> 0 [ 2.9   2.25  2.45  2.55  2.65  2.85]
>> 1 [  9.15   9.75  10.8   11.4   11.1   10.7 ]
>> 2 [ 20.8   20.    20.25  19.65  18.9   19.2 ]
>> 3 [ 23.75  24.4   23.3   22.75  23.5   23.15]
>> 4 [ 20.85  20.8   20.4   20.95  20.15  19.25]
>> 5 [ 12.6   12.55  12.6   12.55  13.3   14.75]
>> 6 [ 6.4   6.65  6.95  6.55  6.8   6.35]
>> 7 [ 2.8   2.25  2.45  2.8   2.55  2.75]
>> 8 [ 0.5   0.85  0.55  0.55  0.85  0.85]
>> 9 [ 0.2   0.4   0.15  0.1   0.15  0.05]
>> 10 [ 0.05  0.1   0.1   0.1   0.05  0.1 ]
>> 11 [ 0.    0.    0.    0.05  0.    0.  ]
>
> And? What do you think you are testing here? A more appropriate test would be:
>
> rvsmn = np.random.multinomial(N, np.ones(M)/M, size=L)
> assert is_kinda_close(rvsmn.mean(axis=0) / N, np.ones(M)/M)
>>>>> rvsmn=np.random.multinomial(1, [1/6.]*6, size=2000)
>>>>> for i in range(rvsmn.min(),rvsmn.max()+1):print i, (rvsmn==i).sum(0)/20.0
>>
>> 0 [ 81.9   83.35  84.85  84.25  83.7   81.95]
>> 1 [ 18.1   16.65  15.15  15.75  16.3   18.05]
>>>>> rvsmn=np.random.multinomial(2, [1/6.]*6, size=2000)
>>>>> for i in range(rvsmn.min(),rvsmn.max()+1):print i, (rvsmn==i).sum(0)/20.0
>>
>> 0 [ 70.45  71.6   68.9   68.1   68.    69.75]
>> 1 [ 26.45  26.1   28.35  28.75  29.6   27.15]
>> 2 [ 3.1   2.3   2.75  3.15  2.4   3.1 ]
>>
>>>>> rvsmn=np.random.multinomial(2000, [1/6.]*6, size=1)
>>>>> rvsmn.shape
>> (1, 6)
>>>>> rvsmn
>> array([[330, 348, 332, 326, 337, 327]])
>>>>> rvsmn=np.random.multinomial(2000, [1/6.]*6)
>>>>> rvsmn.shape
>> (6,)
>>>>> rvsmn
>> array([334, 322, 323, 348, 322, 351])
>>
>>
>> Note: these are the tests for multinomial
>> class TestMultinomial(TestCase):
>>    def test_basic(self):
>>        random.multinomial(100, [0.2, 0.8])
>>
>>    def test_zero_probability(self):
>>        random.multinomial(100, [0.2, 0.8, 0.0, 0.0, 0.0])
>
> These are testing that the call doesn't fail.
>
> --
> Robert Kern
>

Sorry, I was working on a multinomial logit distribution, and even
though I read the docstring of np.random.multinomial, I didn't pay enough
attention. I misinterpreted what the random variable is supposed to
mean, so the results didn't make any sense to me.
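
To make the distinction concrete, here is a minimal sketch that contrasts
the two readings: the counts per face that multinomial returns, and the
individual faces thrown, which is what the misreading assumed. The seed,
the variable names and the use of choice/bincount are illustrative
assumptions, not something taken from the docstring.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, only for repeatability

n_tosses = 20
p = [1 / 6.] * 6

# What multinomial returns: how many times each of the 6 faces came up.
counts = rng.multinomial(n_tosses, p)

# The other reading: the face shown on each individual toss (values 1..6).
faces = rng.choice(np.arange(1, 7), size=n_tosses, p=p)

# Counting the individual faces gives a vector of the same kind as `counts`.
counts_from_faces = np.bincount(faces, minlength=7)[1:]

print(counts, counts.sum())                        # 6 counts, summing to 20
print(faces)                                       # 20 faces, each in 1..6
print(counts_from_faces, counts_from_faces.sum())  # 6 counts, summing to 20
```

An entry of 7 in `counts` therefore only says that one face came up seven
times in 20 tosses; it is never a face value.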

Now it looks clearer,

Thanks,

Josef
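
For reference, the check suggested in the quoted reply could be sketched
roughly as follows. The helper is_kinda_close was only a placeholder there;
np.allclose with a tolerance of five standard errors is an assumption made
here, as are the values of N, M and L and the seed.

```python
import numpy as np

rng = np.random.RandomState(12345)  # arbitrary seed

N, M, L = 20, 6, 2000   # tosses per trial, faces, trials (illustrative)
p = np.ones(M) / M

rvsmn = rng.multinomial(N, p, size=L)   # shape (L, M), one row per trial

# Each trial distributes exactly N tosses over the M faces.
assert (rvsmn.sum(axis=1) == N).all()

# The average fraction of tosses per face should be close to p.
freq = rvsmn.mean(axis=0) / N

# Each count is Binomial(N, p_j), so freq_j has variance p_j(1-p_j)/(N*L).
se = np.sqrt(p * (1 - p) / (N * L))
assert np.allclose(freq, p, atol=5 * se.max())
```

The five-standard-error tolerance is a loose choice that keeps a correct
sampler from failing by chance; a stricter test would need a proper
goodness-of-fit statistic.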