Testing random

Jussi Piitulainen jpiitula at ling.helsinki.fi
Mon Jun 8 03:40:46 EDT 2015


Thomas 'PointedEars' Lahn writes:
> Jussi Piitulainen wrote:
>> Thomas 'PointedEars' Lahn writes:
>>
>>>   8 3 6 3 1 2 6 8 2 1 6.
>> 
>> There are more than four hundred thousand ways to get those numbers
>> in some order.
>> 
>> (11! / 2! / 2! / 2! / 3! / 2! = 415800)
>
> Fallacy.  Order is irrelevant here.

You need to consider every sequence that leads to the observed counts.
One of those sequences occurred. You don't know which.
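
The count of 415800 quoted above can be checked with a short sketch. The
helper name distinct_orderings is made up for this illustration; it simply
divides 11! by the factorial of each value's multiplicity.

```python
from collections import Counter
from math import factorial

# The eleven numbers from the quoted message.
observed = [8, 3, 6, 3, 1, 2, 6, 8, 2, 1, 6]

def distinct_orderings(values):
    """Count the distinct sequences that share this multiset of values."""
    total = factorial(len(values))
    # Divide out the orderings of equal values among themselves.
    for multiplicity in Counter(values).values():
        total //= factorial(multiplicity)
    return total

print(distinct_orderings(observed))
```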

When tossing herrings into a fixed number of boxes (each toss picks
the number of one of the boxes), the proportion of toss sequences that
hit every box at least once increases with the number of herrings as
follows.

Proportions of occupying sequences of tosses into 3 boxes:
 2 herrings:  0% == 0 out of 9
 3 herrings: 22% == 6 out of 27
 4 herrings: 44% == 36 out of 81
 5 herrings: 62% == 150 out of 243
 6 herrings: 74% == 540 out of 729
 7 herrings: 83% == 1806 out of 2187
 8 herrings: 88% == 5796 out of 6561
 9 herrings: 92% == 18150 out of 19683
10 herrings: 95% == 55980 out of 59049
11 herrings: 97% == 171006 out of 177147
12 herrings: 98% == 519156 out of 531441

That's counting the sequences of box numbers, so those proportions can
be interpreted as probabilities (of occupying every box) under the
standard assumption that the sequences are equiprobable. The final
distributions of counts are not equiprobable, because different
distributions arise from different numbers of sequences. (How is this
even counter-intuitive?)
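
To illustrate that last point, here is a small sketch for the case of 3
herrings in 3 boxes, chosen only because it is easy to check by hand. It
tallies how many toss sequences lead to each final tuple of per-box
counts. The name ways is an assumption of this sketch.

```python
from collections import Counter
from itertools import product

boxes, herrings = 3, 3

# Map each final tuple of per-box counts to the number of toss
# sequences that produce it.
ways = Counter(
    tuple(seq.count(box) for box in range(boxes))
    for seq in product(range(boxes), repeat=herrings)
)

for counts, n in sorted(ways.items()):
    print(counts, '<-', n, 'out of', boxes ** herrings)
```

One herring in each box, (1, 1, 1), arises from 6 of the 27 sequences,
while all three herrings in the first box, (3, 0, 0), arises from only 1.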

Code follows. Incidentally, I'm not feeling smart here. I made several
quite stupid mistakes during this little experiment before I was happy
with it. Note that this actually walks through every one of the
boxes ** herrings sequences, a combinatorial explosion, so do not
substitute much bigger parameters.

from itertools import product
from collections import Counter

boxes = 3
print('Proportions of occupying sequences of tosses into {} boxes:'
      .format(boxes))
for herrings in range(2, 13):
    # Enumerate every sequence of box numbers and count those in which
    # every box appears at least once.
    winnings = sum(len(Counter(p)) == boxes
                   for p in product(range(boxes),
                                    repeat=herrings))
    # Python 3: / is true division, so the ratio is a float.
    print('{:>2} herrings:'.format(herrings),
          '{:>3.0%}'.format(winnings / (boxes ** herrings)),
          '== {} out of {}'.format(winnings, boxes ** herrings))
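
As a cross-check that avoids the enumeration, the same counts can be had
from the standard inclusion-exclusion formula for sequences that hit
every box. This is a sketch, not part of the experiment above: the
function name occupying_sequences is made up here, and math.comb needs
Python 3.8 or later.

```python
from math import comb

def occupying_sequences(boxes, herrings):
    """Count toss sequences that hit every box at least once.

    Inclusion-exclusion: start from all sequences, subtract those that
    miss at least one chosen box, add back those that miss two, etc.
    """
    return sum((-1) ** k * comb(boxes, k) * (boxes - k) ** herrings
               for k in range(boxes + 1))

boxes = 3
for herrings in range(2, 13):
    print('{:>2} herrings: {} out of {}'.format(
        herrings, occupying_sequences(boxes, herrings), boxes ** herrings))
```

For 3 boxes this reduces to 3**n - 3 * 2**n + 3, which agrees with the
table above, for example 519156 for 12 herrings.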


