Testing random

Tue Jun 16 16:48:04 EDT 2015

On Tuesday, June 16, 2015 at 3:21:46 PM UTC-4, Thomas 'PointedEars' Lahn wrote:
> Ned Batchelder wrote:
> 
> > You aren't agreeing because you are arguing about different things.
> > Thomas is talking about the relative probability of sequences of digits.
> 
> There is no such thing as “relative probability”, except perhaps in popular-
> scientific material and bad translations.  You might mean relative 
> _frequency_, but I was not talking about that specifically.
> 
> > Chris is talking about the probability of a single digit never appearing
> > in the output.
> 
> I do not think that what I am talking about and what you think Chris is 
> talking about are different things.
> 
> > Thomas: let's say I generate streams of N digits drawn randomly from 0-9.
> > I then consider the probability of a zero *never appearing once* in my
> > stream.  Let's call that P(N)
>  
> In probability theory, it is called the probability P(E) of the event E that 
> in n trials the probability variable X never assumes the value 0, which can 
> be defined as
> 
>   P(E), E = {e_i | n ∈ ℕ \ {0}, i = 1, …, n} \ {X ≠ 0}, Ω = {1, 2, …, 9} 
> 
> where the e_i are the singular events, or outcomes, of the probabilistic 
> experiment, and Ω is the sample space of the e_i.
> 
> > Do you agree that as N increases, P(N) decreases?
> 
> I do not agree that P(E), as defined above, decreases as n increases.
> 
> See also: <http://rationalwiki.org/wiki/Gambler%27s_fallacy>

I apologize; I'm sure I've been using the mathematical terms imprecisely.
We are all intelligent people, so I still believe we disagree because we
are talking about different things.

To put us all on a footing where (I hope) we have a shared understanding,
this Python program demonstrates what I was talking about:

import random

def die_roll():
    """Roll the die once, produce a number 0-9."""
    return random.randint(0, 9)

def die_rolls(n):
    """Roll the die n times, produce a list of numbers from 0-9."""
    return [die_roll() for _ in xrange(n)]

def any_zeros(seq):
    """Is there any zero in `seq`?"""
    return any(x == 0 for x in seq)

def probability_of_no_zero(nrolls, nseq):
    """Determine the chance of no zero in a sequence of rolls.

    This is done empirically, by producing `nseq` sequences of
    `nrolls` rolls of the die.  Each sequence is examined to
    see if it has a zero.  The total number of no-zero
    sequences divided by `nseq` is the estimated probability.
    """
    no_zeros = 0
    for _ in xrange(nseq):
        seq = die_rolls(nrolls)
        if not any_zeros(seq):
            no_zeros += 1
    return float(no_zeros)/nseq

for n in range(10, 101, 10):
    # Calculate the probability of getting no zeros by trying
    # it a million times.
    prob = probability_of_no_zero(n, 1000000)
    print "n = {:3d}, P(no zero) = {:.8f}".format(n, prob)

Running this gives:

$ pypy testrandom.py
n =  10, P(no zero) = 0.34867300
n =  20, P(no zero) = 0.12121900
n =  30, P(no zero) = 0.04267000
n =  40, P(no zero) = 0.01476600
n =  50, P(no zero) = 0.00519900
n =  60, P(no zero) = 0.00174100
n =  70, P(no zero) = 0.00061600
n =  80, P(no zero) = 0.00020600
n =  90, P(no zero) = 0.00006300
n = 100, P(no zero) = 0.00002400

As n increases, the probability of having no zeros goes down.
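
As a cross-check on the empirical numbers: assuming each roll is independent
and uniform over 0-9, the chance of no zero in n rolls should be exactly
(9/10)**n. Here is a minimal sketch of that calculation. The function name
exact_probability_of_no_zero is made up for illustration and is not part of
the program above:

```python
def exact_probability_of_no_zero(nrolls):
    """Chance of no zero in `nrolls` rolls of the die.

    Assumes the rolls are independent and each of the ten
    digits 0-9 is equally likely, so a single roll avoids
    zero with probability 9/10.
    """
    return 0.9 ** nrolls

for n in range(10, 101, 10):
    # Same output format as the simulation, for side-by-side comparison.
    print("n = {:3d}, P(no zero) = {:.8f}".format(n, exact_probability_of_no_zero(n)))
```

For n = 10 this gives about 0.3487, in line with the 0.34867300 measured
above, and the value keeps shrinking as n grows.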

--Ned.