[Tutor] A file containing a string of 1 billion random digits.

Mon Jul 19 15:45:43 CEST 2010

Richard D. Moores wrote:

> On Mon, Jul 19, 2010 at 04:51, Peter Otten <__peter__ at web.de> wrote:
>> bob gailer wrote:
>>
>>> Check this out:
>>>
>>> import random, time
>>> s = time.time()
>>> cycles = 1000
>>> d = "0123456789"*100
>>> f = open("numbers.txt", "w")
>>> for i in xrange(cycles):
>>>     l = []
>>>     l.extend(random.sample(d, 1000))
>>>     f.write(''.join(l))
>>> f.close()
>>> print time.time() - s
>>
>> Note that this is not random. E. g. the start sequence "0"*101 should
>> have a likelihood of 1/10**101 but is impossible to generate with your
>> setup.
> I'm not sure exactly what you mean, because I don't fully understand
> that '*' (despite Alan's patient explanation), but if you run
> 
> import random
> cycles = 100000
> d = "0123456789"*10
> for i in range(cycles):
>     l = []
>     l.extend(random.sample(d, 100))
>     s = (''.join(l))
>     if s[:4] == '0101':
>         print(s)
> 
> You'll see a bunch of strings that begin with "0101"
> 
> Or if you run
> 
> import random
> cycles = 50
> d = "0123456789"*10
> for i in range(cycles):
>     l = []
>     l.extend(random.sample(d, 100))
>     s = (''.join(l))
>     if s[:1] == '0':
>         print(s)
> 
> You'll see some that begin with '0'.
> 
> Am I on the right track?

No. If you fire up your python interpreter you can do

>>> "0"*10
'0000000000'

i. e. "0"*101 is a sequence of 101 zeros. Because a sample can pick every 
item in the population only once and there are only 100 zeros, at most 100 
of them can be drawn, and the more are drawn the less likely it becomes that 
another one is drawn. The simplest demo is probably

random.sample([0, 1], 2)

Possible returns are [0, 1] and [1, 0], but for true randomness you want
[1, 1] and [0, 0], too. The more often the items are repeated the less
pronounced that bias becomes, e. g.

random.sample([0, 1, 0, 1], 2)

can produce all combinations, but [0, 1] is twice as likely as [0, 0] 
because once the first 0 is drawn there is only one 0 left, but two 1s.
Here's a demonstration:

>>> import random
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> for i in range(1000):
...     d[tuple(random.sample([0, 1]*2, 2))] += 1
...
>>> dict(d)
{(0, 1): 333, (1, 0): 308, (0, 0): 174, (1, 1): 185}
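
The counts above are only an estimate. For the exact figures one can enumerate every ordered draw, since random.sample picks each ordered choice of distinct positions with equal probability. What follows is a minimal sketch assuming Python 3; the helper names exact_sample_distribution and random_digits are made up for illustration and are not part of the random module. The second helper shows one possible way to get independent digits (drawing with replacement via random.choice), so that a run such as "0"*101 stays possible.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
import random


def exact_sample_distribution(population, k):
    """Exact probability of each ordered result of random.sample(population, k).

    Hypothetical helper: permutations() treats items as distinct by position,
    which mirrors how sampling without replacement draws them.
    """
    counts = Counter(permutations(population, k))
    total = sum(counts.values())
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


def random_digits(n):
    """Hypothetical helper: n digits drawn independently (with replacement)."""
    return ''.join(random.choice("0123456789") for _ in range(n))


dist = exact_sample_distribution([0, 1] * 2, 2)
for outcome in sorted(dist):
    print(outcome, dist[outcome])
# (0, 0) 1/6
# (0, 1) 1/3
# (1, 0) 1/3
# (1, 1) 1/6

print(random_digits(20))
```

With independent draws each of the four pairs would have probability 1/4; the enumeration shows that sampling without replacement from [0, 1]*2 gives 1/6 for the repeated pairs instead, which is roughly what the 174 and 185 out of 1000 above reflect.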

Peter