number generator

Steven D'Aprano steve at REMOVE.THIS.cybersource.com.au
Sun Mar 11 00:26:15 EST 2007


On Sat, 10 Mar 2007 17:19:38 -0800, MonkeeSage wrote:

> On Mar 10, 6:47 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
>> The fencepost method still seems to be simplest:
>>
>>     t = sorted(random.sample(xrange(1,50), 4))
>>     print [(j-i) for i,j in zip([0]+t, t+[50])]
> 
> Simpler, true, but I don't think it gives any better distribution...

[snip]

> Granted, I'm just eyeballing it, but they look fairly equal in terms
> of distribution.

It's easy enough to check empirically whether the fencepost method gives a
uniform distribution of part sizes: generate a lot of samples and count how
often each size turns up.


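import random

def fence(n, m):
    # Pick n-1 distinct cut points in 1..m-1, then add the two end posts.
    t = [0] + sorted(random.sample(xrange(1, m), n-1)) + [m]
    # The gaps between neighbouring posts are n positive integers summing to m.
    L = [(j-i) for i,j in zip(t[:-1], t[1:])]
    assert sum(L) == m
    assert len(L) == n
    return L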


def collate(count, n, m):
    # Tally how often each part size appears over `count` calls to fence().
    bins = {}
    for _ in xrange(count):
        L = fence(n, m)
        for x in L:
            bins[x] = 1 + bins.get(x, 0)
    return bins


collate(1000, 10, 80)

gives me the following sample (1000 runs of 10 parts each, so the counts
add up to 10000):

{1: 1148, 2: 1070, 3: 869, 4: 822, 5: 712,
 6: 633, 7: 589, 8: 514, 9: 471, 10: 406,
 11: 335, 12: 305, 13: 308, 14: 242, 15: 232,
 16: 190, 17: 172, 18: 132, 19: 132, 20: 124,
 21: 87, 22: 91, 23: 72, 24: 50, 25: 48,
 26: 45, 27: 33, 28: 29, 29: 22, 30: 19,
 31: 20, 32: 12, 33: 12, 34: 11, 35: 11,
 36: 5, 37: 9, 38: 2, 39: 4, 40: 5,
 42: 1, 43: 3, 45: 2, 49: 1}

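Clearly the distribution of part sizes isn't remotely close to uniform: a
part of size 1 turns up more than twice as often as a part of size 10, and
sizes above 40 almost never occur.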
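
The shape of those counts can also be predicted rather than just observed.
What follows is a sketch, not part of the original test: it assumes only
that random.sample makes every set of n-1 cut points equally likely, in which
case a given part has size k with probability C(m-k-1, n-2) / C(m-1, n-1).
The helper names choose and part_size_probability are made up for this
illustration, and the code uses range and print() so that it should run on
either Python 2 or Python 3.

```python
def choose(a, b):
    """Binomial coefficient C(a, b); 0 when b is out of range."""
    if b < 0 or b > a:
        return 0
    result = 1
    for i in range(1, b + 1):
        # After step i, result equals C(a - b + i, i), so the division is exact.
        result = result * (a - b + i) // i
    return result


def part_size_probability(k, n, m):
    """Chance that one given part has size k (assumes n >= 2).

    Fixing one part at size k leaves m-k to be split into n-1 positive
    parts, which can be done in C(m-k-1, n-2) ways out of C(m-1, n-1).
    """
    return choose(m - k - 1, n - 2) / float(choose(m - 1, n - 1))


n, m = 10, 80
parts_counted = 1000 * n  # same number of parts as collate(1000, 10, 80)
for k in (1, 2, 5, 10, 20, 40):
    expected = parts_counted * part_size_probability(k, n, m)
    print("%2d: about %.0f expected" % (k, expected))
```

For n=10 and m=80 the probability of size 1 works out to 9/79, or roughly
1139 parts in 10000, which sits close to the 1148 in the sample above. The
probabilities fall steadily as k grows, so under this assumption the skew
towards small parts is built into the method rather than being sampling
noise.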

To compare to the "cheat" method, calculate the mean and standard
deviation of this sample, and compare them with those from the other method.
(Code left as an exercise for the reader.) When I do that, I get a mean
and standard deviation of 8 and 6.74 for the fencepost method, and 8 and
4.5 for the "cheat" method. The means have to agree, since any method that
splits 80 into 10 parts averages 8 per part, but the standard deviations
differ, and that implies they are different distributions.
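
One possible way to do that calculation, for the fencepost half only (the
"cheat" method isn't reproduced in this message, so it is left out). This is
a sketch under a couple of assumptions: mean_and_stddev is a name invented
here, it computes the population standard deviation (dividing by N, which
may or may not be what produced the figures quoted above), and fence and
collate are restated with range instead of xrange so the block runs on its
own under Python 3 as well.

```python
import math
import random


def fence(n, m):
    # Same fencepost split as above, spelled with range so it also runs on Python 3.
    t = [0] + sorted(random.sample(range(1, m), n - 1)) + [m]
    return [j - i for i, j in zip(t[:-1], t[1:])]


def collate(count, n, m):
    # Tally how often each part size appears over `count` samples.
    bins = {}
    for _ in range(count):
        for x in fence(n, m):
            bins[x] = 1 + bins.get(x, 0)
    return bins


def mean_and_stddev(bins):
    """Mean and population standard deviation of a {value: count} tally."""
    total = float(sum(bins.values()))
    mean = sum(value * count for value, count in bins.items()) / total
    variance = sum(count * (value - mean) ** 2
                   for value, count in bins.items()) / total
    return mean, math.sqrt(variance)


mean, stddev = mean_and_stddev(collate(1000, 10, 80))
print("mean %.2f, stddev %.2f" % (mean, stddev))
```

The mean comes out as 8 on every run because each sample sums to 80 over 10
parts; only the standard deviation varies from run to run, and it is the
number to set against the same statistic from whatever other method is being
compared.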



-- 
Steven.



