random.sample with long int items

Steven D'Aprano steve at REMOVETHIScyber.com.au
Wed Apr 12 10:15:32 EDT 2006


On Wed, 12 Apr 2006 06:29:01 -0700, jordi wrote:

> I need the random.sample functionality where the population grows up to
> long int items. Do you know how could I get this same functionality in
> another way? thanks in advance.

I'm thinking you might need to find another way to do whatever it is you
are trying to do.

If you can't, you could do something like this:

- you want to choose a small number of items at random from a
population of size N, where N is very large.

e.g. you would like to do this: random.sample(xrange(10**10), 60)
except that it raises an exception once the population size is a long int.

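- divide your population of N items into B bins of size M, where both B and
M are small enough to be ordinary ints rather than long ints. Ideally, all
your bins will be equal in size, so that picking a random bin and then a
random element of it gives every item the same chance of being chosen.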

e.g.
bins = [xrange(start*10**5, (start+1)*10**5)
        for start in xrange(10**5)]
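
If the bins cannot all be made the same size, one possible workaround is to
weight the choice of bin by its length. This is only a sketch under my own
assumptions, not part of the recipe above: it is written for Python 3 (range
instead of xrange, and random.choices, which needs 3.6 or later), and the
name pick_from_bins is made up for illustration.

```python
import random

def pick_from_bins(bins):
    """Pick one item so that every item across all bins is equally likely.

    Hypothetical helper: bins may have different sizes, so each bin is
    chosen with probability proportional to its length.
    """
    weights = [len(b) for b in bins]
    # random.choices returns a list of k picks; take the single bin chosen.
    chosen_bin = random.choices(bins, weights=weights, k=1)[0]
    # Within the chosen bin every element is equally likely.
    return random.choice(chosen_bin)

# Two bins of unequal size that together cover 0..9.
uneven_bins = [range(0, 3), range(3, 10)]
picks = [pick_from_bins(uneven_bins) for _ in range(20)]
```

With equal-sized bins the weights are all the same, and this reduces to the
plain random.choice(bins) approach.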


- then, to take a sample of sample_size items, do something like this:

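import random

# bins is the list of B bins (built as above);
# each bin has M items, and B*M = N, the total population.
sample_size = 60                    # how many items to draw
selecting_with_replacement = False  # True allows repeated items
result = []
while len(result) < sample_size:
    # choose a random bin (chosen_bin avoids shadowing the built-in bin)
    chosen_bin = random.choice(bins)
    # choose a random element of that bin
    selection = random.choice(chosen_bin)
    if selecting_with_replacement:
        result.append(selection)
    elif selection not in result:
        # without replacement, each choice must be unique
        result.append(selection)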
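
The same idea can be sketched without building the list of bins at all,
since picking a bin and then an element of it amounts to picking a bin index
and an offset. This is a rough sketch rather than a definitive
implementation, and it rests on assumptions of mine: Python 3, bins of equal
size, and a made-up function name and parameters (sample_large_population,
num_bins, bin_size). It also keeps a set of the items already chosen, so the
uniqueness check does not have to scan the result list.

```python
import random

def sample_large_population(num_bins, bin_size, sample_size,
                            with_replacement=False):
    """Sample items from range(num_bins * bin_size) via equal-sized bins."""
    population = num_bins * bin_size
    if not with_replacement and sample_size > population:
        raise ValueError("sample larger than population")
    result = []
    seen = set()  # items already chosen, used only without replacement
    while len(result) < sample_size:
        # choose a random bin, then a random position inside that bin
        bin_index = random.randrange(num_bins)
        offset = random.randrange(bin_size)
        selection = bin_index * bin_size + offset
        if with_replacement:
            result.append(selection)
        elif selection not in seen:
            # each choice must be unique
            seen.add(selection)
            result.append(selection)
    return result

# 10**5 bins of 10**5 items each, i.e. a population of 10**10.
sample = sample_large_population(10**5, 10**5, 60)
```

Rejecting duplicates and retrying is cheap as long as the sample is small
compared with the population, which is the case described above.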


Hope that helps.


-- 
Steven.



