fastest method to choose a random element

caca at mailinator.com caca at mailinator.com
Fri Jan 4 14:55:51 EST 2008


  Hello,
  This is a question for the best method (in terms of performance
only) to choose a random element from a list among those that satisfy
a certain property.

  This is the setting: I need to pick from a list a random element
that satisfies a given property. All or none of the elements may have
the property. Most of the time, many of the elements will satisfy the
property, and the property is a bit expensive to evaluate. Chance of
having the property are uniform among elements.

  A simple approach is:

import random
def random_pick(a_list,property):
    '''Returns a random element from a list that has the property

    Returns None if no element  has the property
    '''
    random.shuffle(a_list)
    for i in a_list:
        if property(i): return i

but that requires to shuffle the list every time.

 A second approach, that works if we know that at least one element of
the list has the property, is:

import random
def random_pick(a_list,property):
    '''Returns a random element from a list that has the property

    Loops forever if no element has the property
    '''
    while 1:
        i=random.choice(a_list)
        if property(i): return i

which is more efficient (on average) if many elements of the list have
the property and less efficient if only few elements of the list has
the property (and goes crazy if no element has the property)

Yet another one:

import random
def random_pick(a_list,property):
    '''Returns a random element from a list that has the property
    '''
    b_list=[x for  x in a_list if property(x)]
    try:
        return random.choice(b_list)
    finally: return None

but this one checks the property on all the elements, which is no
good.

I don't need strong random numbers, so a simple solution like:
import random
globalRNG=random.Random()

def random_pick(a_list,property):
    '''Returns a random element from a list that has the property

    Works only if len(a_list)+1 is prime: uses Fermat's little theorem
    '''
    a=globalRNG(1,len(a_list))
    ind=a
    for i in xrange(len(a_list)):
        x=a_list[a-1]
        if property(x):return x
        ind*=a

but this works only if len(a_list)+1 is prime!!! Now this one could be
saved if there were an efficient method to find a prime number given
than a number n but not very much greater...

Any other ideas? Thanks everybody



More information about the Python-list mailing list