fastest method to choose a random element

Arnaud Delobelle arnodel at googlemail.com
Sat Jan 5 07:45:08 EST 2008


On Jan 4, 7:55 pm, c... at mailinator.com wrote:
>   Hello,
>   This is a question for the best method (in terms of performance
> only) to choose a random element from a list among those that satisfy
> a certain property.
>
>   This is the setting: I need to pick from a list a random element
> that satisfies a given property. All or none of the elements may have
> the property. Most of the time, many of the elements will satisfy the
> property, and the property is a bit expensive to evaluate. Chance of
> having the property are uniform among elements.

The generator function below yields an infinite sequence of randomly
picked elements from the list who satisfy the property, or nothing if
the list contains no element satisfying the property.  It guarantees
that each time, prop() will either not be called or will be called
just enough to find one single item that satisfies it.  The catch is
that you need to have an estimate for the number of items that satisfy
the property in the list.

import random
from itertools import islice, ifilter

def picker(lst, prop, np):
    # lst: list of items to pick elements from
    # prop: property that picked elements must fulfil
    # np: (good estimate of) number of items that
    #     satisfy the property
    random.shuffle(lst)
    plst = [] # items we know fulfil prop
    for item in ifilter(prop, lst):
        # The next random item may be one already yielded
        while True:
            i = random.randrange(np)
            if i >= len(plst): break
            yield plst[i]
        # Or it may be a new one
        plst.append(item)
        if len(plst) > np: np = len(plst)
        yield item
    # Now we know all items fulfilling prop
    if not plst: return
    while True:
        yield plst[random.randrange(len(plst))]


def test(picker, n=1000):
    res = {}
    for val in islice(picker, n):
        res[val] = res.get(val, 0) + 1
    return res


Example:

>>> p = picker(range(20), lambda x: x % 2, 10)
>>> test(p)
{1: 113, 3: 96, 5: 87, 7: 91, 9: 109, 11: 91, 13: 106, 15: 101, 17:
109, 19: 97}


>>> p = picker(range(20), lambda x: False, 10)
>>> p.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

I don't know if there is a good idea in there, I'll let you be the
judge :)

--
Arnaud




More information about the Python-list mailing list