fastest method to choose a random element
Neil Cerutti
mr.cerutti at gmail.com
Fri Jan 4 15:47:11 EST 2008
On Jan 4, 2008 2:55 PM, <caca at mailinator.com> wrote:
> Hello,
> This is a question for the best method (in terms of performance
> only) to choose a random element from a list among those that satisfy
> a certain property.
I would automatically use random.choice(filter(pred_func, a_list)). You just
have to catch the possible IndexError.
This is the setting: I need to pick from a list a random element
> that satisfies a given property. All or none of the elements may have
> the property. Most of the time, many of the elements will satisfy the
> property, and the property is a bit expensive to evaluate. Chance of
> having the property are uniform among elements.
>
> A simple approach is:
>
> import random
> def random_pick(a_list,property):
> '''Returns a random element from a list that has the property
>
> Returns None if no element has the property
> '''
> random.shuffle(a_list)
> for i in a_list:
> if property(i): return i
I'm pretty sure you don't want to use a destructive random_pick function.
You'll have to shuffle a copy instead to avoid that problem.
>
> but that requires to shuffle the list every time.
>
> A second approach, that works if we know that at least one element of
> the list has the property, is:
>
> import random
> def random_pick(a_list,property):
> '''Returns a random element from a list that has the property
>
> Loops forever if no element has the property
> '''
> while 1:
> i=random.choice(a_list)
> if property(i): return i
>
> which is more efficient (on average) if many elements of the list have
> the property and less efficient if only few elements of the list has
> the property (and goes crazy if no element has the property)
That's awful. Here's another linear time idea, returning the nearest element
that satisfies the predicate.
offset = random.randrange(len(a_list))
for n in xrange(len(a_list)):
ix = (offset + n) % len(a_list)
if predicate(a_list[ix]):
return a_list[ix]
raise ValueError('no element has the property')
The possible problem is that large strings of elements in a row that don't
match the predicate greatly increase the odds of getting the following
element that *does* match the predicate. Worst case is two predicate matched
elements in a row, surrounded by a bunch of non-matched elements.
> Yet another one:
>
> import random
> def random_pick(a_list,property):
> '''Returns a random element from a list that has the property
> '''
> b_list=[x for x in a_list if property(x)]
> try:
> return random.choice(b_list)
> finally: return None
>
> but this one checks the property on all the elements, which is no
> good.
Is it really expensive to check the property? That would mitigate against
the filter solution and for the other one I posted.
This seems to be a case of trying to solve a data problem functionally. It'd
be better to store your data differently if this will be a frequent
operation and you simply can't afford to call the predicate on all the
elements.
Incidentally, try not to shadow builtin names like 'property'.
--
Neil Cerutti
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20080104/d35f428e/attachment-0001.html>
More information about the Python-list
mailing list