selecting a random item from a set

Tim Peters tim.one at comcast.net
Tue Dec 30 18:56:50 EST 2003


[Alexander Schmolck]
> Quite a few algortihms prominently feature choosing/removing a single
> random element from a set at a time.

Really?  I'm aware of a great many algorithms that need to remove an
*arbitrary* element from a set, where arbitrary means "it doesn't matter
which one you pick", not "you must have an equal chance of getting each
one".  The existing set.pop() is fine for that.

> On first sight I can't think of anything better to achieve this with
> `sets.Set` than something along the lines of:
>
>   for dummy, randomElement in zip(range(randrange(len(s)+1)), s): pass

It's too obscure for my tastes, but something like that would certainly
work.

>   # possibly followed by
>   s.remove(randomElement)
>
> Is there a better way? If not, how about something like
> Set.randompop()?

Probably not.  I'm not convinced there's a real need, and given the
implementation of sets there's no way to do it essentially more efficient
than the way you've already figured out (it can be done using O(1) space
instead, but it still takes O(len(s)) time -- the implementation of sets is
lumpy, so the O(1)-time indexing trick used by random.choice() can't apply).

If you think this is important enough to be worth the bother <wink>, you
could define a hybrid data structure combining a set and a list, keeping
them in synch.  random.choice() can then be applied to the list part to get
a random element in constant time.  The only tricky part is deletion of an
element.  You need another dict under the covers to map each key to its
index in the list.  When it comes time to delete a key, you *overwrite* the
list position it occupied with the last key in the list (and update that
key's index), then reduce the length of the list by 1.  That keeps deletion
constant-time (expected-case) too.






More information about the Python-list mailing list