sampling without replacement

Paul Rubin http
Thu Apr 17 03:41:33 EDT 2008


braver <deliverable at gmail.com> writes:
> Using an array is natural here as it represents "without replacement"
> -- we take an element by removing it from the array.  But in Python
> it's very slow...  What approaches are there to implement a shrinking
> array with random deletions with  the magnitude of millions of
> elements?

The obvious way is use random.sample(), is there some reason you
don't do that?

Alternatively, when you select an element a[k] from the middle of the
array, instead of deleting that element and moving the rest down (del
a[k]), do something like:

   k = random.randint(0,len(a)-1)
   selection = a[k]
   a[k] = a[-1]
   a.pop()

That deletes the last element (avoiding moving them around) after
storing it in the newly freed slot.  Of course it messes up the order
of the array, which won't matter for selecting random elements, but
might matter for some other reason in your program.






More information about the Python-list mailing list