[Tutor] fast sampling with replacement

Andrew Fithian afith13 at gmail.com
Sat Feb 20 17:22:19 CET 2010


Hi tutor,

I have a statistical bootstrapping script that is bottlenecking on a
Python function, sample_with_replacement(). I wrote this function myself
because I couldn't find a similar function in Python's random library. This
is the fastest version of the function I could come up with (I used
cProfile.run() to time every version I wrote), but it's not fast enough. Can
you help me speed it up even more?

import random
def sample_with_replacement(list):
    l = len(list) # the sample needs to be as long as list
    r = xrange(l)
    _random = random.random
    # using list[int(_random()*l)] is faster than random.choice(list)
    return [list[int(_random()*l)] for i in r]
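
For reference, the comparison mentioned in the comment can be reproduced
with a rough timeit sketch like the one below. The helper names
sample_by_index and sample_by_choice, the data size, and the repeat count
are made up for illustration, and the xrange shim is only there so the
sketch also runs on Python 3. Actual timings will vary by machine and
Python version.

```python
import random
import timeit

try:
    xrange  # Python 2
except NameError:
    xrange = range  # Python 3: range is already lazy

def sample_by_index(data):
    # same approach as sample_with_replacement(): index with a scaled float
    n = len(data)
    _random = random.random
    return [data[int(_random() * n)] for i in xrange(n)]

def sample_by_choice(data):
    # the random.choice() variant it is being compared against
    choice = random.choice
    return [choice(data) for i in xrange(len(data))]

data = list(xrange(10000))  # made-up test data
for func in (sample_by_index, sample_by_choice):
    # time 20 full resamples of the same list with each variant
    seconds = timeit.timeit(lambda: func(data), number=20)
    print("%s: %.3f s for 20 calls" % (func.__name__, seconds))
```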

FWIW, my bootstrapping script is spending roughly half of the run time in
sample_with_replacement(), much more than in any other function or method.
Thanks in advance for any advice you can give me.
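
In case it helps, here is a minimal sketch of how a per-function breakdown
like that can be read off with cProfile and pstats. bootstrap_means() is a
hypothetical stand-in for the real script (it just averages each resample),
and the data size and number of resamples are arbitrary; it uses range() so
that it runs on Python 3 as well.

```python
import cProfile
import pstats
import random

def sample_with_replacement(data):
    n = len(data)
    _random = random.random
    return [data[int(_random() * n)] for i in range(n)]

def bootstrap_means(data, n_resamples):
    # hypothetical stand-in for the real bootstrapping script
    means = []
    for i in range(n_resamples):
        resample = sample_with_replacement(data)
        means.append(sum(resample) / float(len(resample)))
    return means

profiler = cProfile.Profile()
# runcall() profiles a single call and hands back its return value
means = profiler.runcall(bootstrap_means, list(range(1000)), 200)

# sort by time spent inside each function itself and show the top entries
pstats.Stats(profiler).sort_stats("time").print_stats(5)
```

The tottime column in the printed table is the time spent in each function
excluding calls to other functions, which is the number to compare across
functions.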

-Drew