[Numpy-discussion] weighted random integers

Fri Sep 10 20:44:35 EDT 2010

On Fri, Sep 10, 2010 at 6:32 PM, <josef.pktd at gmail.com> wrote:

> On Fri, Sep 10, 2010 at 8:28 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> >
> >
> > On Fri, Sep 10, 2010 at 6:15 PM, Charles R Harris
> > <charlesr.harris at gmail.com> wrote:
> >>
> >>
> >> On Fri, Sep 10, 2010 at 5:46 PM, <josef.pktd at gmail.com> wrote:
> >>>
> >>> I saw some questions on the web about how to create non-uniform
> >>> random integers in Python.
> >>>
> >>> I don't know what the best way is, but here is another way that looks
> >>> reasonably fast:
> >>>
> >>> >>> rvs = np.dot(np.random.multinomial(1, [0.1, 0.2, 0.5, 0.2],
> >>> ...              size=1000000), np.arange(4))
> >>>
> >>> >>> np.bincount(rvs)/1000000.
> >>> array([ 0.099741,  0.199943,  0.499317,  0.200999])
> >>>
> >>
> >> This looks like a good case for the inverse cdf approach, at least for
> >> smallish ranges of integers. Searchsorted on an array of appropriate
> >> values should do the trick.
> >>
> >
> > For instance, to weight the integers 0..3 by 1..4:
> >
> > In [14]: w = arange(1,5)
> >
> > In [15]: p = cumsum(w)/float(w.sum())
> >
> > In [16]: bincount(p.searchsorted(random(1000000)))/1e6
> > Out[16]: array([ 0.100336,  0.200382,  0.299132,  0.40015 ])
>
> Looks good, it feels faster and takes less memory, I guess.
>
>
It's about 10x faster.

In [30]: timeit p.searchsorted(random(1000000))
10 loops, best of 3: 37.6 ms per loop

In [31]: timeit np.dot(np.random.multinomial(1, [0.1, 0.2, 0.5, 0.2],
                size=1000000), np.arange(4))
1 loops, best of 3: 363 ms per loop
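
For anyone who wants to reuse the searchsorted version, here is a minimal
self-contained sketch of the same inverse cdf idea. The function name
weighted_random_integers, the rng argument, the use of
np.random.default_rng (which needs a much newer numpy than the one timed
above) and side="right" are my own assumptions for illustration, not part of
the snippets in this thread.

```python
import numpy as np


def weighted_random_integers(weights, size, rng=None):
    """Draw `size` integers i in 0..len(weights)-1 with P(i) proportional
    to weights[i], using the inverse cdf + searchsorted approach.

    Hypothetical helper: the name and signature are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    weights = np.asarray(weights, dtype=float)

    # Cumulative distribution; dividing by the last entry makes it end at
    # exactly 1.0, so a uniform draw in [0, 1) can never fall past the end.
    cdf = np.cumsum(weights)
    cdf /= cdf[-1]

    # side="right" returns i with cdf[i-1] <= u < cdf[i]. This is an
    # assumption that differs from the default side="left" used above; it
    # keeps zero-weight entries from being picked when u is exactly 0.0.
    return cdf.searchsorted(rng.random(size), side="right")


# Same example as above: weight 0..3 by 1..4.
rvs = weighted_random_integers([1, 2, 3, 4], 1000000,
                               rng=np.random.default_rng(0))
freq = np.bincount(rvs, minlength=4) / rvs.size
print(freq)  # should be close to 0.1, 0.2, 0.3, 0.4
```

The cdf array has one entry per integer, so memory use is dominated by the
uniform draws, whereas the multinomial version builds a size x 4 array first.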

Chuck