[Numpy-discussion] non-uniform discrete sampling with given probabilities (w/ and w/o replacement)

Christopher Jordan-Squire cjordan1 at uw.edu
Wed Aug 31 16:58:08 EDT 2011


On Wed, Aug 31, 2011 at 3:34 PM,  <josef.pktd at gmail.com> wrote:
> On Wed, Aug 31, 2011 at 3:22 PM, Olivier Delalleau <shish at keba.be> wrote:
>> 2011/8/31 Christopher Jordan-Squire <cjordan1 at uw.edu>
>>>
>>> On Wed, Aug 31, 2011 at 2:07 PM, Olivier Delalleau <shish at keba.be> wrote:
>>> > You can use:
>>> > 1 + numpy.argmax(numpy.random.multinomial(1, [0.1, 0.2, 0.7]))
>>> >
>>> > For your "real" application you'll probably want to use a value >1 for the
>>> > first parameter (equal to your sample size), instead of calling it multiple
>>> > times.
>>> >
>>> > -=- Olivier
>>>
>>> Thanks. Warren (Weckesser) mentioned this possibility to me yesterday
>>> and I forgot to put it in my post. I assume you mean something like
>>>
>>> x = np.arange(3)
>>> y = np.random.multinomial(30, [0.1,0.2,0.7])
>>> z = np.repeat(x, y)
>>> np.random.shuffle(z)
>>>
>>> Does that look right?
>>>
>>> -Chris JS
>>>
>>
>> Yes, exactly.
>
> Chuck's answer to the same question, when I asked on the list, used
> searchsorted and is fast
>
> cdfvalues.searchsorted(np.random.random(size))
>
> my recent version of it for FiniteLatticeDistribution
>
>    def rvs(self, size=1):
>        '''draw random variables with shape given by size
>
>        '''
>        #w = self.pdfvalues
>        #p = cumsum(w)/float(w.sum())
>        #p.searchsorted(np.random.random(size))
>        return self.support[self.cdfvalues.searchsorted(np.random.random(size))]
>
> Josef
>

That's exactly what I needed. Thanks!
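
For the archive, here is a minimal sketch of the searchsorted approach as a
standalone function. It only covers sampling with replacement. The name
sample_discrete and its arguments are made up for illustration and are not
part of NumPy or of Josef's class. It also assumes a NumPy recent enough to
have np.random.default_rng, and it uses side="right" rather than the default
used in the snippets above, so that outcomes with zero probability are never
returned.

```python
import numpy as np


def sample_discrete(support, probs, size, rng=None):
    """Draw `size` values from `support` with replacement, weighted by `probs`.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    support = np.asarray(support)
    probs = np.asarray(probs, dtype=float)

    # Cumulative distribution, normalised so the last entry is exactly 1.0.
    cdf = np.cumsum(probs)
    cdf /= cdf[-1]

    # rng.random() is uniform on [0, 1). With side="right", a draw u maps to
    # the first index i with u < cdf[i], so zero-probability entries are skipped.
    idx = cdf.searchsorted(rng.random(size), side="right")
    return support[idx]


draws = sample_discrete([1, 2, 3], [0.1, 0.2, 0.7], size=10000,
                        rng=np.random.default_rng(0))
```

The cost is one cumsum over the probabilities plus a binary search per draw,
which is presumably why it is fast for large sample sizes.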

-Chris JS

>
>>
>> -=- Olivier
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>


