[SciPy-User] SciPy-User Digest, Vol 120, Issue 19

Robert Kern robert.kern at gmail.com
Wed Aug 28 05:05:02 EDT 2013


On Wed, Aug 28, 2013 at 8:34 AM, Johannes Radinger <johannesradinger at gmail.com> wrote:
>
> Thank you Robert, that sounds like a very straightforward solution to that task.
> A working example, following the given data, could look like:
>
> np.random.multinomial(300, y.flat, size=1).reshape(y.shape)
>
>
> However, I already had a similar discussion a while ago on Stack Overflow, where a different solution was proposed:
>
> np.bincount(np.searchsorted(np.cumsum(y), np.random.random(300)), minlength=y.size).reshape(y.shape)
>
> or, for numpy >= 1.7:
> np.bincount(np.random.choice(y.size, 300, p=y.flat), minlength=y.size).reshape(y.shape)
>
> So what I am wondering is: what is the actual difference in meaning between the two approaches?
> Do they actually provide results based on a totally different meaning of "weighted random"? And what are the consequences then? As I am not really familiar with statistics, maybe someone can clarify that?

First, if you would like to participate in threads, we would really
appreciate it if you subscribe to the mailing list normally instead of
using the digest. If you feel you must use the digest, please trim your
replies and adjust the Subject: line. Thanks.

As for the different approaches, the only minor benefit of the second approach is that it does not need normalized weights. Otherwise, all of the approaches sample the same thing, just rather less efficiently and less clearly than multinomial() does.
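
For concreteness, here is a rough self-contained sketch of that equivalence (the 4x5 shape, the 300 draws, and the variable names are just placeholders, not anything from your data): both routes produce per-cell counts drawn from the same multinomial distribution.

import numpy as np

# Placeholder setup: y is a 2-D array of non-negative weights normalized to sum to 1,
# and we distribute 300 draws over its cells.
y = np.random.random((4, 5))
y /= y.sum()
n = 300

# Approach 1: draw the per-cell counts directly from a multinomial distribution.
counts_a = np.random.multinomial(n, y.ravel()).reshape(y.shape)

# Approach 2: draw n weighted cell indices, then count the hits per cell
# (np.random.choice requires numpy >= 1.7).
idx = np.random.choice(y.size, size=n, p=y.ravel())
counts_b = np.bincount(idx, minlength=y.size).reshape(y.shape)

# Both are draws from the same multinomial distribution with cell probabilities y;
# each sums to n, with expected value n * y per cell.
assert counts_a.sum() == counts_b.sum() == n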

--
Robert Kern

