[SciPy-User] Semi-OT: weekend puzzle enumerating number of balls in bins

Oleksandr Huziy guziy.sasha at gmail.com
Mon Jan 13 01:17:41 EST 2014


Hi Josef:

This recursive solution looks faster:
http://nbviewer.ipython.org/urls/raw2.github.com/guziy/PyNotebooks/master/bins.ipynb?create=1

I think you can optimize it further by replacing the loops with numpy
functions; I did not do that since you probably know them better than I do :)
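Roughly, the idea is: fix the number of balls in the first bin, then recurse
on the remaining balls and bins (a simplified sketch; the notebook version
differs in details):

```python
import numpy as np

def enumerate_bins(nobs, rr):
    """Enumerate all ways to place at most `nobs` balls into `rr` ordered bins.

    Returns a list of tuples; each tuple gives the ball count per bin,
    and the counts sum to at most nobs.
    """
    if rr == 1:
        # base case: the single remaining bin can hold 0..nobs balls
        return [(k,) for k in range(nobs + 1)]
    out = []
    for k in range(nobs + 1):
        # put k balls in the first bin, recurse on the remaining balls and bins
        for tail in enumerate_bins(nobs - k, rr - 1):
            out.append((k,) + tail)
    return out

m_all = np.array(enumerate_bins(5, 2))
print(m_all.shape)  # (21, 2), in the same lexicographic order as your array
```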

Cheers






2014/1/12 <josef.pktd at gmail.com>

> nobs = 5
> rr = 2
>
> We have 5 balls and 2 bins.
> We can put as many balls as we want (0 to 5) in the first bin.
> Then we put as many balls as we want (and are available) in the second bin.
> And so on if we had more bins.
>
> I'd like to enumerate all possible bin allocations, in this case 21:
>
> >>> m_all
> array([[0, 0],
>        [0, 1],
>        [0, 2],
>        [0, 3],
>        [0, 4],
>        [0, 5],
>        [1, 0],
>        [1, 1],
>        [1, 2],
>        [1, 3],
>        [1, 4],
>        [2, 0],
>        [2, 1],
>        [2, 2],
>        [2, 3],
>        [3, 0],
>        [3, 1],
>        [3, 2],
>        [4, 0],
>        [4, 1],
>        [5, 0]])
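As a cross-check on the count (my aside, not in your mail): by a
stars-and-bars argument, adding one slack bin for the leftover balls, the
number of allocations with total at most nobs over rr bins is
comb(nobs + rr, rr), which is comb(7, 2) = 21 here:

```python
from math import comb  # Python 3.8+

nobs, rr = 5, 2
# allocations with sum <= nobs in rr bins: one extra slack bin absorbs
# the unplaced balls, giving a standard stars-and-bars count
n_alloc = comb(nobs + rr, rr)
print(n_alloc)  # 21
```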
>
>
> Here is a simple way to calculate it. The number of cases grows very
> fast.
> -----------
> import itertools
> import numpy as np
>
> nobs = 10
> rr = 3
> # keep only tuples whose components sum to at most nobs
> m_all = np.array([ii for ii in itertools.product(range(nobs + 1), repeat=rr)
>                   if sum(ii) <= nobs])
> print(m_all.shape)
> print(np.bincount(m_all.sum(1)))
> -----------
>
> Is there a faster way to do this?
>
> I need to do some calculation on each case, so either a numpy array or
> an iterator works.
>
>
>
> The motivation:
>
> We have two kinds of machines, 10 of each, that we want to test for early
> failure.
> The test has the null hypothesis that the failure distributions are the
> same.
> Against a one-sided alternative that the first kind of machines fails
> earlier.
>
> We run the 20 machines and count how many machines of the first kind
> have failed by the time we see the rth (3rd) failure of the second
> kind.
> If a large number of machines of the first kind have failed, then
> we reject the null that they have the same failure distribution.
>
> We don't have time to wait until all machines fail. So we need to do
> "very small" sample statistics.
>
> Wilcoxon-type precedence tests.
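Under the null every ordering of the 20 failures is equally likely, so you
can also approximate the null distribution of this precedence count by
shuffling labels (a quick Monte Carlo sketch of mine, not from your mail;
the threshold 8 below is just an example):

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, r = 10, 10, 3  # machines of each kind, and the failure rank we wait for

def precedence_count(rng):
    # under H0 all interleavings of the 20 failures are equally likely,
    # so a random permutation of labels gives one draw of the statistic
    labels = rng.permutation([1] * n1 + [2] * n2)
    # position of the r-th failure of the second kind
    idx = np.flatnonzero(labels == 2)[r - 1]
    # number of kind-1 machines that failed before it
    return int(np.sum(labels[:idx] == 1))

draws = np.array([precedence_count(rng) for _ in range(10000)])
# estimated tail probability of seeing, say, 8 or more early kind-1 failures
print(np.mean(draws >= 8))
```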
>
>
> Josef
> It's Big Data, if we have big machines.
>



-- 
Sasha

