[Numpy-discussion] Extracting all the possible combinations of a grid

Charles R Harris charlesr.harris at gmail.com
Sat Sep 22 11:58:33 EDT 2007


On 9/22/07, Gael Varoquaux <gael.varoquaux at normalesup.org> wrote:
>
> On Sat, Sep 22, 2007 at 10:35:16AM +0200, Gael Varoquaux wrote:
> > I would go for the "generate_fourplets" solution if I had a way to
> > calculate the binomial coefficient without overflows.
>
> Sorry, premature optimisation is the root of all evil, but turning ones
> brain on early is good.
>
> """
>
> ##############################################################################
> # Some routines for calculation of binomial coefficients
> def gcd(m,n):
>     while n:
>         m,n=n,m%n
>     return m
>
> def binom_(n,k):
>         if k==0:
>             return 1
>         else:
>             g = gcd(n,k)
>             return binomial(n-1, k-1)/(k/g)*(n/g)
>
> def binomial(n,k):
>     if k > n/2: # Limit recursion depth
>         k=n-k
>     return binom_(n,k)
> """
>
> This is surprisingly fast (surprising for me, at least).
>
> Using this and the C code I have, I can generate the quadruplets of 200
> integers quite quickly:
>
> In [5]: %timeit b = [1 for i in  generate_quadruplets(200)]
> 10 loops, best of 3: 1.61 s per loop
>
> With generate_quadruplets given by:
>
> """
>
> ##############################################################################
> def generate_quadruplets(size):
>     """ Returns an iterator on tables listing all the possible unique
>     combinations of four integers below size. """
>
>     C_code = """
>     int index = 0;
>     for (int j=0; j<i+1; j++) {
>         for (int k=0; k<j+1; k++) {
>             for (int l=0; l<k+1; l++) {
>                 quadruplets(index, 0) = i;
>                 quadruplets(index, 1) = j;
>                 quadruplets(index, 2) = k;
>                 quadruplets(index, 3) = l;
>                 index++ ;
>             }
>         }
>     }
>     """
>
>     for i in xrange(size):
>         multiset_coef = binomial(i+3, 3)
>         quadruplets = empty((multiset_coef, 4), dtype=uint32)
>         inline(C_code, ['quadruplets', 'i'],
>                                 type_converters=converters.blitz)
>
>         yield quadruplets
> """


Umm... that doesn't look quite right. Shouldn't it be something like

def generate_quadruplets(size):
    """ Returns an iterator on tables listing all the possible unique
    combinations of four integers below size. """

    C_code = """
    int index = 0;
    for (int j=2; j<i; j++) {
        for (int k=1; k<j; k++) {
            for (int l=0; l<k; l++) {
                quadruplets(index, 0) = i;
                quadruplets(index, 1) = j;
                quadruplets(index, 2) = k;
                quadruplets(index, 3) = l;
                index++ ;
            }
        }
    }
    """

    for i in xrange(3,size):
        multiset_coef = binomial(i, 3)
        quadruplets = empty((multiset_coef, 4), dtype=uint32)
        inline(C_code, ['quadruplets', 'i'],
                                type_converters=converters.blitz)

        yield quadruplets

This fits my needs.


Algorithm L can be chunked pretty easily also:

def combination(n,t,chunk) :
    c = arange(t + 2)
    c[-1] = 0
    c[-2] = n
    out = empty((chunk,t),dtype=int32)
    while 1 :
        for i in xrange(chunk) :
            out[i] = c[:t]
            j = 0
            while c[j] + 1 == c[j+1] :
                c[j] = j
                j += 1
            if j >= t :
                break
            c[j] += 1
        yield out[:i+1]
        if j >= t :
            return

I think this would go well as a C++ function object. The python wrapper
would look something like

def combination(n,t,chunk) :
    next = cpp_combination(n,t,chunk)
    while 1 :
        out = next()
        if len(out) > 0 :
            yield out
        else :
            return

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070922/21edc8b3/attachment.html>


More information about the NumPy-Discussion mailing list