[Numpy-discussion] Example code for Numpy C preprocessor 'repeat' directive?

Stephen Simmons mail at stevesimmons.com
Wed Mar 4 19:54:23 EST 2009


Hi,

Please can someone suggest resources for learning how to use the 
'repeat' macros in numpy C code to avoid repeating sections of 
type-specific code for each data type? Ideally there would be two types 
of resources: (i) a description of how the repeat macros are meant to be 
used/compiled; and (ii) a suggestion of a numpy source file that best 
illustrates this.

Thanks in advance!
Stephen

P.S. The motivation is that I'm trying to write an optimised numpy 
implementation of SQL-style aggregation operators for an OLAP data 
analysis project (using PyTables to store large numpy data sets). 
bincount()  is being used to implement "SELECT SUM(x) FROM TBL WHERE y 
GROUP BY fn(z)". My modified bincount code can handle a wider variety of 
index, weight and output array data types. It also supports passing in 
the output array as a parameter, allowing multipass aggregation routines.
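
The aggregation described above can be sketched in plain C roughly as follows. This is only an illustration under stated assumptions: `accumulate_bins` and its parameters are hypothetical names, not the modified bincount code, and one fixed type combination (long bins, double weights, double output) is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of numpy: add weights[i] into
 * out[bins[i]] for every bin in the range 0..max_bin.
 * `out` is owned by the caller and is never cleared here, so repeated
 * calls keep accumulating into the same totals (multipass aggregation). */
static void
accumulate_bins(const long *bins, const double *weights, size_t n,
                double *out, long max_bin)
{
    size_t i;

    assert(bins != NULL && weights != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        long bin = bins[i];
        /* Out-of-range bins are skipped rather than treated as errors. */
        if (bin >= 0 && bin <= max_bin) {
            out[bin] += weights[i];
        }
    }
}
```

Calling the function twice with the same `out` buffer, once per chunk of data, gives the same totals as a single pass over the concatenated chunks, which is the point of passing the output array in.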

I got the code working for a small number of data type combinations, but 
now I'm drowning in an exponential explosion of manually maintained data 
type combinations:
---snip----
    } else if ((weight_type==NPY_FLOAT)&&(out_type==PyArray_DOUBLE)) {
...
        } else if (bin_type==PyArray_INTP) {
            for (i=0; i<bin_len; i++) {
                bin = ((npy_intp *) bin_data)[i];
                if (bin>=0 && bin<=max_bin)
                    ((double*)out_data)[bin] += *((float *)(weights_data + i*wt_stride));
            }
        } else if (bin_type==PyArray_UINT8) {
            for (i=0; i<bin_len; i++) {
                bin = ((npy_uint8 *) bin_data)[i];
                if (bin>=0 && bin<=max_bin)
                    ((double*)out_data)[bin] += *((float *)(weights_data + i*wt_stride));
            }
---snip----

'repeat' directives in C comments are obviously the proper way to avoid 
manually generating all this boilerplate code. Unfortunately I haven't yet 
understood how to make the autogenerated type-specific code link back 
into a common function entry point. Either there is some 
compiler/distutils magic going on, or it's explained in a different 
numpy source file from where I'm looking right now...
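
The structure being asked about, generated type-specific loops reached through one common entry point, can be sketched with ordinary C preprocessor macros. This deliberately swaps in plain macros for numpy's 'repeat' templates, so it does not show how numpy itself does it. Every name here (`DEFINE_ACCUM`, `accum_table`, `accumulate`, `enum bin_kind`) is hypothetical, and the weights are assumed to be floats accumulated into a double output, as in the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type codes standing in for numpy's type numbers. */
enum bin_kind { BIN_LONG = 0, BIN_UINT8 = 1, BIN_NKINDS = 2 };

typedef void (*accum_fn)(const void *bin_data, size_t bin_len,
                         const char *weights_data, size_t wt_stride,
                         double *out_data, long max_bin);

/* One macro body replaces each hand-written copy of the loop. */
#define DEFINE_ACCUM(NAME, BIN_T)                                      \
    static void NAME(const void *bin_data, size_t bin_len,             \
                     const char *weights_data, size_t wt_stride,       \
                     double *out_data, long max_bin)                   \
    {                                                                  \
        size_t i;                                                      \
        for (i = 0; i < bin_len; i++) {                                \
            long bin = (long)((const BIN_T *)bin_data)[i];             \
            if (bin >= 0 && bin <= max_bin) {                          \
                out_data[bin] +=                                       \
                    *(const float *)(weights_data + i * wt_stride);    \
            }                                                          \
        }                                                              \
    }

/* Instantiate one loop per index type. */
DEFINE_ACCUM(accum_long_float, long)
DEFINE_ACCUM(accum_uint8_float, unsigned char)

/* Table indexed by type code: this is the link from the generated
 * functions back to a single entry point. */
static const accum_fn accum_table[BIN_NKINDS] = {
    accum_long_float,
    accum_uint8_float,
};

/* Common entry point: returns 0 on success, -1 for an unknown type code. */
static int
accumulate(enum bin_kind kind, const void *bin_data, size_t bin_len,
           const char *weights_data, size_t wt_stride,
           double *out_data, long max_bin)
{
    if ((int)kind < 0 || (int)kind >= (int)BIN_NKINDS) {
        return -1;
    }
    accum_table[kind](bin_data, bin_len, weights_data, wt_stride,
                      out_data, max_bin);
    return 0;
}
```

Adding another index type then costs one `DEFINE_ACCUM` line and one table entry instead of another copy of the loop. Covering weight and output types as well would need further macro parameters and a multi-dimensional table.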


