[Numpy-discussion] Example code for Numpy C preprocessor 'repeat' directive?

Wed Mar 4 20:46:51 EST 2009

On Wed, Mar 4, 2009 at 5:54 PM, Stephen Simmons <mail at stevesimmons.com> wrote:

> Hi,
>
> Please can someone suggest resources for learning how to use the
> 'repeat' macros in numpy C code to avoid repeating sections of
> type-specific code for each data type? Ideally there would be two types
> of resources: (i) a description of how the repeat macros are meant to be
> used/compiled; and (ii) suggestion for a numpy source file that best
> illustrates this.
>
> Thanks in advance!
> Stephen
>
> P.S. My motivation is that I'm trying to write an optimised numpy
> implementation of SQL-style aggregation operators for an OLAP data
> analysis project (using PyTables to store large numpy data sets).
> bincount()  is being used to implement "SELECT SUM(x) FROM TBL WHERE y
> GROUP BY fn(z)". My modified bincount code can handle a wider variety of
> index, weight and output array data types. It also supports passing in
> the output array as a parameter, allowing multipass aggregation routines.
>
> I got the code working for a small number of data type combinations, but
> now I'm drowning in an exponential explosion of manually maintained data
> type combinations
> ---snip----
>    } else if ((weight_type==NPY_FLOAT)&&(out_type==PyArray_DOUBLE)) {
> ...
>        } else if (bin_type==PyArray_INTP) {
>            for (i=0; i<bin_len; i++) {
>                bin = ((npy_intp *) bin_data)[i];
>                if (bin>=0 && bin<=max_bin)
>                    ((double*)out_data)[bin] +=
>                        *((float *)(weights_data + i*wt_stride));
>            }
>        } else if (bin_type==PyArray_UINT8) {
>            for (i=0; i<bin_len; i++) {
>                bin = ((npy_uint8 *) bin_data)[i];
>                if (bin>=0 && bin<=max_bin)
>                    ((double*)out_data)[bin] +=
>                        *((float *)(weights_data + i*wt_stride));
>            }
> ---snip----
>
> 'repeat' directives in C comments are obviously the proper way to avoid
> manually generating all this boilerplate code. Unfortunately I haven't yet
> understood how to make the autogenerated type-specific code link back
> into a common function entry point. Either there is some
> compiler/distutils magic going on, or it's explained in a different
> numpy source file from the one I'm looking at right now...

Are you referring to example code like the following?

/**begin repeat
 * #type = a, b, c#
 */

void func@type@(void) {}

/**end repeat**/

Templated code like that is preprocessed by calling process_file or
process_str in the module /numpy/numpy/distutils/conv_template.py. There is
a small amount of documentation in that file. You can also use the command
line like so:

python conv_template.py file.xxx.src

Of course, conv_template.py needs to be in your path for that to work. The
templating facility provided by conv_template is pretty basic but adequate
for numpy. There are other template systems floating around that might also
serve your needs.
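
To make the expansion concrete, here is a rough Python sketch of what a repeat block turns into. It is a toy stand-in written for illustration, not conv_template's actual implementation: expand_repeat is a hypothetical helper, it handles only a single flat (non-nested) block, and it ignores the extra syntax the real module supports. The second template, with its bin_value_@NAME@ function, is likewise made up to resemble the bincount case.

```python
import re


def expand_repeat(template):
    """Toy expander for flat /**begin repeat ... /**end repeat**/ blocks.

    Hypothetical helper for illustration only; the real logic lives in
    numpy/distutils/conv_template.py and supports more syntax than this.
    """
    pattern = re.compile(
        r"/\*\*begin repeat\n(?P<header>.*?)\*/\n"
        r"(?P<body>.*?)/\*\*end repeat\*\*/\n?",
        re.DOTALL,
    )

    def expand(match):
        # Collect every "#name = v1, v2, ...#" definition in the header.
        names = {}
        for name, values in re.findall(r"#\s*(\w+)\s*=\s*(.*?)#",
                                       match.group("header")):
            names[name] = [v.strip() for v in values.split(",")]
        counts = {len(v) for v in names.values()}
        if len(counts) != 1:
            raise ValueError("substitution lists must have equal length")
        # Emit one copy of the body per position in the lists.
        chunks = []
        for i in range(counts.pop()):
            chunk = match.group("body")
            for name, values in names.items():
                chunk = chunk.replace("@%s@" % name, values[i])
            chunks.append(chunk)
        return "".join(chunks)

    return pattern.sub(expand, template)


SIMPLE = """\
/**begin repeat
 * #type = a, b, c#
 */
void func@type@(void) {}
/**end repeat**/
"""

# Hypothetical template in the spirit of the bincount code: two parallel
# lists, one for the C type and one for a name suffix.
PAIRED = """\
/**begin repeat
 * #type = npy_intp, npy_uint8#
 * #NAME = INTP, UINT8#
 */
static npy_intp bin_value_@NAME@(char *p) { return *(@type@ *)p; }
/**end repeat**/
"""

if __name__ == "__main__":
    print(expand_repeat(SIMPLE))
    print(expand_repeat(PAIRED))
```

Each expanded copy is an ordinary C function with its own name, so the usual way to get back to a single entry point is a hand-written function that switches on the type number and calls the matching generated one. That dispatch is not produced by the template in this sketch.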

Chuck