[Numpy-discussion] API tricks any my new "clip" function

Wed Aug 29 17:10:19 EDT 2001

Hi all,

I wrote last week looking for hints for writing a new clip function,
that would be faster than the existing one. I got some suggestions, and
have written the function. It works, but I'd love to get a little code
review from the more experienced folks here. I have a few specific
questions:

A) Are there any tricks to making a function work with multiple types? I
currently have it working with only double arrays, but it would be great
for it ot be universal (and it could then be added to the main NumPy
distribution, I suppose) I seem tohave to typecast the data to do the
comparison I want, so I can't figure out how to make it general. For
example, I have this line:

if (*(double *)(Array->data + i*elementsize) > *(double *)(Max->data +
i*elementsize))
{
          *(double *)(Array->data + i*elementsize) = *(double
*)(Max->data + i*elementsize) ;
}

How could I do that without the explicit typecasts? I'm pretty sure I
could make the rest of the routine general.

B) I was trying to figure out how to loop through all the elements of a
non-contiguous array. I got two hints:
"Rob W. W. Hooft" wrote:

> Never done this in C for numpy yet, but I normally program this
> along the following lines:
> 
> n=[2,3,2,3]
> x=[0,0,0,0]
> while 1:
>     print x
>     i=len(n)-1
>     while i>=0 and x[i]+1==n[i]:
>        x[i]=0
>        i=i-1
>     if i<0:
>        break
>     x[i]=x[i]+1
> 
> It is always going to be slower than the straight loop for contiguous
> arrays. 

This is what I did, and it is a LOT slower (about a factor of ten). See
the enclosed C code. There has got to be a way to optimize this!

>So if you want to do it properly, you'd have to check whether
> the array is contiguous, and if it is, use the fast way.

Since I had already written code for the contiguous case, it was easy to
special case it.

Pete Shinners wrote:
> it's easiest to use recursion than nested loops. then you can
> support any number of dimensions. that should point you in the
> right direction :]

When I read this, I thought: "well, duh!", but then I couldn't figure
out how to do it. Would there be any advantage over the above loops
anyway?

Is there either a more elgant or faster way to loop through all the
elements of these three arrays?

C) Where is the DL_EXPORT macro defined? I havn't been able to find it
with grep yet. Does it work on all NumPy supported platforms?

D) Is there an exising function or Macro for checking if two arrays are
the same shape? I wrote one, but I imagine it exists.

F) I needed to work with input that could be any Python number, or an
Array of the right size. What I did was call: PyArray_FromObject on the
min and max inputs, so that I would get a PyArray, then I could check if
it was a rank-o array, or the right size. I havn't been able to decide
if this is a nifty trcik, or a kludge. Any other suggestions?

E) Have I got reference counting right? I tried to Py_DECREF all my temp
arrays, but I'm not sure I did it right.

Sorry to enclose the code if you have no interest in it, but it's not
really that big.

-Chris

-- 
Christopher Barker,
Ph.D.                                                           
ChrisHBarker at home.net                 ---           ---           ---
http://members.home.net/barkerlohmann ---@@       -----@@       -----@@
                                   ------@@@     ------@@@     ------@@@
Oil Spill Modeling                ------   @    ------   @   ------   @
Water Resources Engineering       -------      ---------     --------    
Coastal and Fluvial Hydrodynamics --------------------------------------
------------------------------------------------------------------------
-------------- next part --------------
#include "Python.h"
#include "arrayobject.h"

// These are little macros I use to access array values for specific rank arrays
#define ARRAYVAL0(aType,a) ( *(aType *)(a->data))	
#define ARRAYVAL1(aType,a,i) ( *(aType *)(a->data + (i)*a->strides[0]))	
#define ARRAYVAL2(aType,a,i,j) ( *(aType *)(a->data + (i)*a->strides[0] + (j)*a->strides[1]))	
#define ARRAYVAL3(aType,a,i,j,k) ( *(aType *)(a->data + (i)*a->strides[0] + (j)*a->strides[1] + (k)*a->strides[2]))	

int AreSameShape(PyArrayObject *Array1, PyArrayObject *Array2){

  int i;

  if (Array1->nd != Array2->nd){
    return 0;
  }
  for (i = 0; i < Array1->nd; i++){
    if (Array1->dimensions[i] != Array2->dimensions[i]){
      return 0;
    }
  }
  return 1;
}

int ComputeOffset(PyArrayObject *Array, int *indexes){

  int i, offset = 0;

  for (i = 0; i < Array->nd; i++)
    {
      offset += indexes[i]*Array->strides[i];
    }
  return offset;
}

void ReplaceValue(PyArrayObject *Array, PyArrayObject *Min, PyArrayObject *Max, int *indexes){
  // This function does the actual replacement of the values.
  // it allows Min and Max to be one element, or the same shape as Array
  // the offsets are computed separately, so as long as they are the same shape
  // it should work even if some are contiguous, and some not, for example

  double min, max;

  int offset;

  offset = ComputeOffset(Array,indexes);

  if (PyArray_Size((PyObject *)Min) == 1){
    min = ARRAYVAL0(double,Min);
  }
  else {
    min = *(double *)(Min->data + ComputeOffset(Min,indexes));
  }
  if (PyArray_Size((PyObject *)Max) == 1){
    max = ARRAYVAL0(double,Max);
  }
  else {
    max = *(double *)(Max->data + ComputeOffset(Max,indexes));
  }
  if (*(double *)(Array->data + offset) > max){
    *(double *)(Array->data + offset) = max ;
  }
  else if (*(double *)(Array->data + offset) < min){
    *(double *)(Array->data + offset) = min;
  }
  return;
}

static char doc_fastclip[] =
"fastclip(a,min,max):  a is clipped so that there are no values \n"
"less than min, or greater than max.\n\n"
"It only works on PyArrays of type Float.\n"
"min and max can be either scalar or the same size as a.\n"
"If a, min, and max are all contiguous (or scalar) then it is a LOT faster\n";

static PyObject * NumericExtras_fastclip(PyObject *self, PyObject *args)
{

  int NumArray, elementsize, i;
  int *indexes;

  PyObject *InMin;
  PyObject *InMax;

  PyArrayObject *Array;
  PyArrayObject *Min;
  PyArrayObject *Max;

  if (!PyArg_ParseTuple(args, "O!OO", 
						&PyArray_Type, &Array,
					    &InMin,
						&InMax))
    {
      return NULL;
    }  

  // check types of input

  if (Array->descr->type_num != PyArray_DOUBLE){
	PyErr_SetString (PyExc_ValueError,
					 "a must be a NumPy array of type Float");
    return NULL;
  }

  // convert min and max to double arrays:
  // if they don't convert, the input values are not valid
  // also check if they are the right size
  Min = (PyArrayObject *) PyArray_FromObject(InMin, PyArray_DOUBLE, 0, 0);
  if (Min == NULL){
    PyErr_SetString (PyExc_ValueError,
					 "min must be an object that can be converted to an array of Floats");
    return NULL;
  }

  if (!((PyArray_Size((PyObject *)Min) == 1) ||  AreSameShape(Min, Array))){
    PyErr_SetString (PyExc_ValueError,
					 "min must be either a scalar or the same size as a");
    Py_DECREF(Min);
    return NULL;
  }

  Max = (PyArrayObject *) PyArray_FromObject(InMax, PyArray_DOUBLE, 0, 0);
  if (Max == NULL){
    PyErr_SetString (PyExc_ValueError,
					 "max  must be an object that can be converted to an array of Floats");
    Py_DECREF(Min);
    return NULL;
  }

  if (!((PyArray_Size((PyObject *)Max) == 1) ||  AreSameShape(Max, Array))){
    PyErr_SetString (PyExc_ValueError,
					 "max must be either a scalar or the same size as a");
    Py_DECREF(Min);
    Py_DECREF(Max);
    return NULL;
  }

  // After all that input checking, switch between the contiguous and non-contiguous cases.
  if (PyArray_ISCONTIGUOUS(Array) && PyArray_ISCONTIGUOUS(Min) && PyArray_ISCONTIGUOUS(Max)){ //we have the contiguous case

    // This seems pretty kludgy to have the four cases, but it was the first
    // thing that came to mind that I was sure would work, and would accomidate
    // either array or scalar arguments for min and max. 
    // it's also faster than checking if they are scalar inside the loop

    NumArray = PyArray_Size((PyObject *)Array);
    elementsize = sizeof(double);

    if (PyArray_Size((PyObject *)Min) == 1 && PyArray_Size((PyObject *)Max) == 1){ // both limits are scalar
      for (i = 0; i < NumArray; i++) { // loop over each element
        if (*(double *)(Array->data + i*elementsize) > ARRAYVAL0(double,Max)){
          *(double *)(Array->data + i*elementsize) = ARRAYVAL0(double,Max);
        }
        else if (*(double *)(Array->data + i*elementsize) < ARRAYVAL0(double,Min)){
          *(double *)(Array->data + i*elementsize) = ARRAYVAL0(double,Min);
        }
      }
    }
    else if (PyArray_Size((PyObject *)Min) == 1 && PyArray_Size((PyObject *)Max) == NumArray){ // Min is scalar
      for (i = 0; i < NumArray; i++) { // loop over each element
        if (*(double *)(Array->data + i*elementsize) > *(double *)(Max->data + i*elementsize)){
          *(double *)(Array->data + i*elementsize) = *(double *)(Max->data + i*elementsize) ;
        }
        else if (*(double *)(Array->data + i*elementsize) < ARRAYVAL0(double,Min)){
          *(double *)(Array->data + i*elementsize) = ARRAYVAL0(double,Min);
        }
      }
    }    
    else if (PyArray_Size((PyObject *)Max) == 1 && PyArray_Size((PyObject *)Min) == NumArray){ // Max is scalar
      for (i = 0; i < NumArray; i++) { // loop over each element
        if (*(double *)(Array->data + i*elementsize) > ARRAYVAL0(double,Max)){
          *(double *)(Array->data + i*elementsize) = ARRAYVAL0(double,Max);
        }
        else if (*(double *)(Array->data + i*elementsize) < *(double *)(Min->data + i*elementsize)){
          *(double *)(Array->data + i*elementsize) = *(double *)(Min->data + i*elementsize) ;
        }
      }
    }    
    else if (PyArray_Size((PyObject *)Min) == NumArray && PyArray_Size((PyObject *)Max) == NumArray){ // Neither is scalar
      for (i = 0; i < NumArray; i++) { // loop over each element
        if (*(double *)(Array->data + i*elementsize) > *(double *)(Max->data + i*elementsize)){
          *(double *)(Array->data + i*elementsize) = *(double *)(Max->data + i*elementsize) ;
        }
        else if (*(double *)(Array->data + i*elementsize) < *(double *)(Min->data + i*elementsize)){
          *(double *)(Array->data + i*elementsize) = *(double *)(Min->data + i*elementsize) ;
        }
      }    
    }
    else { // The size of Min and/or Max don't match Array: we should never get here, do to previous checks.
      PyErr_SetString (PyExc_ValueError,
                       "min and max must be either scalar or the same shape as a");
      Py_DECREF(Min);
      Py_DECREF(Max);
      return NULL;
    }

  }
  else { // this now the non-contiguous case: it is much slower, but presumably could be optimised : suggestions gratefully excepted!

    indexes = calloc(Array->nd,sizeof(int));

    while (1){

      ReplaceValue(Array, Min, Max, indexes);

      i = (Array->nd) - 1;
      while ((i >= 0) && (indexes[i]+1 == Array->dimensions[i])){
        indexes[i] = 0;
        i -= 1;
      }
      if (i<0) {
        break;
      }
      indexes[i] = indexes[i]+1;
    } 

    free(indexes);
  }    
  Py_DECREF(Min);
  Py_DECREF(Max);
  return Py_None;
}

static PyMethodDef NumericExtrasMethods[] = {
  {"fastclip", NumericExtras_fastclip, METH_VARARGS, doc_fastclip},
  {NULL, NULL} /* Sentinel */
};

DL_EXPORT (void) initNumericExtras(){

  (void) Py_InitModule("NumericExtras", NumericExtrasMethods);
  import_array()
}