[Matrix-SIG] repeated transposes

Michael Haggerty mhagger@blizzard.harvard.edu
Mon, 10 Aug 1998 20:02:49 -0400


Hi,

There is a feature of the Numeric module that is somewhat
inconvenient, and can adversely affect performance.  Namely, that
Numeric doesn't realize that a matrix can always be transposed without
copying data, and that a matrix that has been transposed twice is back
in `contiguous' storage format.

>>> M = array([[1,2],[3,4]])
>>> Numeric.transpose(Numeric.transpose(M)).iscontiguous()
0

Why is this case important?  Often Numeric is used to interface to
large C or Fortran matrices.  But Fortran stores matrices column-wise,
whereas Numeric stores them (by default) row-wise.  No problem:
whenever you receive a matrix from Fortran, you transpose() it; and
before you pass any matrix back to Fortran, you transpose() it again.

But since Numeric isn't fastidious about keeping track of when a
matrix is stored contiguously, the double transpose causes the matrix
to be copied two extra times (once on entry to the second transpose
operation, which copies it into the *wrong* order; a second time when
putting it back into Fortran order.

But there is no need for transpose ever to copy data--transposes can
always be done by adjusting dimensions and strides--and in fact, the
current algorithm is general enough to work with non-contiguous
matrices.  The only trick is to get it to realize when it has just put
a matrix *back* into contiguous order and set the CONTIGUOUS bit back
on.

The following patch to multiarraymodule.c does that.  With this patch,

>>> m = array([[1,2,3],[4,5,6]])
>>> m
array([[1, 2, 3],
       [4, 5, 6]])
>>> n = transpose(m)
>>> n.iscontiguous()
0
>>> o = transpose(n)
>>> o
array([[1, 2, 3],
       [4, 5, 6]])
>>> o.iscontiguous()
1
>>> o[1,0] = 0
>>> o
array([[1, 2, 3],
       [0, 5, 6]])
>>> n
array([[1, 0],
       [2, 5],
       [3, 6]])
>>> m
array([[1, 2, 3],
       [0, 5, 6]])

So you can see that now all three arrays refer to the same memory, and
that o is recognized to be contiguous.

There might be other Numeric algorithms that can be implemented in a
way that they can deal with noncontiguous data.  This way, the copying
of large matrices can be avoided as much as possible.  The function
provided in the patch below, `array_really_contiguous()', might be
useful for that purpose.

See patch below...

Yours,
Michael

--
Michael Haggerty
mhagger@blizzard.harvard.edu

========================= Patch ==============================

*** ../../../LLNLPython4/Numerical/Src/multiarraymodule.c	Fri Jun 19 11:27:18 1998
--- multiarraymodule.c	Mon Aug 10 19:10:04 1998
***************
*** 115,120 ****
--- 115,136 ----
  	return NULL;
  }
  
+ /* Check whether the given array is stored contiguously (row-wise) in
+    memory. */
+ static int array_really_contiguous(PyArrayObject *ap) {
+ 	int sd;
+ 	int i;
+ 
+ 	sd = ap->descr->elsize;
+ 	for (i = ap->nd-1; i >= 0; --i) {
+ 		if (ap->dimensions[i] == 0) return 1; /* contiguous by definition */
+ 		if (ap->strides[i] != sd) return 0;
+ 		sd *= ap->dimensions[i];
+ 	}
+ 	return 1;
+ }
+ 
+ /* Changed to be able to deal with non-contiguous arrays. */
  extern PyObject *PyArray_Transpose(PyArrayObject *ap, PyObject *op) {
  	long *axes, axis;
  	int i, n;
***************
*** 135,145 ****
--- 151,164 ----
  		permutation[i] = axis;
  	}
  	
+ 	/* this allocates memory for dimensions and strides (but fills them
+ 	   incorrectly), sets up descr, and points data at ap->data. */
  	ret = (PyArrayObject *)PyArray_FromDimsAndData(n, permutation,
  		ap->descr->type_num,
  		ap->data);
  	if (ret == NULL) goto fail;
  	
+ 	/* point at true owner of memory: */
  	ret->base = (PyObject *)ap;
  	Py_INCREF(ap);
  	
***************
*** 147,154 ****
  		ret->dimensions[i] = ap->dimensions[permutation[i]];
  		ret->strides[i] = ap->strides[permutation[i]];
  	}
! 	ret->flags &= ~CONTIGUOUS;
! 	
  	PyArray_Free(op, (char *)axes);
  	free(permutation);
  	return (PyObject *)ret;
--- 166,174 ----
  		ret->dimensions[i] = ap->dimensions[permutation[i]];
  		ret->strides[i] = ap->strides[permutation[i]];
  	}
! 	if (array_really_contiguous(ret)) ret->flags |= CONTIGUOUS;
! 	else ret->flags &= ~CONTIGUOUS;
! 
  	PyArray_Free(op, (char *)axes);
  	free(permutation);
  	return (PyObject *)ret;
***************
*** 969,977 ****
  	PyArrayObject *a;
  	
  	if (PyArg_ParseTuple(args, "OO", &a0, &shape) == NULL) return NULL;
! 	if ((a = (PyArrayObject *)PyArray_ContiguousFromObject(a0,
! 							 PyArray_NOTYPE,0,0))
! 							 ==NULL) return NULL;
  	
  	ret = PyArray_Transpose(a, shape);
  	Py_DECREF(a);
--- 989,997 ----
  	PyArrayObject *a;
  	
  	if (PyArg_ParseTuple(args, "OO", &a0, &shape) == NULL) return NULL;
! 	if ((a = (PyArrayObject *)PyArray_FromObject(a0,
! 												 PyArray_NOTYPE,0,0))
! 		==NULL) return NULL;
  	
  	ret = PyArray_Transpose(a, shape);
  	Py_DECREF(a);