[Matrix-SIG] repeated transposes
Michael Haggerty
mhagger@blizzard.harvard.edu
Mon, 10 Aug 1998 20:02:49 -0400
Hi,
There is a feature of the Numeric module that is somewhat
inconvenient, and can adversely affect performance. Namely, that
Numeric doesn't realize that a matrix can always be transposed without
copying data, and that a matrix that has been transposed twice is back
in `contiguous' storage format.
>>> M = array([[1,2],[3,4]])
>>> Numeric.transpose(Numeric.transpose(M)).iscontiguous()
0
Why is this case important? Often Numeric is used to interface to
large C or Fortran matrices. But Fortran stores matrices column-wise,
whereas Numeric stores them (by default) row-wise. No problem:
whenever you receive a matrix from Fortran, you transpose() it; and
before you pass any matrix back to Fortran, you transpose() it again.
But since Numeric isn't fastidious about keeping track of when a
matrix is stored contiguously, the double transpose causes the matrix
to be copied two extra times (once on entry to the second transpose
operation, which copies it into the *wrong* order; a second time when
putting it back into Fortran order.
But there is no need for transpose ever to copy data--transposes can
always be done by adjusting dimensions and strides--and in fact, the
current algorithm is general enough to work with non-contiguous
matrices. The only trick is to get it to realize when it has just put
a matrix *back* into contiguous order and set the CONTIGUOUS bit back
on.
The following patch to multiarraymodule.c does that. With this patch,
>>> m = array([[1,2,3],[4,5,6]])
>>> m
array([[1, 2, 3],
[4, 5, 6]])
>>> n = transpose(m)
>>> n.iscontiguous()
0
>>> o = transpose(n)
>>> o
array([[1, 2, 3],
[4, 5, 6]])
>>> o.iscontiguous()
1
>>> o[1,0] = 0
>>> o
array([[1, 2, 3],
[0, 5, 6]])
>>> n
array([[1, 0],
[2, 5],
[3, 6]])
>>> m
array([[1, 2, 3],
[0, 5, 6]])
So you can see that now all three arrays refer to the same memory, and
that o is recognized to be contiguous.
There might be other Numeric algorithms that can be implemented in a
way that they can deal with noncontiguous data. This way, the copying
of large matrices can be avoided as much as possible. The function
provided in the patch below, `array_really_contiguous()', might be
useful for that purpose.
See patch below...
Yours,
Michael
--
Michael Haggerty
mhagger@blizzard.harvard.edu
========================= Patch ==============================
*** ../../../LLNLPython4/Numerical/Src/multiarraymodule.c Fri Jun 19 11:27:18 1998
--- multiarraymodule.c Mon Aug 10 19:10:04 1998
***************
*** 115,120 ****
--- 115,136 ----
return NULL;
}
+ /* Check whether the given array is stored contiguously (row-wise) in
+ memory. */
+ static int array_really_contiguous(PyArrayObject *ap) {
+ int sd;
+ int i;
+
+ sd = ap->descr->elsize;
+ for (i = ap->nd-1; i >= 0; --i) {
+ if (ap->dimensions[i] == 0) return 1; /* contiguous by definition */
+ if (ap->strides[i] != sd) return 0;
+ sd *= ap->dimensions[i];
+ }
+ return 1;
+ }
+
+ /* Changed to be able to deal with non-contiguous arrays. */
extern PyObject *PyArray_Transpose(PyArrayObject *ap, PyObject *op) {
long *axes, axis;
int i, n;
***************
*** 135,145 ****
--- 151,164 ----
permutation[i] = axis;
}
+ /* this allocates memory for dimensions and strides (but fills them
+ incorrectly), sets up descr, and points data at ap->data. */
ret = (PyArrayObject *)PyArray_FromDimsAndData(n, permutation,
ap->descr->type_num,
ap->data);
if (ret == NULL) goto fail;
+ /* point at true owner of memory: */
ret->base = (PyObject *)ap;
Py_INCREF(ap);
***************
*** 147,154 ****
ret->dimensions[i] = ap->dimensions[permutation[i]];
ret->strides[i] = ap->strides[permutation[i]];
}
! ret->flags &= ~CONTIGUOUS;
!
PyArray_Free(op, (char *)axes);
free(permutation);
return (PyObject *)ret;
--- 166,174 ----
ret->dimensions[i] = ap->dimensions[permutation[i]];
ret->strides[i] = ap->strides[permutation[i]];
}
! if (array_really_contiguous(ret)) ret->flags |= CONTIGUOUS;
! else ret->flags &= ~CONTIGUOUS;
!
PyArray_Free(op, (char *)axes);
free(permutation);
return (PyObject *)ret;
***************
*** 969,977 ****
PyArrayObject *a;
if (PyArg_ParseTuple(args, "OO", &a0, &shape) == NULL) return NULL;
! if ((a = (PyArrayObject *)PyArray_ContiguousFromObject(a0,
! PyArray_NOTYPE,0,0))
! ==NULL) return NULL;
ret = PyArray_Transpose(a, shape);
Py_DECREF(a);
--- 989,997 ----
PyArrayObject *a;
if (PyArg_ParseTuple(args, "OO", &a0, &shape) == NULL) return NULL;
! if ((a = (PyArrayObject *)PyArray_FromObject(a0,
! PyArray_NOTYPE,0,0))
! ==NULL) return NULL;
ret = PyArray_Transpose(a, shape);
Py_DECREF(a);