[PYTHON MATRIX-SIG] Let's get going

Jim Fulton, U.S. Geological Survey jfulton@usgs.gov
Tue, 12 Sep 1995 15:21:49 -0400


On Tue, 12 Sep 1995 00:46:50 -0700 
Graham Hughes said:
> At 07:04 PM 9/11/95 -0400, you wrote:
> >   o I wanted a python built-in data type containing a data structure
> >     that I could pass directly to Fortran or C routines.  I did not
> >     want to have to do alot of data conversion or copying on each
> >     call.  For example, if a Fortran routine returns a matrix, then
> >     it should be possible to pass this matrix to another Fortran
> >     routine without reallocating a data structure or converting the
> >     entire matrix. 
> 
> Instant problem right there. To the best of my knowledge (and I've read this
> in several books), Fortran and C use completely different ways of storing
> multidimensional arrays. To wit: given the array m, like this:
> 
>         3       4       5       6
> 
>         7       8       9       0
> 
>         3       -2      4       1
> 
> There are two ways of storing this array in a continuous memory location; by
> rows or by columns. By rows looks like this: '3 4 5 6 7 8 9 0 3 -2 4 1', and
> is used by both C and Pascal. By columns looks like this: '3 7 3 4 8 -2 5 9
> 4 6 0 1', and is used by Fortran. In other words, to pass to either Fortran
> or C functions you're going to need to know which type of function you have,
> and then convert the internal representation.
> 
> This can be done, but it greatly complicates things.

Not really.  Fortran stores data in multidimensional arrays with the
indexes changing most rapidly in the left.  Some people give this the
interpretation that Fortran stores data "by column".

C multi-dimensional arrays are stored with the indexes changing most
rapidly on the right.  Some people give this the interpretation that C
stores data "by row".  

The difference is mainly one of interpretation.  Both C and Fortran
store mult-dimensional data pretty much the same way, as a big
homogenous "rectangulish" glob of data.  They differ only in the way
they order the indexes to this data, and this ordering is only
important for a certain set of interpretations.

I choose to interpret C and Python multidimensional indexes,
especially in the context of matrices, in a way in which this
difference in indexing does not present a problem. I think of
n-dimensional matrices as sequences of n-1-dimensional matrices. That
is, matrices are stored, "by sub-matrix".  Now, in the case of 2-d
matrices, I was taught to think of a matrix as a collection of column
vectors.  With this interpretation, C and python 2-d matrices are,
indeed, stored by column.

Consider m[i][j].  Since 2-d matrices are a collection of column
vectors, m[i] is the ith column.  That is, we give the column index
*first*, and then the row index.  With this interpretation, there is
no difference between C and Fortran storage formats.

(snip)

> 
> >> **Instantiation**
> >> 
> >> The following function will be used to instantiate a matrix:
> >> Matrix(datasource(optional), d1, d2, ..., typecode='d')
> >> 
> >> The typecodes are as follows:
> >> c - char (non-numeric)
> >> 1 - unsigned char
> >> b - signed char
> >> h - short
> >> i - int
> >> l - long
> >> f - float
> >> d - double
> >> F - complex float
> >> D - complex double
> 
> I'm not fond of this, as it breaks normal Python typing; let the matrix
> figure out what's there. Matter of fact, might be a good idea to make the
> matrix a general container like lists are now, so you can store, say, a
> tuple representing a complex number in there? As it stands, with these
> restrictions you can forget it. However, this makes it slightly more work
> for those wonderful Fortran library routines.

But the data must be stored in a homogenous format that can be passed
to C and Fotran libraries. Also, libraries often require specific
types, so you want control over the types used.

I suppose that the routines could be made smart enough to sniff at the
first element and check for numeric, complex (objects with re and im
attributes), or string data and pick the default type accordingly.

Note that the routines are already smart enough to convert arbitrary
compatible objects to the necessary format.

(snip)
 
> >> concat(l) - l must be a list of matrices, and each matrix in l must have  
> >> the same number of dimensions, as well as the same size for every  
> >> dimension except for the first.  This will return a new matrix M whose  
> >> first dimension is the sum of the first dimensions of all the matrices in  
> >> the list, and whose data is filled from the data in all of these matrices.
> >> 
> >> ie. concat([A,B]) -> [1,2,3,11,12,13]
> 
> Hm. Seems to run into difficulty with >1D matricies. Ideally, concat should take
> 
>         2       3       4               5       6       7
>                               and
>         5       6       7               8       9       10
> 
> and return
> 
>         2       3       4       5       6       7
> 
>         5       6       7       8       9       10
> 
> But the given behaviour of concat depends greatly on whether [[x],[y]] is
> stored by rows or by columns, ie. is it like this:
> 
>         [       |       ,       |       ]
>                 V               V
>                [x]             [y]
> 
> or like this:
> 
>         [
>         ->      [x]
>         ,
>         ->      [y]
>         ]
> 
> This is a conceptual difficulty.

No, it is not a problem as long as you use the interpretation that
n-dimensional matrices are sequences of n-1-dimensional matrices.
 
Jim

=================
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
=================