[PYTHON MATRIX-SIG] Release 0.20 of matrix object is available

Chris Chase S1A chris.chase@jhuapl.edu
Fri, 8 Dec 1995 15:59:02 -0500


James> The NEWS for this release follows.  The two major incompatible
James> changes at the very beginning are why this release took so long
James> (that and the fact that I have a few other demands on my time).
James> Please let me know what you think of these changes!

James> As before, don't hesitate to send me your gripes, your bugs,
James> and your praise (if ever merited).  However, I'll be gone for
James> most of next week at the python workshop, so I might be slow in
James> replying.

James> -Jim

James> Major incompatible changes (I expect that these might elicit some
James> interesting discussion):

James> 1) Matrix indexing now returns by reference to the original
James>    data, not by value.  As a side effect of this change,
James>    arbitrary sequences are not allowed in multidimensional
James>    indices, but only single indices, slices, RubberIndex and
James>    None.

I personally prefer general product indexing that allows
multidimensional indexes.  These could be added to the reference
version of indexing, but they would carry the extra baggage of retaining
a copy of the index vector.  That would not be very efficient, but it is
not a reason to eliminate them, since it would not affect the efficiency
of the other types of indexes.

Are references better than copying for indexing? 

I can see that a speed increase is natural when using only references
rather than creating a copy.  But are references always efficient?  If
a multidimensional slice reference is passed as an argument to another
routine that accesses the object multiple times then it may end up
being less efficient.  To avoid this would require that these routines
make contiguous copies of these types of arguments (whether or not the
copy is needed since the reference nature of the object is transparent
to the user).  Similarly a copy to contiguous storage would
need to be done before passing a reference to an external FORTRAN or C
routine.

On the other hand, I can see that references can save memory for very
large matrix objects.  Additionally, a copy can be performed at user
discretion.

I would think that assigning through a reference, which changes the
originally indexed object, may surprise users accustomed to the
implementations of other matrix languages like Matlab and IDL.

I am not an expert on this.  What are the other pros and cons about
references versus copies being returned by indexing?
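
To make the distinction concrete, here is a minimal sketch using only
standard Python objects, not the matrix module itself.  List slicing stands
in for "index by value" and a memoryview slice over an array stands in for
"index by reference"; the variable names are purely illustrative.

```python
from array import array

# Copy semantics: slicing a list builds a new, independent list.
data = [0.0, 1.0, 2.0, 3.0]
copied = data[1:3]
copied[0] = 99.0            # does not touch `data`

# Reference semantics: slicing a memoryview shares the underlying buffer.
storage = array("d", [0.0, 1.0, 2.0, 3.0])
view = memoryview(storage)[1:3]
view[0] = 99.0              # writes through to storage[1]

# A copy can still be taken at the user's discretion.
snapshot = view.tolist()
view[1] = -1.0              # changes storage[2], but not `snapshot`
```

The write through `view` is exactly the kind of side effect that someone
coming from a copy-on-index language might not expect.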

James> Note: there is a new method, take, which will allow you to
James> index a matrix with an arbitrary sequence as before.

This then requires a sequence of take() plus [] indexing to obtain
general product indexing.

How does one do assignment using a multi-dimensional index vector?
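
For clarity, this is what I mean by general product indexing and by
assignment through it, sketched on plain nested lists.  The helpers
product_take and product_assign are hypothetical names of my own; they are
not the take method of the matrix object, whose exact behavior I have not
checked.

```python
def product_take(m, rows, cols):
    """Return a copy of the elements m[r][c] for every r in rows, c in cols."""
    return [[m[r][c] for c in cols] for r in rows]


def product_assign(m, rows, cols, values):
    """Store values[i][j] into m[rows[i]][cols[j]], modifying m in place."""
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            m[r][c] = values[i][j]


# A 4 x 4 example whose entries encode their own position (10*row + col).
m = [[10 * r + c for c in range(4)] for r in range(4)]

sub = product_take(m, [0, 2], [1, 3])                     # rows 0,2 x cols 1,3
product_assign(m, [0, 2], [1, 3], [[-1, -2], [-3, -4]])   # write them back
```

Here the result of product_take is necessarily a copy, which is why the
assignment case needs its own operation.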

James> This change was motivated both by the possible speed increases
James> (about 40% for some code) and more importantly by issues of
James> clarity of expression.  I couldn't come up with any other way
James> to make the following hold:

James> m[0][1] is an efficient way to index a multi-dimensional array.
James> m[0,1] == m[0][1]
James> m[0,] == m[0]

It will be difficult to maintain a natural connection between
hierarchical indexing using [][] and multi-dimensional indexing.  For
example:

m[0:4,1] != m[0:4][1]
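
The mismatch can be shown with plain nested lists, assuming m[0:4,1] is
read as "column 1 of rows 0 through 3".  This is only an illustration of
the two readings, not output from the matrix module.

```python
# A 5 x 3 example whose entries encode their own position (10*row + col).
m = [[10 * r + c for c in range(3)] for r in range(5)]

# Hierarchical m[0:4][1]: slice the rows first, then index the result,
# which selects row 1 of the slice.
hierarchical = m[0:4][1]

# Multi-dimensional m[0:4, 1]: element 1 of each of rows 0..3,
# which selects part of column 1.
multidim = [row[1] for row in m[0:4]]
```

The first gives a row of length 3 and the second a column of length 4, so
the two expressions cannot agree in general.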

James> 2) No more ranks of operators, instead Yorick style pseudo
James>    indices are used.  As a side effect of this, outer product
James>    is now a convenience function rather than a method on
James>    ofuncs.

James> For functions of unbounded rank (currently the only kind of
James> function supported by my omathmodule) Yorick pseudo indices
James> support a superset of what was possible to express using ranks
James> and outer products.  There is no fundamental reason not to have
James> both pseudo indices and ranks, I just think that it's
James> conceptually cleaner not to mix the two up.

Function ranks offer more generality than what can be done with pseudo
indices.  I would prefer that the matrix module keep function ranks
rather than limit the full generality of the object, although I do
not know what it takes to add them cleanly.  I have not had time to look
deeply into your code other than browsing matrixobject.c (the
current state of the code made for rather dense reading).
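
For anyone unfamiliar with pseudo indices, here is my understanding of how
they express an outer product, sketched on nested lists.  A pseudo index
adds a length-1 axis, and an elementwise operation then stretches any
length-1 axis to match the other operand.  All three helper names are
hypothetical and the sketch does no shape checking.

```python
def add_axis_last(v):
    """Turn a length-n vector into an n x 1 nested list."""
    return [[x] for x in v]


def add_axis_first(v):
    """Turn a length-m vector into a 1 x m nested list."""
    return [list(v)]


def broadcast_multiply(a, b):
    """Elementwise product of 2-D nested lists, stretching length-1 axes."""
    rows = max(len(a), len(b))
    cols = max(len(a[0]), len(b[0]))

    def pick(x, i, j):
        # Reuse index 0 along any axis of length 1.
        return x[i if len(x) > 1 else 0][j if len(x[0]) > 1 else 0]

    return [[pick(a, i, j) * pick(b, i, j) for j in range(cols)]
            for i in range(rows)]


# (3 x 1) times (1 x 2) stretches to a 3 x 2 outer product.
outer = broadcast_multiply(add_axis_last([1, 2, 3]), add_axis_first([10, 20]))
```

What this cannot express directly is applying a function to each rank-k
cell of an argument, which is where I think function ranks are more general.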

James> 1) This new release is about 40% faster than 0.11 (on a
James> benchmark code I was given by the folks at LBL).

What was the benchmark?

>> 1) Names
>> 
>> This may seem to be a minor issue, but we should tackle this before
>> we all get used to the current ones.

James> I agree completely!

>> I strongly dislike the type-specific constructors (also used for
>> output). They should be IntegerMatrix, FloatMatrix, ComplexMatrix,
>> CharacterMatrix, and GeneralMatrix. Anyone is free to define shorter
>> names for efficient typing, if desired.

James> Does a FloatMatrix contain C floats or doubles (both are
James> possible in matrices)?  Please define a complete naming scheme
James> that can be compared to the current (admittedly cryptic)
James> typecodes.

Perhaps a naming scheme similar to FORTRAN's INTEGER*8, REAL*4, REAL*8.
The advantage of this is that the precision is known.  With ANSI C the
precision of floats and doubles is not specified.  ANSI C only
enforces a minimum precision on an implementation.
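
As a rough sketch of the idea, size-qualified names could be derived from
whatever the local C compiler provides.  The function sized_name is a
hypothetical helper of mine and the "KIND*bytes" format is only one possible
spelling; the array and struct modules are used here just to query sizes.

```python
import struct
from array import array


def sized_name(kind, typecode):
    """Build a FORTRAN-style name such as 'REAL*8' from the native item size."""
    return "%s*%d" % (kind, array(typecode).itemsize)


# Native sizes depend on the platform's C compiler.
native_float = sized_name("REAL", "f")
native_double = sized_name("REAL", "d")
native_long = sized_name("INTEGER", "l")

# struct's "standard" sizes (selected by the '<' prefix) are fixed instead:
# 4 bytes for 'f' and 8 bytes for 'd' on every platform.
standard_float_size = struct.calcsize("<f")
standard_double_size = struct.calcsize("<d")
```

With names built this way, the precision of a matrix is visible in its type
name rather than implied by the platform.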

>> I even propose a more radical renaming. Many people associate "matrix"
>> with the 2D-matrices from linear algebra. So it would be better to
>> call our general objects "arrays", and leave the name "matrix" for
>> linear-algebra type objects that are restricted to rank 2 and use *
>> for matrix multiplication.  They could be implemented in Python based
>> on arrays.

I would prefer the term "array", with a specialized "matrix" module
providing linear algebra matrix functions and a matrix multiplication
operator that take 2D arrays as arguments.  But it was pointed out
that we cannot use "array" as it is already a different module in
Python.  I don't suppose the module produced by this SIG could
completely replace the current array module (without breaking its old
behavior)?

James> Note: I picked RubberIndex in the hopes of choosing something
James> sufficiently long and hard to type that nobody would assume I
James> meant it to be the final solution.  I had hoped that this might
James> be turned into the syntax ".." one day, but after the python
James> workshop I doubt that this will happen any time soon.

What happened at the workshop to rule out the ".." or "*" syntax for
rubber indexes?

Chris

=================
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
=================