[PYTHON MATRIX-SIG] Final matrix object renaming and packaging

James Hugunin jjh@Goldilocks.LCS.MIT.EDU
Tue, 16 Jan 96 09:50:00 EST


Hi all.  I just got back in town from an extended Christmas vacation
(I also got married, so I had a good excuse for being gone so long).
Now that I'm back in the office again, I've been thinking that it's
about time that the matrix object made it out into beta release.

There are a few naming conventions and packaging decisions to be made
before this happens.  Below are the conventions that I propose to use.
Once this object is released in beta form I have no intention of
changing the naming conventions without a REALLY good reason, so
please let me know your opinions now.

Every name in quotes should be considered a proposed name open for
discussion.

-----

The "NumericPython" package contains the following major pieces.

1) Konrad's numeric patches to the python core (incorporated into the
working version of python by Guido) will be required and included with
the distribution.  These will be the only patches to the python core
required.
  

2) One C module, "multiarraymodule.c", that must be statically linked

Having this particular module statically linked will eliminate the
need for getting the CObject proposal working before release.

Use "PyArray_" as the name of the Matrix Object.  This is a simple
renaming of the existing "PyMatrix_".

Use "array(sequence, typecode='d')" as the default
constructor for this new C type.
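
To make the proposed call signature concrete, here is a rough sketch
that mimics it on top of Python's standard-library array module.  This
is only an illustration: the wrapper function is hypothetical, it
handles flat sequences only, and the real constructor would live in
multiarraymodule.c.

```python
from array import array as _stdlib_array


def array(sequence, typecode='d'):
    """Hypothetical stand-in for the proposed constructor (1-d only)."""
    # The standard-library type takes the typecode first, so swap the order.
    return _stdlib_array(typecode, sequence)


doubles = array([1, 2, 3])        # typecode defaults to 'd' (C double)
floats = array([1, 2, 3], 'f')    # explicit typecode: C float
```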

This PyArray C type will not implement automatic type-coercion (unlike the
current implementation).  The reason for this is that I have decided
type-coercion is a major pain for arrays of floats (as opposed to
doubles) and I happen to use arrays of floats for most of my work.  If
somebody can give me a good suggestion for how to keep automatic
type-coercion and have Matrix_f(1,2,3)*1.2 == Matrix_f(1.2,2.4,3.6)
then I might consider changing this decision.  See later note on
Array.py for an alternative.
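
The trade-off can be sketched with the standard-library array module
standing in for the C type.  Both helper functions below are
hypothetical and exist only to contrast the two policies: keeping the
array's own element type versus promoting to double whenever a python
float is involved.

```python
from array import array


def scale_keep_type(values, scalar):
    """No coercion: the result keeps the typecode of the input array."""
    return array(values.typecode, [v * scalar for v in values])


def scale_with_coercion(values, scalar):
    """Automatic coercion: a python float (a C double) promotes 'f' to 'd'."""
    typecode = values.typecode
    if typecode == 'f' and isinstance(scalar, float):
        typecode = 'd'
    return array(typecode, [v * scalar for v in values])


floats = array('f', [1, 2, 3])
kept = scale_keep_type(floats, 1.2)          # still an array of C floats
promoted = scale_with_coercion(floats, 1.2)  # silently becomes doubles
```

With coercion, every multiplication by a python scalar turns an array
of floats into an array of doubles, which is the behavior described
above as a major pain.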

The include files "arrayobject.h" and "ofuncobject.h" will provide
the needed C interface to the array and optimized function objects.


3) Two dynamically linkable modules, "umathmodule.c" and "ieee_umathmodule.c"

These will both provide "universal" math support: the basic
functions of mathmodule, plus things like "greater" and
"booleanOr", for matrices, floats, complex, ints, and generic python
objects.

The basic "umath" will raise python exceptions in the event of an
overflow, divide-by-zero, etc. (this checking means it will be slow).
"ieee_umath" will not check its arguments or the results of its
computations.  This should result in standard IEEE overflow and
NaN values occurring in arrays, as well as unpredictable modulo effects
for integer overflows (this will be the fast version).
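
The difference in behavior (not in speed) can be sketched in pure
python.  Both function names are hypothetical, and division stands in
for any universal function.  Note that python itself raises on float
division by zero, so the IEEE results are emulated by hand here; in C
the unchecked version would simply be the bare operation.

```python
import math


def checked_divide(xs, ys):
    """Like the proposed "umath": test every element and raise on bad input."""
    result = []
    for x, y in zip(xs, ys):
        if y == 0:
            raise ZeroDivisionError("divide by zero in array operation")
        result.append(x / y)
    return result


def ieee_divide(xs, ys):
    """Like the proposed "ieee_umath": never raise, follow IEEE 754 instead."""
    result = []
    for x, y in zip(xs, ys):
        if y != 0:
            result.append(x / y)
        elif x == 0 or math.isnan(x):
            result.append(math.nan)      # 0/0 is NaN
        else:
            # nonzero/0 is an infinity whose sign combines both operands
            sign = math.copysign(1.0, x) * math.copysign(1.0, y)
            result.append(math.copysign(math.inf, sign))
    return result


unchecked = ieee_divide([1.0, 0.0, -1.0], [0.0, 0.0, 0.0])
```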


4) Two python objects, "Array.py" and "Matrix.py"

Array is essentially a python wrapper around the underlying C type.
It will also provide automatic type-coercion and will
generally assume that it is only working with arrays of type long,
double, and complex double (the three types of arrays that are
equivalent to python objects).  In my initial tests of this object on
LLNL's simple benchmark, I found that the performance was only 5%
slower than using the C object directly.

Matrix will inherit almost everything from Array.  However, it will be
limited to 2 or fewer dimensions, and m1*m2 where m1 and m2 are
matrices will perform matrix-style multiplication.  If the linear
algebra people would like, other changes can be made (e.g. ~m1 ==
m1.transpose(), ...).  Based on the experiments with Array, the
performance penalty for this approach should be minimal.
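
A minimal sketch of the inheritance idea follows.  The classes are
hypothetical, list-backed, 2-d-only stand-ins, not the proposed
implementation; they only show how Matrix could reuse Array while
changing what "*" means.

```python
class Array:
    """Hypothetical stand-in for the proposed Array class (2-d, list-backed)."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __eq__(self, other):
        return isinstance(other, Array) and self.rows == other.rows

    def __mul__(self, other):
        # Array semantics: multiply element by element.
        return type(self)([[a * b for a, b in zip(ra, rb)]
                           for ra, rb in zip(self.rows, other.rows)])

    def transpose(self):
        return type(self)(zip(*self.rows))


class Matrix(Array):
    """Inherits everything from Array except the meaning of "*"."""

    def __mul__(self, other):
        # Matrix semantics: row-by-column products.
        columns = list(zip(*other.rows))
        return Matrix([[sum(a * b for a, b in zip(row, col)) for col in columns]
                       for row in self.rows])

    def __invert__(self):
        # The optional "~m1 == m1.transpose()" shorthand.
        return self.transpose()


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
elementwise = Array(a) * Array(b)   # rows: [5, 12], [21, 32]
product = Matrix(a) * Matrix(b)     # rows: [19, 22], [43, 50]
```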

In order to support these python objects (and others like them), two
special data members will be added, "__array__", and "__object__".  If
an object has the member "__array__", then the C functions that handle
matrices will attempt to retrieve the matrix from this member when
passed in a python object.  In addition, they will attempt to convert
their result to an object of class "__object__" upon return.  This
means that umath.sin(Array([0, pi/2, pi])) == Array([0.,1.,0.]).
Hopefully, this convention will allow these python objects to coexist
well with any numeric libraries.
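
The unwrap/rewrap convention might look roughly like this.  The
decorator "universal" and the list-backed Array class are hypothetical
stand-ins for the C functions and the real Array.py; only the two
member names come from the proposal.

```python
import math


def universal(func):
    """Wrap a scalar function so it follows the proposed member convention."""
    def apply(arg):
        # Unwrap: use the "__array__" member when the argument has one.
        data = getattr(arg, "__array__", arg)
        result = [func(x) for x in data]
        # Rewrap: convert the result to the class named by "__object__".
        wrapper = getattr(arg, "__object__", None)
        return result if wrapper is None else wrapper(result)
    return apply


class Array:
    """Hypothetical python-level wrapper; a list stands in for the C array."""

    def __init__(self, values):
        self.__array__ = list(values)
        self.__object__ = type(self)


sin = universal(math.sin)
wrapped = sin(Array([0, math.pi / 2, math.pi]))   # comes back as an Array
plain = sin([0.0])                                # plain list in, plain list out
```

In floating point, sin(pi) comes out as a tiny nonzero number rather
than exactly 0, so the equality above holds only approximately.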


5) A standard library "Numeric.py" which will be the standard way of
importing multiarray, Array, Matrix, umath, etc.  It will include the
reciprocal and inverse trig functions ("sec", "csc", "arcsec", ...) as
well as most of the standard functions currently in Matrix.py.
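
As an illustration of the kind of helpers meant here, scalar-only
versions can be written on top of the standard math module.  This is a
sketch under that assumption; the versions in Numeric.py would
presumably be built on umath so that they also work on arrays.

```python
import math


def sec(x):
    """Secant: the reciprocal of cosine."""
    return 1.0 / math.cos(x)


def csc(x):
    """Cosecant: the reciprocal of sine."""
    return 1.0 / math.sin(x)


def arcsec(x):
    """Inverse secant: sec(y) == x means cos(y) == 1/x."""
    return math.acos(1.0 / x)
```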


6) Great documentation and tutorials (hopefully written by Paul
DuBois).


7) A standard test suite.


8) A new version of pickle.py which is aware of matrix objects.  This
will be suggested as a replacement for the existing pickle.py.  Matrix
objects are pickled using a binary format that is endian-aware, so
pickling matrix objects is a very efficient and portable way of both
storing them and sending them around the network.
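
One way an endian-aware binary format can work is sketched below using
the standard struct module.  The layout (a one-byte byte-order tag, a
4-byte count, then the raw doubles) and both function names are
assumptions made for illustration; this is not the actual pickle
format.

```python
import struct
import sys


def dump_doubles(values, byteorder=sys.byteorder):
    """Serialize floats with a one-byte tag recording the byte order used."""
    mark = "<" if byteorder == "little" else ">"
    header = mark.encode("ascii") + struct.pack(mark + "I", len(values))
    return header + struct.pack(mark + "%dd" % len(values), *values)


def load_doubles(data):
    """Read the tag first, then decode the payload in the writer's byte order."""
    mark = data[:1].decode("ascii")
    (count,) = struct.unpack_from(mark + "I", data, 1)
    return list(struct.unpack_from(mark + "%dd" % count, data, 5))


values = [1.0, -2.5, 3.25]
restored = load_doubles(dump_doubles(values))
```

The writer never has to swap bytes, since it writes in its native
order; only a reader on a machine with the opposite byte order pays
for the conversion.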


9?) A "numericmodule.c" which contains reasonably efficient fft,
matrix inversion, convolution, random numbers, eigenvalues and
filtering functions stolen from existing C code so that the package
can be viewed as a "complete" numerical computing system?


Well, that's all I can think of for now.  Let me know your opinions
before I start my final burst of coding and get this thing polished up
and released.

Planned schedule:

Comments on this proposal until 1/22
Final alpha release 0.30 1/26
Massive use of release 0.30, and lots of good bug reports from users
Bugfixed alpha release 0.31 2/2
More testing, and hopefully final forms of documentation and tutorials
First beta release 1.0beta1 announced to general newsgroup 2/12

-Jim




=================
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
=================