[PYTHON MATRIX-SIG] Final matrix object renaming and packaging

Jim Fulton, U.S. Geological Survey jfulton@usgs.gov
Tue, 16 Jan 1996 13:42:37 -0500


On Jan 16, 12:39pm, James Hugunin wrote:
> Subject: Re: [PYTHON MATRIX-SIG] Final matrix object renaming and packagin
>
>    From: "Jim Fulton, U.S. Geological Survey" <jfulton@usgs.gov>
>
>    > 2) One C module that must be statically linked called
>    > "multiarraymodule.c"
>
>    By statically linked, I assume you mean statically linked into the
>    interpreter.  Right?
>
>    > Having this particular module statically linked will eliminate the
>    > need for getting the CObject proposal working before release.
>
>    As Guido said, the CObject proposal is working.  I'll send it to you in a
>    separate note.
>
>    I finished the (very small) CObject implementation very soon after the
>    workshop because I feel it is important to use it for the Matrix (Numeric)
>    module.
>
>    I feel strongly that the Matrix module should export its C interface
>    using CObjects so that modules using matrices do not require that any
>    of the matrix software be statically linked.  In my distribution of
>    Python, I statically link as little as possible to keep the
>    interpreter and the interpreter start-up time small.  I
>    plan to modify FIDL to use the CObject-exported interface.
>
>    I'd be happy to assist you with this if you wish.
>
> I guess we run on faster networks at MIT; I never bother with dynamic
> linking, and find that 4MB binaries launch as fast as I wish.

Just out of curiosity, how fast is that?  What do you get from:

  time python -c ''

On some of our slower systems, this takes about a second, even with
almost all modules dynamically loaded.  This is much slower than I'd
like it to be, since Python is often used here for very small scripts
in which this startup time is a significant part of the overall
execution time.  Starting up another version of the interpreter with
Tk linked in takes nearly twice as long.
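
For readers who want to repeat this measurement from a script rather
than from the shell, here is a minimal sketch.  It assumes a modern
Python 3 interpreter (subprocess.run and time.perf_counter did not
exist when this message was written), and the helper name
startup_seconds is made up for illustration.

```python
import subprocess
import sys
import time


def startup_seconds(runs=5):
    """Return the best wall-clock time, in seconds, needed to start the
    current interpreter, run an empty program, and exit."""
    best = None
    for _ in range(runs):
        start = time.perf_counter()
        # Equivalent of the shell command: python -c ''
        subprocess.run([sys.executable, "-c", ""], check=True)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    print("best startup time: %.3f s" % startup_seconds())
```

Taking the best of several runs reduces the effect of disk caching on
the first launch; the figure also includes the cost of spawning the
child process, so it is an upper bound on interpreter start-up alone.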


> I'd be
> more than happy to have somebody else implement the CObject interface,
> but I just don't see that the time it would take (including getting myself
> up to speed on the vagaries of dynamic linking) is worthwhile.  If you
> (Jim Fulton) want to try to add this interface, I'd be happy to help.

I'll take a crack at this when you release 0.3.

>    >
>    > Use "PyArray_" as the name of the Matrix Object.  This is a simple
>    > renaming of the existing "PyMatrix_".
>    >
>    > Use "array(sequence, typecode='d')" as the default
>    > constructor for this new C type.
>
>    Is this a replacement for the existing array type?
>
> I decided it was not worth the huge set of compromises that would have
> been necessary to make the matrix/array object truly compatible with
> the existing array object.  Still, the right name for this object
> really is array.
>
> There's no problem with using PyArray as the C name for the object
> because the existing array object does not export an interface.  Also,
> there's no problem with using the name "array" as a constructor
> because avoiding these sorts of naming conflicts is why python has a
> module system in the first place.  No existing code that imports
> "arraymodule" will be broken, but hopefully people in the future will
> start using the new multiarraymodule for the same tasks.

But existing imports of array on some systems may be broken.  Even though the
array module is stored in arraymodule._, it is imported with "import array".

>    > 4) Two python objects, "Array.py" and "Matrix.py"
>
>    Are these imported by Numeric, or would the user be importing
>    these?
>
> I plan to have everything in the basic distribution imported into a
> flat name space under the Numeric module.  However, there's nothing to
> stop people from importing these independently if they wish.
>
>    Is the current built-in array module going away?  If not, then there
>    is a name conflict on case-insensitive file systems.
>
> The array module is not going away.  I'm confused about the problem of
> case-insensitive file systems, what do they do with tkintermodule and
> TkInter?

Nothing, since Tkinter doesn't work on these systems yet. :-)

> If this is in fact an issue, then "Array.py" can be changed
> to "UserArray.py" in the spirit of UserList, etc.

I think it should.

>    I like the idea
>    of having the user import Numeric and then use the
>    Numeric.Matrix_d(...) or Numeric.Array_f(...) rather than
>    importing Matrix and Array.  Is there any reason for the user to
>    import Array directly?  I am strongly opposed to using "Array" unless
>    the "array" module goes away.
>
> I agree with all this (except the last line, which I'm willing to
> concede after you explain my tkinter question).

Great. ;-)

>    > In order to support these python objects (and others like them), two
>    > special data members will be added, "__array__", and "__object__".  If
>    > an object has the member "__array__", then the C functions that handle
>    > matrices will attempt to retrieve the matrix from this member when
>    > passed in a python object.
>
>    Are we talking about python members or C structure members?  Is the
>    __array__ member supposed to be the C pointer to a block of memory?
>
> The __array__ member is a member of a python object which is expected
> to contain a python object of type array (the type created in C that
> this whole thing is based on).
>
>    > In addition, they will attempt to convert
>    > their result to an object of class "__object__" upon return.
>
>    Class __object__?  So __object__ is a pointer to a Python class
>    object?
>
> This is still a python member.  In python what it would do is call
> m.__object__(new_array).  I assume that a similar thing can be done in C
> (I haven't implemented this in C yet).

And new_array is one of the new built-in array objects?

>    > This
>    > means that umath.sin(Array([0, pi/2, pi])) == Array([0.,1.,0.]).
>
>    OK.  This makes sense
>
> Remember that Array is a python object here, that's the trick I'm
> trying to make work out.

So all of this is really about being able to derive Python classes from
built-in types?  That is, you want an Array (which is an instance of a Python
class) to store its data in an array (which is an object of type PyArrayType),
and you want functions that you pass an Array to to get at its array.  Have I
got this right?

(BTW, you should export the actual type objects.)
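
The input side of this convention can be sketched in Python.  This is
only an illustration under stated assumptions: the standard library's
array.array stands in for the new built-in array type, and the names
Array and as_array are hypothetical, not part of any released module.

```python
from array import array as builtin_array  # stand-in for the new C array type


class Array:
    """Hypothetical Python class that keeps its data in a built-in array
    stored under the "__array__" member."""

    def __init__(self, data, typecode="d"):
        if isinstance(data, builtin_array):
            self.__array__ = data
        else:
            self.__array__ = builtin_array(typecode, data)


def as_array(obj, typecode="d"):
    """Return the built-in array behind obj.

    A built-in array is returned unchanged, an object with an "__array__"
    member yields that member, and any other sequence is converted.
    """
    if isinstance(obj, builtin_array):
        return obj
    inner = getattr(obj, "__array__", None)
    if isinstance(inner, builtin_array):
        return inner
    return builtin_array(typecode, obj)
```

A function written against as_array accepts an Array instance, a
built-in array, or a plain list without caring which one it was given.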

>    > Hopefully, this convention will allow these python objects to coexist
>    > well with any numeric libraries.
>
>    Could you provide some additional details?
>
> Here's a bit of code for a unary function expecting a single PyArray
> argument of type "double" of two dimensions:
>
> 	PyObject *op;
> 	PyArrayObject *ap, *rp;
>
> 	TRY(PyArg_ParseTuple(args, "O", &op));
> 	TRY(ap = PyArray_ContiguousFromObject(op, PyArray_DOUBLE, 2, 2));
>
> 	// Do something with ap to get rp
>
> 	Py_DECREF(ap);
>
> 	return PyArray_Return(rp, op);
>
> With the exception of the second argument to PyArray_Return, this is
> the current way of writing such a chunk of code.
>
> PyArray_ContiguousFromObject will convert any python sequence type to
> an array of the appropriate type and dimensions if possible.  If the
> argument is already an array of the appropriate type and dimensions,
> then that array will be increfed and returned (unless its data points
> to a discontiguous chunk of memory in which case it will be copied
> into a new array with contiguous memory).
>
> The new feature that I want to add to this function is that if its
> argument is a python object with the attribute "__array__", then this
> function will return the PyArrayObject contained in that attribute (if
> this is indeed the case).

OK.  If my statement above is right, then I understand this.

> PyArray_Return is used because some operations wind up producing a
> 0-dimensional array.  These will be converted to the appropriate
> python scalars on return.
>
> The new feature that I want to add here is that if the second argument
> has a "__class__" attribute, then the constructor for that class will

You mean __object__? (I like __class__, or maybe even
__return_constructor__ better.)

> be used to return a new python object with the returned PyArrayObject
> in its "__array__" attribute.

So PyArray_Return checks to see if op has a callable __object__ member and if
it does, returns the result of calling this member with rp as an argument.
Right?
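
Expressed in Python, the return rule as restated here might look like
the sketch below.  It is an assumption-laden illustration, not the C
implementation: array.array stands in for the built-in array type, the
conversion of 0-dimensional results to scalars is left out, and the
names array_return, negate and Array are invented for the example.

```python
from array import array as builtin_array  # stand-in for the new C array type


class Array:
    """Hypothetical wrapper: data in "__array__", constructor in "__object__"."""

    def __init__(self, data):
        self.__array__ = builtin_array("d", data)

    def __object__(self, new_array):
        # Build a new wrapper of the same kind around a result array.
        return Array(new_array)


def array_return(rp, op):
    """If op has a callable "__object__" member, return the result of
    calling it with rp; otherwise return rp unchanged."""
    constructor = getattr(op, "__object__", None)
    if callable(constructor):
        return constructor(rp)
    return rp


def negate(op):
    """Example unary function: unwrap the argument, compute, re-wrap."""
    ap = getattr(op, "__array__", op)
    rp = builtin_array("d", [-x for x in ap])
    return array_return(rp, op)
```

With this rule negate(Array([1.0, 2.0])) comes back as an Array, while
negate([1.0, 2.0]) comes back as a plain built-in array.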

> This is the simplest method I could come up with to get my
> "sin(Array())" example to work.

Whew.  I need to think about this.  I'm not faulting your approach, but it
feels a bit complicated.

What if you had a function with multiple arguments and you wanted the
returned object to have the same type as the arguments?  For example,
what if you wanted

  spam(some_Array, some_other_Array) to return an Array and
  spam(some_Matrix, some_other_Matrix) to return a Matrix?

Would you use the first argument's __object__ or the second's?
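
To make the question concrete, here is a sketch of one possible policy:
use the first argument that carries a callable __object__.  Nothing in
this thread settles on that rule; it is shown only so the ambiguity can
be seen in running code.  The names pick_constructor, spam, Array and
Matrix are hypothetical, and array.array stands in for the built-in
array type.

```python
from array import array as builtin_array  # stand-in for the new C array type


class _Wrapper:
    """Hypothetical base class: data in "__array__", constructor in "__object__"."""

    def __init__(self, data):
        self.__array__ = builtin_array("d", data)

    @classmethod
    def __object__(cls, new_array):
        return cls(new_array)


class Array(_Wrapper):
    pass


class Matrix(_Wrapper):
    pass


def pick_constructor(*args):
    """Assumed policy: the first argument with a callable "__object__" wins."""
    for arg in args:
        constructor = getattr(arg, "__object__", None)
        if callable(constructor):
            return constructor
    return None


def spam(x, y):
    """Element-wise sum whose result type follows pick_constructor."""
    xa = getattr(x, "__array__", x)
    ya = getattr(y, "__array__", y)
    rp = builtin_array("d", [a + b for a, b in zip(xa, ya)])
    constructor = pick_constructor(x, y)
    return constructor(rp) if constructor else rp
```

Under this policy spam(Array, Array) gives an Array and
spam(Matrix, Matrix) gives a Matrix, but spam(Array, Matrix) and
spam(Matrix, Array) give different types, which is exactly the case
that needs a decision.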

I'll probably have more to say about this after I take some time to mull it
over.

>
>    >
>    > 6) Great documentation and tutorials (hopefully written by Paul
>    > DuBois).
>
>    Way cool.  Will we also get doc strings?
>
> doc strings are already done (probably could use some polishing, but...

Great!

Jim


=================
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
=================