[PYTHON MATRIX-SIG] Two weeks' experience

Konrad HINSEN hinsenk@ere.umontreal.ca
Sat, 24 Feb 1996 13:41:40 -0500


During the last two weeks I have been very busy with my real work, but
most of this has been done in Python, with lots of arrays.  I was
helping a colleague to locate a bug in his Fortran code that is very
important for both of us. Python has been incredibly useful for this
task, but I have also discovered a few serious problems, which I would
like to report here. My point is mostly to show how important
practical applications are to find weak points in the current design
and implementation.

My first and most frustrating experience was that it is very difficult
to write code that works for both arrays and scalars (or rank-0
arrays). Sure, a one-line expression using operators and mathematical
formulas is no problem. But as soon as there are conditionals, loops,
etc., the easiest way is often to have two separate functions.  All
this is caused mostly by conditional expressions. To check whether all
elements of a are greater than the corresponding elements of b you can
write andLogical.reduce(ravel(a.greater(b))). But if a is a scalar,
this has to be andLogical.reduce(ravel(b.less(a))), and if both a and
b are scalars it must be a > b. I ended up writing a special
comparison function that uses one of these expressions depending on
the type of the arguments.

This shows that no array operations that make sense for rank-0 arrays
and scalars (i.e. most) should be implemented as a method of the array
type. I had reached this conclusion before on the basis of other
arguments, but this practical frustration was much more convincing!

Another problem that cost me two hours to solve was matrix multiplication.
I had a three-dimensional wave function evaluated on a grid, and therefore
stored in the array psi of rank 3. I wanted to apply the operator for kinetic
energy in one direction, which is a rank-2 array, t. My first attempt was
  t.matrixMultiply(psi),
and I expected this to do a dot product using the last axis of tx and the
first of psi. But no, all I got was an exception. So I had to write
this function myself, which I still considered a minor exercise. More
than an hour later, I had arrived at the following code:

  def dot(a1, a2):
      r1 = len(a1.shape)-1
      r2 = len(a2.shape)-1
      axes = [r1] + range(r1)
      a1 = a1.transpose(axes).copy()
      f = a1.reshape
      a1 = apply(f, a1.shape + r2*(1,))
      a2 = a2.copy()
      f = a2.reshape
      a2 = apply(f, a2.shape[0:1] + r1*(1,) + a2.shape[1:])
      return add.reduce(a1*a2)

If you think that this ought to be easier, I agree. The basic idea of
this function is to reshape both arrays to make the axis to be summed
over the first, and to add new axes of length one to make things
match. All the trouble is caused by two unpleasant features of
the method reshape():
1) It works only on contiguous arrays, which makes it necessary
   to copy the arguments first.
2) It wants the new shape in the form of several integer arguments,
   which causes the weird construction used above to obtain a
   rank that is calculated rather than a known constant.

The final operation I want to complain about is fromFunction().  I had
always thought that it calls the supplied function once for each
element, supplying integer indices as arguments. But no, it tries to
be smarter, by calling the function only once with suitably shaped
arrays as arguments. That is of course much more efficient, but it
limits the function to a straightforward sequence of ofuncs, in which
case there is hardly any need to use the fromFunction() constructor.
Most cases I can think of where this constructor makes sense involve
functions with conditional expressions, like my kinetic energy operator:

  def tij(i,j):
      if i == j:
	  return pi**2/3
      else:
	  d = i-j
	  return 2.*(-1.)**d/d**2

So instead of the straightforward and readable expression

  t = (hbar**2/(2.*mass*delta**2)) * fromFunction([ngrid,ngrid], tij)

I had to write the very Fortran-like code

  t = zeros(ngrid,ngrid)
  for i in range(ngrid):
      for j in range(ngrid):
	  t[i,j] = tij(i,j)
  t = (hbar**2/(2.*mass*delta**2)) * t

There were a few more minor problems, but I think I can stop now. The
direct consequences of the problems described here will appear in my
proposal for reorganising the array functions, but most of all I'd
like to urge all of you to use what we have for as many applications
you can think of. And we should seriously consider a public alpha
release to get more testers.

-------------------------------------------------------------------------------
Konrad Hinsen                     | E-Mail: hinsenk@ere.umontreal.ca
Departement de chimie             | Tel.: +1-514-343-6111 ext. 3953
Universite de Montreal            | Fax:  +1-514-343-7586
C.P. 6128, succ. Centre-Ville     | Deutsch/Esperanto/English/Nederlands/
Montreal (QC) H3C 3J7             | Francais (phase experimentale)
-------------------------------------------------------------------------------

=================
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
=================