About alternatives to Matlab

Carl Banks pavlovevidence at gmail.com
Sun Dec 3 10:58:52 EST 2006


Jon Harrop wrote:
> >> In particular, I think you are eagerly
> >> allocating arrays when, in a functional language, you could just as
> >> easily compose closures.
> >
> > You are completely wrong.
>
> I'll give an example. If you write the Python:
>
>   a[:] = b[:] + c[:] + d[:]
>
> I think that is equivalent to the ML:
>
>   fill a (map2 ( + ) (map2 ( + ) b c) d)
>
> which can be deforested in ML to avoid the creation of the intermediate
> result b[:] + c[:] by using a closure to add three values at once:
>
>   fill a (map3 (fun b c d -> b + c + d) b c d)
>
> which will be much faster because it doesn't generate an intermediate array.

Ah, but this wasn't about temporaries when you spoke of "eagerly
allocating arrays", was it?  But yes, you're right, in this example
temporary arrays are created; arrays that Ocaml would not need.  (I
don't exactly understand why you'd need a functional language to get
this optimization, though.)

You might have guessed that you can get rid of most temporary arrays,
at the expense of readability, with numpy.  For example, replacing

d1[:] = odd[:] - C1*even[:]

with

numpy.multiply(-C1,even,d1)
numpy.add(d1,odd,d1)

eliminates the intermediate.  No unnecessary array allocations; no
closures necessary.
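
To check that the rewrite really is equivalent, here's a quick sketch
using modern numpy (where a ufunc's third positional argument is the
output array; `out=d1` is the keyword spelling):

```python
import numpy

C1 = 1.7320508075688772
odd = numpy.array([1.0, 2.0, 3.0])
even = numpy.array([4.0, 5.0, 6.0])

# Infix version: builds a temporary for C1*even, and another
# for the difference.
expected = odd - C1 * even

# Output-argument version: both steps write straight into d1.
d1 = numpy.empty(3)
numpy.multiply(-C1, even, d1)   # d1 = -C1 * even
numpy.add(d1, odd, d1)          # d1 = d1 + odd, in place

assert numpy.allclose(d1, expected)
```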


I'm curious whether this shared slicing isn't, in some ways,
advantageous over the Ocaml way.  I presume that in Ocaml, the way
you'd "share" array data is to create a closure that applies some
operation to selected elements of the array, returning the result.  But
could this just as easily modify the array in place?  I presume a good
compiler would be able to inline the function and optimize the call
into an in-place operation, but I'm not sure how well this works in
practice.
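
For what it's worth, numpy's slices really are shared views, so the
in-place ufunc calls above mutate the original array through the view;
a minimal illustration:

```python
import numpy

x = numpy.arange(8.0)
odd = x[1::2]                    # a view: no data is copied
numpy.multiply(odd, 10.0, odd)   # scale the view in place

# The parent array sees the update through the shared storage.
assert list(x) == [0.0, 10.0, 2.0, 30.0, 4.0, 50.0, 6.0, 70.0]
```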

Here's a version of D4_Transform that uses no temporary arrays (aside
from the work arrays s1, d1, and d2, which are allocated only once).
It's about 40% faster for me than the one with infix operations.  I'd
be curious how it compares to a correct Ocaml version.  (I'd still
expect Ocaml to be at least twice as fast.)

import numpy

def alt_D4_Transform(x, s1=None, d1=None, d2=None):
    add = numpy.add
    multiply = numpy.multiply
    C1 = 1.7320508075688772
    C2 = 0.4330127018922193
    C3 = -0.066987298107780702
    C4 = 0.51763809020504137
    C5 = 1.9318516525781364
    if d1 is None:
        d1 = numpy.zeros(x.size//2, float)
        s1 = numpy.zeros(x.size//2, float)
        d2 = numpy.zeros(x.size//2, float)
    odd = x[1::2]
    even = x[:-1:2]
    multiply(-C1, even, d1)
    add(d1, odd, d1)
    multiply(C2, d1, s1)
    add(s1, even, s1)
    d2[0] = C3 * d1[-1]
    multiply(C3, d1[:-1], d2[1:])
    add(s1, d2, s1)
    d2[0] = d1[0] + s1[-1]
    add(d1[1:], s1[:-1], d2[1:])
    multiply(C4, s1, even)
    multiply(C5, d2, odd)
    if x.size > 2:
        alt_D4_Transform(even, s1[0:even.size//2],
                         d1[0:even.size//2], d2[0:even.size//2])


> > It seems to
> > me a big help is the ability to fold multiple array operations into a
> > single loop, which is optimization a dynamically-typed language like
> > Python can't easily make.  (It'd require a really smart JIT compiler
> > or some concessions in dynamicity.)
>
> Writing a JIT to compile this kind of stuff is easy.

Eh, getting a JIT to do the optimization I spoke of in Python is not
easy.  It would be relatively easy in a statically typed language.  In
Python, it'd be tantamount to rewriting a dynamically reconfigurable
version of numpy--you'd have to duplicate numpy's complex
type-dispatching rules.
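
A small taste of those dispatching rules (the promotions below are
numpy's array-to-array rules; a JIT would have to reproduce all of
them, plus the scalar cases):

```python
import numpy

i8 = numpy.array([1, 2], dtype=numpy.int8)
u8 = numpy.array([1, 2], dtype=numpy.uint8)
f32 = numpy.array([1.0, 2.0], dtype=numpy.float32)

# Same-kind operands keep their type...
assert (i8 + i8).dtype == numpy.int8
# ...mixed signedness widens to a type that can hold both...
assert (i8 + u8).dtype == numpy.int16
# ...and mixed int/float promotes to the float type.
assert (i8 + f32).dtype == numpy.float32
```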

> My point is that this
> is fundamentally bad code,

Whoa, there.  I realize that for people who prefer functional
programming, with their recursively reductionist way of thinking, it
might seem as if leaving any sort of reducibility in the final result
is "fundamentally bad", but that's a pretty strong thing to say.

> so why bother trying to write a Python JIT? Why
> not just write in a better language for this task? Optimising within a
> fundamentally slow language seems silly to me.

Because, for most people, language choice is not a greedy maximizing of
a single issue.  Nobody who uses numpy is under the impression that it
can match a statically-typed language that is compiled to machine code
in peak performance.  (Matlab is fair game, of course.)  But speed is
not the only important thing.

I use Python because it's an excellently designed language that fits my
overall needs, and numpy because sometimes I need more speed than
vanilla Python.  Only when speed is critical (for example, 3D collision
detection) do I write an extension in a "better language for the task"
(i.e. C).

This is something that's quite popular in the numerically-intensive
computing community, BTW.  Many people use Python to handle boring
stuff like file I/O, memory management, and non-critical numerical
calculations, and write C or Fortran extensions to do the
numerically-intensive stuff.



Carl Banks



