Mathematica 7 compares to other languages

Sun Dec 7 17:53:49 EST 2008

For those interested in this Mathematica problem, i've now cleaned up
the essay with additional comments here:

• A Mathematica Optimization Problem
  http://xahlee.org/UnixResource_dir/writ/Mathematica_optimization.html

The result and speed up of my code can be verified by anyone who has
Mathematica.

Here's some additional notes i added to the above that is not
previously posted.

-------------------------

Advice For Mathematica Optimization

Here's some advice for mathematica optimization, roughly from most
important to less important:

    * Any experienced programer knows, that optimization at the
algorithm level is far more important than at the level of code
construction variation. So, make sure the algorithm used is good, as
opposed to doodling with your code forms. If you can optimize your
algorithm, the speed up may be a order of magnitude. (for example,
various algorithm for sorting algorithms↗ illustrates this.)

    * If you are doing numerical computation, always make sure that
your input and every intermediate step is using machine precision.
This you do by making the numbers in your input using decimal form
(e.g. use “1.”, “N[Pi]” instead of “1”, “Pi”). Otherwise Mathematica
may use exact arithmetics.

    * For numerical computation, do not simply slap “N[]” into your
code. Because the intermediate computation may still be done using
exact arithmetic or symbolic computation.

    * Make sure your core loop, where your calculation is repeated and
takes most of the time spent, is compiled, by using Compile.

    * When optimizing speed, try to avoid pattern matching. If your
function is “f[x_]:= ...”, try to change it to the form of “f=Function
[x,...]” instead.

    * Do not use complicated patterns if not necessary. For example,
use “f[x_,y_]” instead of “f[x_][y_]”.

------------------------------

...

Besides the above basic things, there are several aspects that his
code can improve in speed. For example, he used rather complicated
pattern matching to do intensive numerical computation part. Namely:

Intersect[o_, d_][{lambda_, n_}, Bound[c_, r_, s_]]
Intersect[o_, d_][{lambda_, n_}, Sphere[c_, r_]]

Note that the way the parameters of Intersect defined above is a
nested form. The code would be much faster if you just change the
forms to:

Intersect[o_, d_, {lambda_, n_}, Bound[c_, r_, s_]]
Intersect[o_, d_, {lambda_, n_}, Sphere[c_, r_]]

or even just this:

Intersect[o_, d_, lambda_, n_, c_, r_, s_]
Intersect[o_, d_, lambda_, n_, c_, r_]

Also, note that the Intersect is recursive. Namely, the Intersect
calls itself. Which form is invoked depends on the pattern matching of
the parameters. However, not only that, inside one of the Intersect it
uses Fold to nest itself. So, there are 2 recursive calls going on in
Intersect. Reducing this recursion to a simple one would speed up the
code possibly by a order of magnitude.

Further, if Intersect is made to take a flat sequence of argument as
in “Intersect[o_, d_, lambda_, n_, c_, r_, s_]”, then pattern matching
can be avoided by making it into a pure function “Function”. And when
it is a “Function”, then Intersect or part of it may be compiled with
Compile. When the code is compiled, the speed should be a order of
magnitude faster.

-----------------------------

Someone keeps claiming that Mathematica code is some “5 order of
magnitude slower”. It is funny how the order of magnitude is
quantified. I'm not sure there's a standard interpretation other than
hyperbole.

There's a famous quote by Alan Perlis ( http://en.wikipedia.org/wiki/Alan_Perlis
) that goes:
“A Lisp programmer knows the value of everything, but the cost of
nothing.”

this quote captures the nature of lisp in comparison to most other
langs at the time the quote is written. Lisp is a functional lang, and
in functional langs, the concept of values is critical, because any
lisp program is either a function definition or expression. Function
and expression act on values and return values. The values along with
definitions determines the program behavior. “the cost of nothing”
captures the sense that in high level langs, esp dynamic langs like
lisp, it's easy to do something, but it is more difficult to know the
algorithmic behavior of constructs. This is in contrast to langs like
C, Pascal, or modern lang like Java, where almost anything you write
in it is “fast”, simply forced by the low level nature of the lang.

In a similar way, Mathematica is far more higher level than any
existing lang, counting other so-called computer algebra systems. A
simple one-liner Mathematica construct easily equates to 10 or hundred
lines of lisp, perl, python, and if you count its hundreds of
mathematical functions such as Solve, Derivative, Integrate, each line
of code is equivalent to a few thousands lines in other langs.

However, there is a catch, that applies to any higher level langs,
namely, it is extremely easy, to create a program that are very
inefficient.

This can typically be observed in student or beginner's code in lisp.
The code may produce the right output, but may be extremely
inefficient for lacking expertise with the language.

The phenomenon of creating code that are inefficient is proportional
to the highlevelness or power of the lang. In general, the higher
level of the lang, the less possible it is actually to produce a code
that is as efficient as a lower level lang. For example, the level or
power of lang can be roughly order as this:

assembly langs
C, pascal
C++, java, c#
unix shells
perl, python, ruby, php
lisp
Mathematica

the lower level the lang, the longer it consumes programer's time, but
faster the code runs. Higher level langs may or may not be crafted to
be as efficient. For example, code written in the level of langs such
as perl, python, ruby, will never run as fast as C, regardless what
expert a perler is. C code will never run as fast as assembler langs.
And if the task crafting a raytracing software, then perl, python,
ruby, lisp, Mathematica, are simply not suitable, and are not likely
to produce any code as fast as C or Java.

On the other hand, higher level langs in many applications simply
cannot be done with lower level lang for various practical reasons.
For example, you can use Mathematica to solve some physics problem in
few hours, or give Pi to gazillion digits in few seconds with just “N
[Pi,10000000000000]”. Sure, you can code a solution in lisp, perl, or
even C, but that means few years of man hours. Similarly, you can do
text processing in C, Java, but perl, python, ruby, php, emacs lisp,
Mathematica, can reduce your man hours to 10% or 1% of coding effort.

In the above, i left out functional langs that are roughly statically
typed and compiled, such as Haskell, OCaml, etc. I do not have
experience with these langs. I suppose they do maitain some advantage
of low level lang's speed, yet has high level constructs. Thus, for
computationally intensive tasks such as writing a raytracer, they may
compete with C, Java in speed, yet easier to write with fewer lines of
code.

personally, i've made some effort to study Haskell but never went thru
it. In my experience, i find langs that are (roughly called) strongly
typed, difficult to learn and use. (i have reading knowledge of C and
working knowledge of Java, but am never good with Java. The verbosity
in Java turns me off thoroughly.)

-----------------

as to how fast Mathematica can be in the raytracing toy code shown in
this thread, i've given sufficient demonstration that it can be speed
up significantly. Even Mathematica is not suitable for this task, but
i'm pretty sure can make the code's speed in the some level of speed
as OCaml.
(as opposed to someone's claim that it must be some 700000 times
slower or some “5 orders of magnituted slower”). However, to do so
will take me half a day or a day of coding. Come fly $300 to my paypal
account, then we'll talk. Money back guaranteed, as i said before.

  Xah
∑ http://xahlee.org/

☄