About alternatives to Matlab

Tue Dec 5 10:35:37 EST 2006

I doubt that anyone would dispute that even as boosted by Numpy/Scipy,
Python will almost certainly be notably slower than moderately
well-written code in a compiled language.  The reason Numpy exists,
however, is not to deliver the best possible speed, but to deliver
enough speed to make it possible to solve large numerical problems with
the powerful and flexible Python language.  As observed by Hans
Latangen in _Python Scripting for Computational Science_, 2nd ed.,
Springer 2005, scientific computing is more than number crunching:

"Very often programming is about shuffling data in and out of different
tools, converting one data format to another, extracting numerical data
from a text, and administering numerical experiments involving a large
number of data files and directories.  Such tasks are much faster to
accomplish in a language like Python than in Fortran, C, C++ or Java."

He might well have mentioned the importance of developing nice-looking
reports once the analysis is complete, and that development is simpler
and more flexible in Python than a compiled language.

In principle, I agree that heavy-duty number-crunching, at least if it
has to be repeated again and again, should be accomplished by means of
a compiled language.  However, if someone has to solve many different
problems just one time, or just a few times (for example if you are an
engineering consultant), there is an excellent argument for using
Python + Numpy.  Unless it affects feasibility, I opine, computational
speed is important primarily in context of regular production, e.g.,
computing the daily "value at risk" in a commodity trading portfolio,
or making daily weather predictions.

I am aware of the power and flexibility of the OCaml language, and
particularly that an OCaml user can easily switch back and forth
between interpreted and compiled implementation.  I'm attacted to OCaml
and, indeed, I'm in the process of reading Smith's (unfortunately not
very well-written) _Practical OCaml_.  However, I also understand that
OCaml supports only double-precision implementation of real numbers;
that its implementation of arrays is a little clunky compared to
Fortran 95 or Numpy (and I suspect not as fast as Fortran's); and that
the available libraries, while powerful, are by no means as
comprehensive as those available for Python.  For example, I am unaware
of the existance of an HDF5 interface for OCaml.

In summary, I think that OCaml is an excellent language, but I think
that the question of whether to use it in preference to Python+Numpy
for general-purpose numerical analysis must rest on much more than its
computational speed.

Jon Harrop wrote:
> Filip Wasilewski wrote:
> > Besides of that this code is irrelevant to the original one and your
> > further conclusions may not be perfectly correct. Please learn first
> > about the topic of your benchmark and different variants of wavelet
> > transform, namely difference between lifting scheme and dwt, and try
> > posting some relevant examples or use other topic for your benchmarks.
>
> Your lifting implementation is slightly longer and slower than the
> non-lifting one but its characteristics are identical. For n=2^25 I get:
>
> 1.88s C++ (816 non-whitespace bytes)
> 2.00s OCaml (741 b)
> 2.33s F# (741 b)
> 9.83s Your Python (1,002 b)
>
> The original python was 789 bytes.
>
> > Neglecting this issues, I wonder if it is equally easy to juggle
> > arbitrary multidimensional arrays in a statically typed language and do
> > operations on selected axis directly from the interactive interpreter
> > like in the NumPy example from my other post -
> > http://groups.google.com/group/comp.lang.python/msg/a71a5bf00a88ad57 ?
> > I don't know much OCaml so it would be great if you could elaborate on
> > this.
>
> It is probably just as easy. Instead of dynamic typing you have parametric
> polymorphism. If you want to make your function generic over arithmetic
> type then you can pass in the arithmetic operators.
>
> >> 0.56s C++ (direct arrays)
> >> 0.61s F# (direct arrays)
> >> 0.62s OCaml (direct arrays)
> >> 1.38s OCaml (slices)
> >> 2.38s Python (slices)
> >> 10s Mathematica 5.1
> >>
> >> Note that all implementations are safe (e.g. C++ uses a.at(i) instead of
> >> a[i]).
> >
> > So why not use C++ instead of all others if the speed really matters?
> > What is your point here?
>
> 1. Benchmarks should not just include two inappropriate languages.
> 2. Speed aside, the other languages are more concise.
>
> > Could you please share full benchmark code, so we could make
> > conclusions too?
>
> I'll paste the whole programs at the end...
>
> > I would like to get to know about your benchmark
> > methodology. I wonder if you are allocating the input data really
> > dynamically or whether it's size is a priori knowledge for the
> > compiler.
>
> The knowledge is there for the compiler to use but I don't believe any of
> them exploit it.
>
> >> In this specific context (discrete wavelet transform benchmark), I'd have
> >> said that speed was the most important thing after correctness.
> >
> > Not necessarily, if you are doing research with different kinds of
> > wavelets and need a general, flexible and easily adaptable tool or just
> > the computation speed is not considered a bottleneck.
>
> Brevity is probably next.
>
> > Language speed is a great advantage, but if it always was the most
> > important, people would talk directly to the CPUs or knit DSP chips in
> > theirs backyards whenever they need to solve a calculation problem.
>
> Sure.
>
> > Please don't make this discussion heading towards `what is the fastest
> > way of doing d4 transform and why OCaml rules` instead of `what is the
> > optimal way of doing things under given circumstances and still have a
> > free afternoon ;-)`.
>
> Comments like that seem to be based on the fundamental misconception that
> writing C++ like this is hard.
>
> > Different tasks need different, well-balanced
> > measures. Neither Python nor OCalm nor any other language is a panacea
> > for every single problem.
>
> Absolutely. OCaml is as good as the next (compiled) language in this case.
> Python and Matlab seem to be particularly bad at this task.
>
> Here's my complete OCaml:
>
> let a = sqrt 3. and b = 4. *. sqrt 2.
> let h0, h1, h2, h3 =
>   (1. +. a) /. b, (3. +. a) /. b, (3. -. a) /. b, (1. -. a) /. b
> let g0, g1, g2, g3 = h3, -.h2, h1, -.h0
>
> let rec d4_aux tmp a n =
>   let n2 = n lsr 1 in
>   for i=0 to n2-2 do
>     tmp.(i) <- a.(i*2)*.h0+.a.(i*2+1)*.h1+.a.(i*2+2)*.h2+.a.(i*2+3)*.h3;
>     tmp.(i+n2) <- a.(i*2)*.g0+.a.(i*2+1)*.g1+.a.(i*2+2)*.g2+.a.(i*2+3)*.g3;
>   done;
>   tmp.(n2-1) <- a.(n-2)*.h0 +. a.(n-1)*.h1 +. a.(0)*.h2 +. a.(1)*.h3;
>   tmp.(2*n2-1) <- a.(n-2)*.g0 +. a.(n-1)*.g1 +. a.(0)*.g2 +. a.(1)*.g3;
>   Array.blit tmp 0 a 0 n;
>   if n > 4 then d4_aux tmp a (n lsr 1)
>
> let d4 a =
>   let tmp = Array.make (Array.length a) 0. in
>   d4_aux tmp a (Array.length a)
>
> let () =
>   print_endline "Generating test data...";
>   let x = Array.init (1 lsl 25) (fun _ -> Random.float 1.) in
>   print_endline "Benchmarking...";
>   let t1 = Sys.time() in
>   ignore(d4 x);
>   Printf.printf "Elapsed time is %.6f seconds\n" (Sys.time() -. t1)
>
> and C++:
>
> #include <iostream>
> #include <vector>
> #include <ctime>
> #include <cmath>
>
> using namespace std;
>
> double a = sqrt(3), b = 4 * sqrt(2);
> double h0 = (1 + a) / b, h1 = (3 + a) / b, h2 = (3 - a) / b, h3 = (1 - a) /
> b;
> double g0 = h3, g1 = -h2, g2 = h1, g3 = -h0;
>
> void d4(vector<double> &a) {
>   vector<double> tmp(a.size());
>   for (int n = a.size(); n >= 4; n >>= 1) {
>     int half = n >> 1;
>
>     for (int i=0; i<n-2; i+=2) {
>       tmp.at(i/2) = a.at(i)*h0 + a.at(i+1)*h1 + a.at(i+2)*h2 + a.at(i+3)*h3;
>       tmp.at(i/2+half) = a.at(i)*h3 - a.at(i+1)*h2 + a.at(i+2)*h1 -
> a.at(i+3)*h0
> ;
>     }
>
>     tmp.at(half-1) = a.at(n-2)*h0 + a.at(n-1)*h1 + a.at(0)*h2 + a.at(1)*h3;
>     tmp.at(2*half-1) = a.at(n-2)*h3 - a.at(n-1)*h2 + a.at(0)*h1 -
> a.at(1)*h0;
>
>     copy(tmp.begin(), tmp.begin() + n, a.begin());
>   }
> }
>
> int main() {
>   cout << "Generating data...\n";
>   vector<double> a(1 << 25);
>   for (int i=0; i<a.size(); ++i) a.at(i) = rand();
>   cout << "Benchmarking...\n";
>   double t1 = clock();
>   d4(a);
>   cout << "Elapsed time " << (clock() - t1) / CLOCKS_PER_SEC << "s\n";
> }
>
> --
> Dr Jon D Harrop, Flying Frog Consultancy
> Objective CAML for Scientists
> http://www.ffconsultancy.com/products/ocaml_for_scientists/index.html?usenet