[SciPy-user] python (against java) advocacy for scientific projects

Tue Jan 20 09:27:14 EST 2009

On Monday 19 January 2009 16:11:34 Sturla Molden wrote:
> I retain that Java is not fit for scientific computing. There are no
> complex number primitive, no flexible array primitive, and no operator
> overloading.
[snip]
> The same for complex numbers.

I quite agree. I don't believe that Java is suited for matrix-based computing.
However, a JIT is important for scientific computing that is not primarily
matrix-based: discrete mathematics and other combinatorial problems are good
examples. I am aware of Psyco, but it does not work on x86_64, and none of the
clusters I work with have 32-bit Python installed any longer; this is pretty
typical of large companies' R&D departments (like mine or, likely, the OP's).

> And because it is statically typed (not
> duck-typed like Python and Matlab), you end up with ugly C++ like
> templates for generic functions.

This is both an advantage and a disadvantage. Bill Baxter pointed out the 
disadvantages on the list.
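
For readers unfamiliar with what a statically typed generic function looks
like, here is a minimal C++ sketch. The function name dot is made up for
illustration and is not taken from any library; for complex numbers it does
not conjugate either argument. The advantage is that a type mismatch is
rejected at compile time; the disadvantage is the extra template syntax.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical generic dot product: works for any element type T that
// supports value-initialization, operator* and operator+=.
template <class T>
T dot(const std::vector<T>& a, const std::vector<T>& b) {
    assert(a.size() == b.size());  // both inputs must have the same length
    T acc{};                       // zero for arithmetic and complex types
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}
```

The same source serves std::vector<int>, std::vector<double> and
std::vector<std::complex<double>>, while mixing element types in one call
fails to compile instead of failing at run time.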

> C++ is used for scientific computing, particularly by younger
> scientists. But it remains that the majority of hard-core computational
> scientists prefer Fortran over C++ when native compilation is required.

Really? See www.cern.ch, wci.llnl.gov, etc. for hard-core computational
scientists who prefer C++ over Fortran, many of whom have been around for a
few decades. You could complain that they have been using C++ for only 5-10
years, but then C++98 is only 10 years old, and reasonably conforming C++
compilers are only 5 or so years old. If you complain that C++ is such a
complex language that it took 5 years for the majority of compilers to get it
right, then I'd point to Fortran 95 and ask how long it took freely available
compilers to become reasonably conformant. All such language changes take a
while to get implemented.

> I guess C++ templates is fine if you like bloatware. And C++ template
> metaprogramming is fantastic if you want to write unmaintainable code.

Really? Try writing just a fixed-point radix-8 FFT that handles complex input
vectors up to, say, 64K in length with flexible rounding/clipping strategies
in Fortran/Python. I bet one could not write a version that is even half as
maintainable, or that reaches half the performance, of the C++ version. Or,
for that matter, try writing something like Macaulay2 (or any nontrivial
group-theoretic algorithm) in Fortran/Python.
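
To make "flexible rounding/clipping strategies" concrete, here is a small
sketch of a single Q15 fixed-point multiply whose rounding and clipping
behaviour are chosen at compile time through template parameters. It is not
an FFT and not taken from any existing library; the names Truncate,
RoundHalfUp, Saturate, Wrap and mul_q15 are all made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Rounding policies: drop `shift` low-order bits from a wide intermediate.
struct Truncate {
    static std::int64_t apply(std::int64_t v, int shift) {
        assert(shift >= 0);
        // Arithmetic right shift rounds toward minus infinity (guaranteed
        // since C++20; mainstream compilers behave this way earlier too).
        return v >> shift;
    }
};

struct RoundHalfUp {
    static std::int64_t apply(std::int64_t v, int shift) {
        assert(shift > 0);
        // Add half an LSB first, so ties round toward plus infinity.
        return (v + (std::int64_t{1} << (shift - 1))) >> shift;
    }
};

// Clipping policies: bring the rounded value back into 16 bits.
struct Saturate {
    static std::int16_t apply(std::int64_t v) {
        const std::int64_t hi = std::numeric_limits<std::int16_t>::max();
        const std::int64_t lo = std::numeric_limits<std::int16_t>::min();
        if (v > hi) return static_cast<std::int16_t>(hi);
        if (v < lo) return static_cast<std::int16_t>(lo);
        return static_cast<std::int16_t>(v);
    }
};

struct Wrap {
    static std::int16_t apply(std::int64_t v) {
        // Keep the low 16 bits (two's-complement wraparound; the final
        // conversion is guaranteed modular since C++20).
        return static_cast<std::int16_t>(static_cast<std::uint16_t>(v));
    }
};

// Q15 multiply: the Q30 product is reduced back to Q15 by the chosen policies.
template <class Rounding, class Clipping>
std::int16_t mul_q15(std::int16_t a, std::int16_t b) {
    const std::int64_t wide = static_cast<std::int64_t>(a) * b;
    return Clipping::apply(Rounding::apply(wide, 15));
}
```

A butterfly written once against Rounding and Clipping parameters could then
be instantiated as, say, mul_q15<RoundHalfUp, Saturate> or
mul_q15<Truncate, Wrap> without run-time dispatch in the inner loop.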

Code maintainability comes from using clearly defined idioms. 5 or 10 years
ago, no such idioms had been developed (apart from the STL) for template
metaprogramming. The story is now different; check out boost.fusion,
nt2.sourceforge.net or the Eigen library. Similar idioms/patterns are still
being developed for Python generators (or the cool stuff from Twisted). As
with any tool (like C++ or linear algebra), you have to learn how to use it.

> Hey it's even proven to be a Turing complete 'language'. But why go
> through all of that pain just to match the performance of good old
> Fortran? I known an easier way ... just write Fortran instead.

First, Fortran, as I pointed out above, is generally worthless for a lot of 
computation-intensive problems that don't map to its native data types. 

Second, Fortran is not magic; it simply uses optimized libraries underneath,
and the speed of Fortran-compiled code depends on those libraries. You can
beat those libraries from C++ (because template metaprogramming can be used
to provide more information to the compiler); e.g., see
  http://eigen.tuxfamily.org/index.php?title=Benchmark
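
One way templates pass extra information to the compiler is the expression
template technique, which Eigen relies on. The toy sketch below is not Eigen
code; Vec and Sum are made-up names, and only vector addition is handled. The
type of a + b + c encodes the whole expression, so the assignment evaluates it
in a single loop with no temporary vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expression node: the element-wise sum of two operands, computed lazily.
template <class L, class R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Hypothetical minimal vector type.
struct Vec {
    std::vector<double> data;

    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}

    double operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning an expression walks it once, element by element.
    template <class E>
    Vec& operator=(const E& expr) {
        assert(expr.size() == size());
        for (std::size_t i = 0; i < size(); ++i) {
            data[i] = expr[i];
        }
        return *this;
    }
};

// Vec + Vec builds an expression node instead of a new vector.
inline Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    return Sum<Vec, Vec>{a, b};
}

// (expression) + Vec nests the node, so a + b + c is still one expression.
template <class L, class R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R>& a, const Vec& b) {
    assert(a.size() == b.size());
    return Sum<Sum<L, R>, Vec>{a, b};
}
```

With Vec a, b, c and r of equal length, r = a + b + c; runs one fused loop.
The nodes hold references, so an expression must be consumed within the
statement that builds it rather than stored for later (for example with
auto).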

Third, computation speed on COTS processors now depends more on cache & memory
access optimization than anything else, which compilers can do with C/C++ just
as well as with Fortran; the days of Fortran being the golden benchmark are
long over. C/C++ (among others) have caught up. Note that virtually all major
compiler vendors (including Microsoft, Intel, SGI & GCC) use the same code
generation back-end for Fortran/C/C++, with the only difference being the
amount of information that can be passed through the front-end; in this case,
C++/C# can actually provide more information to the back-end (for use in
optimization) because of the availability of compile-time scriptability.

Fourth, C++ can be easier to write than Fortran. You could object that writing 
such a C++ library is difficult, but the point is that Eigen or MTL needs to 
be written only once (just as you would write only once the Fortran compiler 
where this knowledge is embedded for Fortran).

Fifth, try getting a decent Fortran compiler for homegrown embedded systems.

Personally, I had a very difficult time switching from Fortran to C++, but
with the benefit of hindsight, I realize that my initial resistance stemmed
mostly from NIH syndrome and from familiarity with Fortran. At this point, I
haven't found an easier tool than the combination of Python/C++/Qt/CMake.

> To compare Python with Matlab for scientific computing, here it at least
> some points to consider:

I completely agree here; I am betting heavily at my current company on
switching successfully from Matlab to Python. I was merely pointing out the
differences for the OP, who works at a big company where the cost of Matlab is
not likely to be an issue.

Regards,
Ravi
