New to Python: Features

Thu Oct 28 18:44:06 EDT 2004

<beliavsky at aol.com> wrote:

> aleaxit at yahoo.com (Alex Martelli) wrote in message
> news:<1gl6fnk.m1zq7z1fm0nhbN%aleaxit at yahoo.com>...
>  
> > > No, no, I want the C speed.
> > 
> > Then use C: no other portable language (including C++ when properly
> > used) quite matches C's speed across the board.
> 
> I doubt this assertion -- Fortran is portable and is easier to
> optimize than C, in part because pointers are used less.

For numeric computing, yes.  "Across the board", no.  I've done almost
as much Fortran as C in my life... but while it was fully appropriate
back when I was in IBM Research doing (what at the time was called)
"supercomputing", when I moved to a small SW house doing CAD for
mechanical engineering, I soon determined that Fortran's limitations
were costing us a lot in both development time and runtime efficiency.

There's a reason Fortran use has been steadily declining, while C's
still number one despite the many languages that exist at its side or on
top of it: even in technical areas such as CAD, the fraction of code
that does strictly numeric computing is pretty small, overall.  If you
are blessed with superb Fortran coders and super Fortran compilers,
there may be a case for mixed C/Fortran programming, where some strictly
computational kernels are in Fortran while the rest of the programs'
various bottlenecks are in C.  Unfortunately, with the exception of a
few platforms and social niches, superb compilers _and_ coders for/in
Fortran are getting pretty hard to find, so this strategy can be hard to
defend in most cases.

"Strictly numeric computing" isn't even narrow enough.  I regularly do
(for my own private and personal research) huge unlimited-precision
computations with integers and rationals, which of course _IS_ "strictly
numeric" work, to the letter.  Yet _this_ strictly numeric work isn't
a field where Fortran enjoys any advantage; quite the contrary.
The computational kernels are best coded in C and machine language, and
best coordinated/glued from Python.  When my code is spending 80% of its
time obsessively repeating a mere thousand or so lines of machine code,
it would be just silly to spend time and energy, and decrease
flexibility, by turning the few hundred lines of Python that do all of
the algorithmic coordination into thousands of lines of any lower-level
language.
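
As an illustration of that division of labour, here is a minimal sketch
(not the code described above, which the post does not show).  It assumes
the standard-library `fractions.Fraction` type, available in modern
Python, as a stand-in for a faster C or machine-language kernel; the
`harmonic` function is a hypothetical example.  The exact arithmetic
happens inside the rational type, while the few lines of Python only
coordinate it.

```python
from fractions import Fraction


def harmonic(n):
    """Return 1 + 1/2 + ... + 1/n as an exact rational (hypothetical example)."""
    total = Fraction(0)
    for k in range(1, n + 1):
        # Fraction keeps numerator and denominator as unlimited-precision
        # integers, so no rounding occurs at any step.
        total += Fraction(1, k)
    return total


h = harmonic(30)
print(h.numerator, h.denominator)
```

Swapping in a faster rational implementation would, under this
assumption, leave the coordinating loop unchanged, which is the point of
keeping the algorithmic glue in Python.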

> Fortran 95 and Python with Numeric or Numarray are comparably high
> level languages, but Fortran has a performance edge of 2-100 in the

I disagree on the level -- Numeric reminds me much more of APL, which
was regularly used for exploratory numeric computing in IBM Research,
before the most promising approaches, needing to run on frightening
amounts of data, got recoded in Fortran (contradicting your assertion
that "good programmers don't want to recode": if they're good, they
_are_ in fact quite willing to, if and when top performance is needed in
some computational kernel).  Moreover, Python is definitely higher level
(than even APL used to be) for all the _non_-numeric needs that
scientific programs typically have nowadays, in addition to their
computational kernels.

If, for your range of problems and machines of interest, Numeric's
computational kernels are so inefficient compared to identical
computational kernels coded in Fortran, that's surely interesting -- and
part of the reason a couple of toolkits exist to facilitate gluing Fortran
code to Python+Numeric programs (the other part, of course, is enabling
reuse of the great mass of excellent existing Fortran libraries - reuse
is good, and even if performance were identical with a recode, it would
be silly to recode rather than reuse solid, fast, field-tested codes).

That's part of Python's strength in numerical computing: neatest glue in
the world between existing or specially coded computational kernels, be
they in Fortran, C, C++, or machine language as in my (unlimited
precision rationals) case -- _plus_ all the exploratory flexibility one
may want, both in the classic numerical parts (via Numeric/numarray) and
in such tasks as database access, visualization, monitoring, checkpoint
and restart, and so forth.
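
The "checkpoint and restart" part of that glue role can be sketched in a
few lines.  This is only an illustration under stated assumptions: the
`kernel` function is a hypothetical stand-in for a compiled Fortran/C
kernel, and the JSON checkpoint file and the `run` driver are inventions
for this example, not anything named in the post.

```python
import json
import os
import tempfile


def kernel(state):
    """Hypothetical stand-in for a compiled kernel: performs one step of work."""
    return {"step": state["step"] + 1, "total": state["total"] + state["step"]}


def run(path, steps):
    """Run `steps` kernel steps, resuming from the checkpoint at `path` if present."""
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)         # restart from the saved state
    else:
        state = {"step": 0, "total": 0}  # fresh start
    for _ in range(steps):
        state = kernel(state)
        with open(path, "w") as f:
            json.dump(state, f)          # checkpoint after every step
    return state


path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
first = run(path, 3)    # runs steps 0, 1, 2
resumed = run(path, 2)  # picks up again at step 3
print(first, resumed)
```

The driver never needs to know how the kernel is implemented, so the
kernel could be replaced by a wrapped Fortran or C routine without
touching the checkpoint logic.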

> tests I have run (several posted to comp.lang.python). According to
> the Pyrex web site http://nz.cosc.canterbury.ac.nz/~greg/python/Pyrex/
> the latest version is 0.9.3 -- this version number implies that Pyrex
> is not mature. Fortran 95 compilers are mature. Nowadays, applications

I've used Pyrex for production work without any problems, but yes, once
Open Source authors decide their work is mature they typically tag it as
1.0.  However, not being subject to commercial pressures or deadlines,
they're typically in no hurry whatsoever to reach that milestone; you
should really read Eric Raymond's penetrating analysis of that fact in
his excellent book "The Art of Unix Programming" (also available for
free online reading).

> where performance is critical are often run on Linux clusters (often
> of AMD Opteron processors), and Fortran compiler vendors are
> supporting these platforms. Are Pyrex and other Python projects being
> customized in this way?

I'm not sure what (if anything) is being done with Pyrex in the area of
cluster computing, but there are quite a few other Python projects
there, such as Pycluster.  I don't know much about them, since I don't
have a cluster at hand to try them on -- but you can surely find others
here who do, if you post with more appropriate subjects to catch their
attention.

> > If you don't need portability, C's speed isn't optimal anyway; psyco (a
> > just-in-time optimizer for Python) can sometimes beat C by going
> > directly to machine language (but it only works for intel and compatible
> > CPUs, not for example for the PowerPC chips used in Apple's Mac
> > computers -- that is the downside, of course).
> 
> There are optimizing Fortran 95 compilers for Intel Windows, Intel
> Linux, AMD Opteron on Linux, PowerPC etc., listed at
> http://www.dmoz.org/Computers/Programming/Languages/Fortran/Compilers/
> . You are conceding that Python is less portable if you demand
> performance.

No, I'm not: where the (expletive deleted) do you read that?!  All I'm
saying is that _psyco_ currently targets only Intel and compatible CPUs,
so those are the ones where you can get psyco's own sometimes-beats-C
performance, for now.  Throughout my post, up to here, I was speaking
about _portable performance *across the board*_ (where, generally, C
can't be beaten yet -- although I _have_ seen some claims to the contrary by
fans of Common Lisp and O'Caml, at least for some cases).  Here, I'm
explaining that, if and when you do not need portability, then there may
well be even faster alternatives to C (still "across the board" in terms
of application areas, but not necessarily of platforms).

> There is even a free, fast compiler called G95 for these
> platforms -- see http://www.g95.org .

Are you claiming this is _MATURE_ while Pyrex isn't?  Literally
speaking, you are, since you earlier stated, baldly and without any
qualification, that "Fortran 95 compilers are mature", and here you
appear to be implying that G95 is a Fortran 95 compiler -- so, the
syllogism is compelling.

Well, I've visited that huge, rambling homepage - and couldn't find
either a release number or a pointer to the sources.  I did easily find a kind
of blog with all sorts of interesting news, such as (less than a month
ago) the wrong sources getting compiled and causing confusion, etc; if
these are signs of a mature project, we must be using a very different
definition of the word "mature".  And a list of binaries, of which zero
(count 'em, zero) are listed for any PPC CPU -- there is a claim later
that the program works on G4 (a PPC CPU, though 32-bit; for power
computing, one would typically be using G5, the newer 64-bit one, these
days; Apple is even putting it in cheap home/consumer computers, and
Virginia's supercomputer, one of the world's most powerful ones it
seems, uses G5 PPC CPUs, too), but I couldn't find any useful link.

Don't get me wrong: I'm overjoyed if open source is reaching good
results in the world of Fortran compilers, at last (the g77 free Fortran
compiler, way back when, was surely anything but a speed demon!-).  But
I think it's dishonest to claim or imply "maturity" for this project
while asserting Pyrex is "not mature".  Do you really believe that not
having a release number, or hiding it well, enhances a project's
maturity?  I don't see how you could state this in good faith.

Alex