[SciPy-dev] Thoughts on weave improvements

Mon Feb 11 22:37:55 EST 2002

hi,

>>>>> "PP" == Pearu Peterson <pearu at cens.ioc.ee> writes:

    PP> 3) About callbacks. Indeed, when using 'some hairy, expensive
    PP> python code' from C/C++/Fortran, most of the time is spent in
    PP> Python (as Prabhu tests show, pure Python is approx 1000 times
    PP> slower than any compiled language mentioned above). However,
    PP> this 'some hairy, expensive python code' need not to be pure
    PP> Python code, it may contain calls to extension modules that
    PP> will speed up these callback functions. So, I don't think that
    PP> calling callbacks from C/C++/Fortran would be remarkably
    PP> expensive unless these callbacks will be really simple (true,
    PP> sometimes they just are).

I think I understand what you are getting at.  However the following
points are to be noted.  My laplace example is truly a toy example and
its intention was to create a simple benchmark and get the new weave
user started quickly.

  (0) It was a useful benchmark with realistic estimates for a simple
  problem.

  (1) I did not do anything fancy.  

  (2) There was just one inner loop that was expensive.  And here too,
  it was the for loop in Python that was 1000 times slower.  Function
  call overhead in Python was not anywhere near as bad.  So if any
  conclusion about speed is to be drawn, it is this -- for loops in
  Python are horribly slow.  Function call overhead, while larger
  than in C/C++, is not too bad.

  (3) A more sophisticated problem would involve far more complexity
  than my silly example.  Here are a few things that would certainly
  be important.

    (a) Wrapping simple classes.  This means speeding up access to
    members.  It's obvious that one can use an OOD to deal with true
    complexity.  If one does not do this and simply restricts oneself
    to dealing with optimized functions then one might as well develop
    in pure C/Fortran.  The advantage of a high level language is to
    be able to do more than what you can do with C/Fortran easily.

    (b) A common way of dealing with complex problems is to construct
    an array of similar objects and then invoke some method on each
    of these.  Take for instance an unstructured grid problem.  You'd
    construct elements and ask each element to take a time step.  It's
    a very natural way to program and it's important to speed these
    things up.  If we don't then you have to keep redesigning code
    just to get performance, which is a pain.  I admit that this is a
    long term goal but it's important to keep in mind.

    (c) It's fine if fancy features of Python are not supported but the
    basics must be.  What those basic features are should be explored
    in greater detail.

  I need more time to think up a more comprehensive and sensible list
  of things.
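
A rough way to check point (2) on your own machine is sketched below.
It uses only the standard library timeit module; the function names
(square, loop_inline, loop_with_calls) are made up for illustration
and are not part of the laplace benchmark.  It compares a plain Python
for loop with the same loop paying for one function call per
iteration, so the ratio gives a feel for call overhead relative to the
cost of the loop itself.  Actual numbers depend on the interpreter and
the machine.

```python
import timeit


def square(x):
    # Hypothetical helper; exists only to add one call per iteration.
    return x * x


def loop_inline(n):
    # Pure Python for loop with the arithmetic done inline.
    total = 0
    for i in range(n):
        total += i * i
    return total


def loop_with_calls(n):
    # Same loop, but each iteration pays for a Python function call.
    total = 0
    for i in range(n):
        total += square(i)
    return total


if __name__ == "__main__":
    n = 100000
    t_inline = timeit.timeit(lambda: loop_inline(n), number=10)
    t_calls = timeit.timeit(lambda: loop_with_calls(n), number=10)
    print("inline loop:     %.4f s" % t_inline)
    print("loop with calls: %.4f s" % t_calls)
    print("ratio:           %.2f" % (t_calls / t_inline))
```

This says nothing about compiled code; timing the same loop in
C/C++ or through weave would be needed for that comparison.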

    PP> On Fri, 8 Feb 2002, Pat Miller wrote:

    PP> <snip>

    >> It might be better if the integration routine, using its
    >> knowledge that the argument to x must be a PyFloat (C++ double)
    >> could use a C++ accelerated function instead of slow callbacks
    >> to Python.  Not important for a quick numeric integration, but
    >> crucial for using Python as a true C/FORTRAN replacement.

    PP> The last statement I find hard to believe. It is almost too
    PP> trivial to write down few reasons (don't let your customers to
    PP> know about these ;-): 1) Python (or any other scripting
    PP> language) cannot never compete with C/FORTRAN in
    PP> performance. (I truly hope that you can prove me to be wrong
    PP> here.)  2) There are too many high-performance and well-tested
    PP> C/FORTRAN codes available (atlas,lapack,fftw to name just few)
    PP> and to repeat their implementation in Python is unthinkable,
    PP> in addition to the point 1).

I beg to differ here.  atlas, lapack, fftw etc. only solve some very
fundamental and focussed problems.  If you want complex, here is an
example -- solve the flow inside a supersonic jet engine.  It's
horribly hard and afaik there aren't full-fledged solvers that handle
this sort of complexity (completely).  People do use CFD for this case
but also use a lot of empirical knowledge.  I am not aware of a purely
CFD package that can handle the complete jet engine in software with
no empirical input.  I'm sure Pat/Eric will know (or at least have
heard) of much harder problems.

Solving such problems without losing your hair is a challenge and it's
pretty clear to me that using an OO design is the way to go.  You can't
achieve that with a hybrid C/Fortran/Python option because you'll have
to implement most of the objects in C, in which case you do not get the
advantage of developing in Python at all -- so why not just forget
Python and write the whole darned thing in C?  Honestly this is what
most folks do.  The reason moving this to Python is an important goal
is that it is definitely much easier and nicer coding in Python.  The
development cycle is much faster and easier.  It's easier to
map/explore the problem out in Python than in C/C++.  Rather than
struggle with the problem *and* struggle with C/C++/Fortran, one just
has to struggle with the problem if one is using Python.  But this
costs you some speed.  The question is how much, and how much more
complex a problem you can solve by moving to Python.
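
To make the object-per-element style concrete, here is a minimal
sketch in pure Python.  Everything in it is hypothetical: the Element
class, its rate attribute, the advance function and the simple decay
equation are stand-ins chosen for illustration, not code from any real
solver.  The point is only the shape of the inner loop, which is the
part one would like a tool to speed up without redesigning the code.

```python
class Element:
    """Hypothetical grid element holding one scalar unknown."""

    def __init__(self, value, rate):
        self.value = value
        self.rate = rate  # assumed decay rate, for illustration only

    def step(self, dt):
        # Explicit Euler update of dv/dt = -rate * v.
        self.value += -self.rate * self.value * dt


def advance(elements, dt, nsteps):
    # Ask every element to take a time step; each call goes through
    # Python attribute lookup and method dispatch.
    for _ in range(nsteps):
        for e in elements:
            e.step(dt)


# Four elements with different (made-up) decay rates.
elements = [Element(1.0, 0.5 * (i + 1)) for i in range(4)]
advance(elements, dt=0.1, nsteps=10)
```

Each element hides its own update rule behind step(), so adding a new
element type does not touch advance().  The cost is one Python method
call and a couple of attribute accesses per element per step.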

Speed of code is not everything.  It's known that structured grid
solvers are significantly faster than unstructured grid solvers.  But
if you have a complex geometry it might take 6 months for someone to
just generate a structured grid for the geometry and then solve it.
OTOH, you can generate a full unstructured grid in a few days (I think
that is conservative).  So it's not true that complexity (and the
consequent slowdown) is a bad thing.

    PP> Simple, transparent, and straightforward mixing of
    PP> Python/C/C++/Fortran languages seems to be the way to go in
    PP> scientific computing within Python.

There are issues with this also.  It's easy to wrap something that can
be encapsulated in a few functions.  Anything more complex than that
and you have to wonder if Python is a good choice at all.  So PyCOD
and weave are great steps in this direction.  Also, as Eric pointed
out, most folks aren't comfortable creating wrapper functions all the
time.  At the very least PyCOD and weave simplify this.

regards,
prabhu