[Numpy-discussion] the direction and pace of development

Andrew P. Lentvorski, Jr. bsder at allcaps.org
Thu Jan 22 15:03:01 EST 2004


On Thu, 22 Jan 2004, eric jones wrote:

> The effort has fallen short of the mark you set.  I also wish the
> community was more efficient at pursuing this goal.  There are
> fundamental issues.  (1) The effort required is large.  (2) Free time is
> in short supply.  (3) Financial support is difficult to come by for
> library development.

(4) There is no itch to scratch.

Matlab costs somewhere around $20,000 per year (base plus a couple of
toolboxes) for corporations, and something like $500 (or less) for
registered students.  All of the signal processing packages and similar
tools are written for Matlab.  The time cost of learning a new tool
(Python + SciPy + Numeric/numarray) far exceeds the base price for the
average company or person.

However, some companies have to deliver an end product with Matlab
embedded.  This is *extremely* undesirable; consequently, they are likely
to create add-ons and extend the Python interface.  However, the progress
will likely be slow.

> Speaking from the standpoint of SciPy, all I can say is we've tried to
> do what you outline here.  The effort of releasing the huge load of
> Fortran/C/C++/Python code across multiple platforms is difficult and
> takes many hours.

And since SciPy's users are mostly on Windows, they expect one click to
install the universe.  Good for customer experience.  Bad for
maintainability, which would really benefit from independently maintained
packages with hard APIs surrounding them.

> On speed:  <excerpt from private mail to Perry>
> Numeric is already too slow -- we've had to recode a number of routines
> in C that I don't think we should have in a recent project.

Then the idea of optimizing numarray is DOA.  The best you are going to
get is a constant factor speedup in return for vastly complicating
maintainability.  That's not a good tradeoff for a multi-year open-source
project.

> Oh yeah, I have also been surprised at how much of our code uses
> alltrue(), take(), isnan(), etc.  The speed of these array manipulation
> methods is really important for us.

That seems ... odd.  Scanning an array rather than handling a NaN trap
seems like an awful tradeoff (i.e., an O(n) operation repeated every time
rather than an O(1) operation activated only on NaN generation, which is
normally a rare occurrence).
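
To make the two approaches concrete, here is a rough sketch.  It assumes
modern NumPy rather than Numeric/numarray, and the helper names
(scan_for_nan, divide_with_trap) are made up for illustration.  Note that
np.errstate is only a stand-in for a trap: it raises an exception when an
operation is flagged as invalid, so the caller never has to re-scan the
result, but it is not necessarily a hardware trap underneath.

```python
import numpy as np


def scan_for_nan(values):
    """Scanning approach: walk the whole array looking for NaN, O(n) per call."""
    return bool(np.isnan(values).any())


def divide_with_trap(num, den):
    """Trap-style approach: raise as soon as an invalid operation is flagged."""
    with np.errstate(invalid="raise", divide="raise"):
        return num / den


num = np.array([1.0, 0.0, 4.0])
den = np.array([2.0, 0.0, 8.0])  # 0.0 / 0.0 is an invalid operation (NaN)

# Approach 1: compute quietly, then scan the result every time.
with np.errstate(invalid="ignore", divide="ignore"):
    quiet = num / den
found_by_scan = scan_for_nan(quiet)

# Approach 2: let the invalid operation raise, and handle it only when it happens.
try:
    divide_with_trap(num, den)
    trapped = False
except FloatingPointError:
    trapped = True
```

In the second version the common, NaN-free path carries no explicit scan in
user code; the handler runs only in the rare case that a NaN is actually
generated.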

> -- code reviews, build help, release help, etc.  In fact, I double dare
> ya to ask to manage the next release or the documentation effort.
> okay... triple dare ya.

Shades of, "Take my wife ... please!"  ;)

-a



