[Numpy-discussion] Numeric3

Peter Verveer verveer at embl.de
Sun Feb 6 05:28:45 EST 2005


> This is exactly why I am dissatisfied with the way numarray has been 
> advertised (I'm not blaming anyone here, I recognize the part that I 
> played in it).  Third-party package providers started building on top 
> of numarray before it was a clear replacement.   I don't see anything 
> in nd_image that could not have sat on top of Numeric from the 
> beginning.    I don't see why it needed to use numarray specific 
> things at all.   By the way, nd_image is a very nice piece of work 
> that I have been admiring.

Thanks. nd_image could easily sit on top of Numeric; it sits on top of 
numarray simply because I felt numarray was the better package, and I 
still think so. There is, however, nothing inherent in nd_image that 
requires numarray. In fact, I am contemplating factoring out the image 
processing functionality into a separate C library, on which Python 
packages would then be based.


>> common standard is a very good idea. But right now I don't find SciPy 
>> attractive as a framework, because 1) it is too big and not easily 
>> installed. 2) it is not very well documented. Thus, I prefer to write 
>> my package against a smaller code-base, in this case Numarray, but it 
>> could have also been Numeric. That has the advantage that people can 
>> install it more easily, while it still can be included in things like 
>> SciPy if desired.
>
> What is too big about it?  Which packages would you like to see not 
> included?

I think I was unfair to call scipy 'big'. There is nothing inherently 
bad about being big. The problem is rather installation, I suppose. I 
would also like to see modularity: if I only want, say, least-squares 
fitting, I should be able to take just that.

> Is it the dependence on Fortran that frightens, or the dependence on 
> ATLAS (not a requirement by the way)?   How many realize that you can 
> turn off installation of packages in the setup.py file by a simple 
> switch?

Fortran and ATLAS do not scare me. But if they make it more difficult 
for an average user to install scipy, just so they can use some of my 
software that depends on it, then there is a problem. Fear of such 
problems keeps me away from scipy. I can point somebody to numarray, 
say "type 'python setup.py install'", and be sure it works. I know this 
is a bit unfair, since scipy aims to be more than numarray is now, but 
it would be nice to have the functionality that is in scipy available 
with the same ease.

> The point is that nd_image should live perfectly well as a scipy 
> package and be installed and distributed separately if that is desired 
> as well.

It should live perfectly well as a scipy package because it should 
depend only on a basic array package. Then scipy would be welcome to 
include it. That is not possible now because of the split between 
numarray and Numeric. If scipy and numarray shared the basic array 
package, nd_image would work perfectly fine with both.

> Is it ownership that is a concern?  None of us involved with scipy 
> have an ulterior motive for trying to bring packages together except 
> the creation of a single standard and infrastructure for scientific 
> computing with Python.    I keep hearing that people like the idea of 
> a small subset of packages, but that is exactly what scipy has always 
> been trying to be.   Perhaps those involved with scipy have not voiced 
> our intentions enough or loud enough.

Maybe this is the problem. My impression of scipy has been that it 
tries to be a Matlab-type environment, i.e., including everything in a 
single install. Nothing wrong with that, but it is not what I really 
need, certainly not if it is difficult to install. I just need to be 
able to get the packages that I need to write my programs in Python, 
just as you would link a library if you were doing it in C...

So, although I do applaud the scipy effort to provide a complete 
environment, I would prefer to see it based on a set of packages that 
are as independent as possible, so that I can just stick to those if I 
want. I think you have been saying that SciPy has been designed with 
that goal in mind, so maybe I should investigate how easy it is to take 
out the things I need.

>> But it seems to me that there is a danger for the "yet another 
>> package" effect to occur. I think I will remain sceptical unless you 
>> achieve three things: 1) It has the most important improvements that 
>> numarray  has. 2) It has a good API that can be used to write 
>> packages that work with Numeric3/SciPy and Numarray (the latter 
>> probably will not go away). 3) Inclusion in the main Python tree, so 
>> that it is guaranteed to be available.
>
> Thanks for your advice.  All encouragement is appreciated.
> 1)  The design document is posted.  Please let me know what "the most 
> important improvements" are.

I will have a look and give my comments. On first reading it seems to 
fix all that I did not like in Numeric.

>
> 2)  It will have an improved API and I would also like to support a 
> lot of the numarray API as well (although I don't understand the 
> need for many of the API calls numarray allows and see much culling 
> that needs to be done --- much input is appreciated here as to which 
> are the most important API calls.  I will probably use nd_image as an 
> example of what to support).

I think that if Numeric3 is intended to be a basic package that might 
go into the Python core, its API should be as small as possible. Don't 
forget that numarray already includes a Numeric API, so writing 
packages that compile on both should be feasible. I personally prefer 
to switch to whatever the API in the core may be, over seeing multiple 
APIs in such a core package. Having a set of multiple APIs will not 
help to get it accepted into the core; it should be kept as simple as 
possible.

>> Jochem Küpper just outlined very well what it could look like: a small 
>> core, plus a common project with packages at different levels. I 
>> think it is a very good idea, and probably similar to what SciPy is 
>> trying to do now. But he suggests an explicit division between 
>> independent packages: basic packages, packages with external library 
>> dependencies like FFTW, and advanced packages. Maybe something like 
>> that should be set up if we get an arrayobject into the Python core.
>
> Sounds great.  SciPy has been trying to do exactly this, but we need 
> ideas --- especially from package developers who understand the issues 
> --- as to how to set it up correctly.  We've already re-factored a 
> couple of times.  We could do it again if we needed to, so that the 
> infrastructure had the right feel.   A lot of this is already in 
> place.  I don't think many recognize some of the work that has already 
> been done.  This is not an easy task.

I think a crucial difference is that there would be different packages 
that need to be installed separately, with increasing levels of 
dependency and difficulty. Environments such as scipy could be built 
on top of them. At the same time it would make people like me, who 
don't want to install environments that do everything, happier.

>
> Plotting is potentially problematic because there are a lot of ways to 
> plot.  I think we need to define interfaces in this regard and 
> adapters so that commands that would throw up a plot could use several 
> different plotting methods to do it.   I'm not favoring any plotting 
> technique, so don't pre-guess me.  My ideas of plotting are probably 
> very similar to John's with matplotlib.  His work is another that I'm 
> disappointed is not part of scipy and has led me to my current 
> craziness with Numeric3 :-)

With things like plotting you get into the realm of user interfaces. In 
my opinion such things should not mix at all with packages that 
implement numerical functionality. Obviously an environment like scipy 
needs to have plotting, but again, if I don't want it, I should not 
have to download and install it. So within a structure that consists of 
different levels of packages, plotting and displaying should probably 
live in their own set of packages.

>> Agreed about the single standard thing. But I am not willing to just 
>> 'join' the SciPy project to achieve it (at least for nd_image). I am 
>> however very interested in the possibility of writing against a small 
>> high-quality array package that is included in the Python core. That 
>> would be all the standard I need. If you manage to make SciPy into a 
>> useful larger standard on top of that, great, more power to all of 
>> us!
>
> Why not?  Your goals are not at odds with ours.   O.K. it may be that 
> more work is required to re-write nd_image against the Numeric C-API 
> than you'd like --- that's why Numeric3 is going to try and support 
> the numarray C-API as well.

Rewriting is not the problem. That can be done fairly easily, as long 
as it can be done such that the result compiles for both numarray and 
Numeric. I am not sure it makes sense to have multiple APIs in a core 
package; it should be kept simple.
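
A minimal sketch of the Python-side half of this idea, assuming a 
package simply wants to use whichever array module is installed. The 
helper name `first_importable` and the candidate order are hypothetical 
and are not part of nd_image, numarray, or Numeric; the C-API 
differences discussed above are not addressed by it.

```python
import importlib


def first_importable(candidates):
    """Return (name, module) for the first candidate that can be imported.

    Hypothetical helper, for illustration only.
    """
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue  # not installed; try the next candidate
    raise ImportError("none of %s could be imported" % ", ".join(candidates))


if __name__ == "__main__":
    # Assumed preference order; either package may be absent.
    try:
        name, array_module = first_importable(["numarray", "Numeric"])
        print("using", name)
    except ImportError as exc:
        print("no array package found:", exc)
```

A package built this way would still have to restrict itself to the 
calls that both modules provide, which is the argument for keeping the 
shared API small.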

> Thanks for your valuable feedback.   I appreciate the time it took you 
> to provide it.   I hope we can work more together in the future.

For now I will watch and see what happens :-) nd_image remains a 
numarray package for now, but if it becomes feasible to make it work 
for both numarray and Numeric I will give that a try. If some package 
makes it into the Python core, then I will support that and remove all 
dependencies that might exist on other packages.

Cheers, Peter




