[Numpy-discussion] future directions

Dag Sverre Seljebotn dagss at student.matnat.uio.no
Fri Aug 28 13:52:22 EDT 2009


Fons Adriaensen wrote:
> Some weeks ago there was a post on this list requesting feedback
> on possible future directions for numpy. As I was quite busy at that
> time I'll reply to it now.
>
> My POV is that of a novice user, who at the same time wants quite
> badly to use the numpy framework for his numerical work which in
> this case is related to (some rather advanced) multichannel audio
> processing.
>   
I'm reluctantly joining the discussion... (reluctant because, as 
interesting as these discussions may be, (relatively) simple things that 
everyone agrees about, like Python 3 compatibility and PEP 3118 support, 
are still some way off. Agreeing on things doesn't make them happen.)


> From that POV, I'd suggest the following:
>
> 1. Adopt an object based on Python-3's buffer protocol as the
> basic array type. It's immensely more powerful than ndarray,
> while at the same time it's close enough to ndarray to allow
> a gradual adoption.
>   
Is it really immensely more powerful? It allows pointers, that's right, but 
that's primarily for exporting data from data providers...

For things like "pointers to images" (which PEP 3118 could be used for), 
Python lists usually work better anyway because they can be appended to.

I think the whole idea of the protocol is that you can start passing 
around data in *various* containers. Adopting a new array type as the 
"basic array type" basically defeats this purpose.

My way of thinking of it is: the focus shifts to the NumPy library 
providing ufuncs, not the array container. I think that in some years 
we'll be doing np.sin(x, out=y) without x or y being ndarrays at all.
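
To make that concrete, here is a rough sketch of how one can approximate 
it today with containers that only speak the buffer protocol. It assumes 
plain array.array objects as the containers (my choice for illustration, 
not something from the thread) and still wraps them in zero-copy ndarray 
views via np.frombuffer, so it is only a stand-in for ufuncs accepting 
such objects directly.

```python
import array
import math

import numpy as np

# Two non-ndarray containers that export the buffer protocol.
x = array.array("d", [0.0, 0.5, 1.0])
y = array.array("d", [0.0] * len(x))

# Zero-copy ndarray views onto the same memory (no data is copied here).
x_view = np.frombuffer(x, dtype=np.float64)
y_view = np.frombuffer(y, dtype=np.float64)

# The ufunc writes its result straight into y's memory.
np.sin(x_view, out=y_view)

# y itself, not just the view, now holds the sines of x.
result = list(y)
```

The data never leaves the original containers; the ndarray views are just 
temporary handles for the ufunc call.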

One conclusion: all of this might call for a new library which is more 
focused and supports a wider set of memory layouts. Well, anyone is free 
to just go ahead and do that! But I don't think NumPy can be turned 
into it, nor do the NumPy developers likely have time to spare for that.

If you wait a year, such a library might be a 100-liner in Cython :-) 
Actually, right now I think the best way of getting such a library 
implemented is to help out on Cython's array features, then export 
Cython's arrays to Python-space in a library.


Secondly,

One BIG gotcha people should be aware of here is that PEP 3118 
supports "fancy indexing as views".

I.e. with an object based on PEP 3118's memory model you could 
potentially do

b = a[a == 2]
b[0] = 3

and have that change a!

I believe these semantics to be superior myself (because you can always 
do "b = a[a==2].copy()" to get NumPy's behaviour).

But it does raise some interesting questions about consistency vs. 
subtle API breakage etc.
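
For comparison, a small sketch of what NumPy itself does with the same 
snippet: indexing with a boolean mask returns a copy, while basic slicing 
returns a view. The array values are made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 2, 3])

# Boolean (fancy) indexing: NumPy hands back a copy.
b = a[a == 2]
b[0] = 3
a_after_fancy = a.copy()          # a is still [1, 2, 2, 3]
fancy_shares = np.shares_memory(a, b)

# Basic slicing: NumPy hands back a view.
s = a[1:3]
s[0] = 9
a_after_slice = a.copy()          # a is now [1, 9, 2, 3]
slice_shares = np.shares_memory(a, s)

# Assigning *through* the mask does modify a in place.
a[a == 2] = 7
```

So in NumPy the view-like behaviour is only available when the mask 
appears on the left-hand side of the assignment.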

> 2. Adopting that format will make it even more important to
> clearly define in which cases data gets copied and when not.
> This should be based on some simple rules that can be evaluated
> by a code author without requiring a lookup in the reference
> docs each time.
>   
I think NumPy's already doing quite well here, except for the case of 
fancy indexing as mentioned above. Cleaning up various incarnations of 
"reshape" etc. to be consistent here would be good too (my vote is for 
never doing any automatic copying in methods like reshape, but I 
haven't actually checked what the semantics ended up being).
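
A quick way to check the reshape case, as a sketch with a made-up 2x3 
array: reshape returns a view when the memory layout allows it and 
silently falls back to a copy otherwise (here, after a transpose).

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Contiguous input: reshape can reuse the memory, so this is a view.
v = a.reshape(3, 2)
view_shares = np.shares_memory(a, v)
v[0, 0] = -1                      # also changes a[0, 0]

# Transposed (non-contiguous) input: flattening needs a copy.
t = a.T
c = t.reshape(6)
copy_shares = np.shares_memory(a, c)
c[0] = 100                        # a is unaffected
```

np.shares_memory is a convenient test when it is not obvious from the 
code which of the two cases applies.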

(BTW, I was recently observed saying I might chip in and implement PEP 
3118 for NumPy around November. If anyone wants to beat me to it then 
I'd be happy of course.)

Dag Sverre


