[Numpy-discussion] Response to PEP suggestions

Peter Verveer verveer at embl.de
Thu Feb 17 15:54:27 EST 2005


On Feb 17, 2005, at 7:52 PM, Travis Oliphant wrote:

>
> I'm glad to get the feedback.
>
> 1) Types
>
> I like Francesc's suggestion that .typecode return a code and .type 
> return a Python class.  What is the attitude and opinion regarding 
> the use of attributes versus methods for this kind of thing?  It 
> always seems somewhat arbitrary to me what is an attribute and what 
> is a method.

I don't think it really matters. Attributes seem natural: shape is an 
attribute, for instance, so why not type? In the end, I don't care.
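
(For comparison, the design that eventually shipped in NumPy exposes 
both spellings as attributes on the dtype object; a minimal sketch, 
with the caveat that these names are NumPy's, not the proposal's:)

    import numpy as np

    a = np.zeros((3, 3), dtype='f4')
    print(a.dtype.char)   # 'f' -- the single-character typecode
    print(a.dtype.type)   # <class 'numpy.float32'> -- a Python type object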

> There will definitely be support for the numarray-style type 
> specification.  Something like that will be how they print (though I 
> like the 'i4', 'f4' specification a bit better).  There will also be 
> support for specification in terms of a C type.  The typecodes will 
> still be there, underneath.

Sounds fine to me.
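
(To make the "both spellings" point concrete, a small sketch using the 
names modern NumPy adopted; the width string, the bare typecode, and 
the C-type name can all denote the same type:)

    import numpy as np

    # Three equivalent ways to ask for a 4-byte float:
    assert np.dtype('f4') == np.dtype('f') == np.dtype(np.single)
    print(np.dtype('f4'))   # float32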

>
> One thing has always bothered me though.  Why is a double complex 
> type Complex64 and a float complex type Complex32?  This seems to 
> break the idea that the number at the end specifies a bit width.  
> Why don't we just call them Complex64 and Complex128?  Can we change 
> this?

I actually find the current approach natural: you specify the width of 
the real and the imaginary components. Again, in the end I would not 
care.
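
(The arithmetic either way: Numeric's Complex32 holds two 32-bit 
components, i.e. 64 bits in total. Naming by total width is what NumPy 
eventually did, as a quick check shows:)

    import numpy as np

    z = np.complex64(1 + 2j)
    print(z.real.dtype)                     # float32: each component is 32 bits
    print(np.dtype(np.complex64).itemsize)  # 8 bytes = 64 bits in total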

> I'm also glad that some recognize the problems with always requiring 
> specification of types in terms of bit-widths or byte-widths, as 
> these are not the same across platforms.  For some types (like Int8 
> or Int16) this is not a problem.  But what about long double?  On an 
> Intel machine long double is Float96, while on a PowerPC it is 
> Float128.  Wouldn't it just be easier to specify LDouble or 'g' than 
> to special-case your code?

Long double is a bit of a special case; I guess I would probably not 
use it anyway. The point is indeed that having things like LDouble is 
'a good thing'.
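
(The platform dependence is easy to see from the standard library 
alone; the width of long double is a compiler/platform property:)

    import ctypes

    # Typically 96 bits on 32-bit x86, 128 on x86-64, and 64 where
    # long double is just double (e.g. MSVC).
    print(ctypes.sizeof(ctypes.c_longdouble) * 8)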

> Problems also exist when you are interfacing with hardware or other 
> C or Fortran code.  You know you want single-precision floating 
> point; you don't know or care what the bit-width is.  I think for 
> the integer types the bit-width specification is more important than 
> for floating-point types.  In sum, I think it is important to have 
> the ability to specify it both ways.

I completely agree with this. For floating point I probably don't 
care; it is good enough to distinguish between single and double 
precision. Integer types are a different story: there you want to be a 
bit more precise. Having both solves the problem quite well.
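
(A concrete case of "I want single precision, not a particular bit 
count" is packing a buffer for C code; a sketch with the standard 
struct module, where 'f' means native C float, whatever its local 
width is:)

    import struct

    # Pack three values as native C floats (4 bytes each on
    # essentially every current platform).
    buf = struct.pack('3f', 1.0, 2.0, 3.0)
    print(len(buf))   # typically 12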

>   When printing the array, it's probably better if it gives bit-width 
> information.  I like the way numarray prints arrays.

Agreed.

>
>
> 2) Multidimensional array indexing.
>
> Sometimes it is useful to select some elements out of an array based 
> on their linear (flattened) index in the array.  MATLAB, for 
> example, will allow you to take a three-dimensional array and index 
> it with a single integer based on its Fortran order:  x(1,1,1),  
> x(2,1,1), ...
>
> What I'm proposing would have X[K] essentially equivalent to 
> X.flat[K].  The problem with always requiring the use of X.flat[K] 
> is that X.flat does not work for discontiguous arrays.  It could be 
> made to work if X.flat returned some kind of specially-marked array, 
> which would then have to be checked every time indexing occurred for 
> any array.  Or, there may be some way to have X.flat return an 
> "indexable iterator" for X, which may be a more Pythonic thing to do 
> anyway.  That could solve the problem and solve the discontiguous 
> X.flat problem as well.

But possibly slow, and that is something we want to avoid.

> If we can make X.flat[K] work for discontiguous arrays, then I would 
> be very happy to not special-case the single index array but always 
> treat it as a 1-tuple of integer index arrays.

Speed will be an issue.
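
(To make the cost concrete: for a discontiguous array, X.flat[K] has 
to unravel each flat index into a tuple index before the ordinary 
strided lookup. A pure-Python sketch of that unravel step, in C order, 
which is exactly the per-element work you don't want in an inner 
loop:)

    def unravel(k, shape):
        """Convert a flat C-order index k into a tuple index for shape."""
        idx = []
        for dim in reversed(shape):
            k, r = divmod(k, dim)
            idx.append(r)
        return tuple(reversed(idx))

    print(unravel(5, (2, 3)))   # (1, 2): flat element 5 of a 2x3 array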

> Capping indexes was proposed because of what numarray does.  I can 
> only think that the benefit would be that you don't have to check 
> for and raise an error in the middle of an indexing loop, or 
> pre-scan the indexes.  But I suppose this is unavoidable anyway.  
> Currently Numeric allows specifying indexes that are too high in 
> slices; it just chops them.  Python allows this too, for slices.  
> So I guess I'm just specifying Python behavior.  Of course indexing 
> with an integer that is too large or too small will raise an error:
>
> In Python:
>
>     a = [1, 2, 3, 4, 5]
>     a[:20]   # works: the out-of-range slice bound is clipped
>     a[20]    # raises IndexError

Probably better to stick to Python behavior.
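
(Python even exposes the clipping rule directly, via slice.indices, so 
matching it keeps the semantics exact:)

    a = [1, 2, 3, 4, 5]
    print(slice(None, 20).indices(len(a)))   # (0, 5, 1): stop clipped to 5
    print(a[:20])                            # [1, 2, 3, 4, 5]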
