[Numpy-discussion] Re: Numeric3 PEP

Tim Hochberg tim.hochberg at cox.net
Mon Feb 21 07:20:02 EST 2005


Robert Kern wrote:

> Tim Hochberg wrote:
>
>>
>> Hi Travis,
>>
>> First off, let me say that I'm encouraged to see some action towards 
>> unifying Numeric/Numarray; the split has been somewhat dismaying. 
>> Thank you for your efforts in this regard.
>>
>> I'd like to lobby against flatten(), r() and i(). To a large extent, 
>> these duplicate the functionality of flat, real and imag. And, these 
>> three methods are defined to sometimes return copies and sometimes 
>> return views. That type of interface is a recipe for errors and 
>> should only be used as a last resort. 
>
>
> There is, however, a blisteringly common use case for such an 
> interface: you are using the result directly in an expression such 
> that it is only going to be read and never written to. In that case, 
> you want it to never fail (except in truly pathological cases like 
> being out of memory), and you want it to be as efficient as possible 
> and so never produce a copy where you can produce a view.
>
> So, I think we need three interfaces for each attribute of this kind:
>
> 1) Getting a view. If a view cannot be obtained, raise an error. Never 
> copy.

The proposal for flat is to always return a view. If the array is not 
contiguous, a special view-of-a-discontiguous-array will be returned. 
This special object would obviously be less efficient to index than a 
contiguous array, but if implemented carefully it could probably be made 
more efficient than a copy plus indexing for one-shot uses (i.e., in an 
expression).
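
To make the view-of-a-discontiguous-array idea concrete, here is a 
minimal pure-Python sketch. The names StridedArray2D and FlatView are 
hypothetical and come from neither Numeric nor numarray, and strides are 
counted in list elements rather than bytes. It is only meant to show why 
each index costs a little extra arithmetic while no copy is ever made.

```python
class FlatView:
    """1-D window onto a 2-D strided array; reads and writes go to the parent."""

    def __init__(self, arr):
        self._arr = arr

    def __len__(self):
        rows, cols = self._arr.shape
        return rows * cols

    def _position(self, i):
        # Translate a flat index into a position in the parent's buffer.
        rows, cols = self._arr.shape
        if not 0 <= i < rows * cols:
            raise IndexError("flat index out of range")
        r, c = divmod(i, cols)
        s0, s1 = self._arr.strides
        return self._arr.offset + r * s0 + c * s1

    def __getitem__(self, i):
        return self._arr.data[self._position(i)]

    def __setitem__(self, i, value):
        self._arr.data[self._position(i)] = value


class StridedArray2D:
    """Toy 2-D array: a shared Python list plus offset, shape and strides."""

    def __init__(self, data, shape, strides, offset=0):
        self.data = data
        self.shape = shape
        self.strides = strides
        self.offset = offset

    @property
    def flat(self):
        # Always a view, whether or not the array is contiguous.
        return FlatView(self)


# A 3x4 array stored in one list, and a slice taking every other column.
data = list(range(12))
full = StridedArray2D(data, shape=(3, 4), strides=(4, 1))
sub = StridedArray2D(data, shape=(3, 2), strides=(4, 2))  # discontiguous

values = [sub.flat[i] for i in range(len(sub.flat))]
sub.flat[3] = -1  # writes through to the shared buffer
```

Nothing is copied when sub.flat is created; the cost is the divmod and 
stride arithmetic paid on every element access.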

>
> 2) Getting a copy. Never return a view.
>
> 3) Getting *something* the most efficient way possible. Caller beware.

By this you mean return a view if contiguous, a copy if not, correct?
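
For concreteness, the three behaviours under discussion can be sketched 
as below. This assumes the present-day NumPy package (imported as np), 
which postdates this thread, and the helper names flat_view, flat_copy 
and flat_any are hypothetical rather than part of any proposal. Note 
that flat_view here uses the simpler "error if not contiguous" rule, not 
the special view object proposed for flat.

```python
import numpy as np


def flat_view(a):
    """(1) View or error: never copies."""
    if not a.flags.c_contiguous:
        raise ValueError("array is not contiguous; cannot flatten without a copy")
    return a.reshape(-1)  # reshaping a C-contiguous array yields a view


def flat_copy(a):
    """(2) Always a copy, never a view."""
    return a.flatten()


def flat_any(a):
    """(3) View when possible, copy otherwise. Caller beware."""
    return a.ravel()


a = np.arange(6).reshape(2, 3)   # contiguous
s = a[:, ::2]                    # every other column: not contiguous

view_shares = np.shares_memory(flat_view(a), a)
copy_shares = np.shares_memory(flat_copy(a), a)
any_shares_contig = np.shares_memory(flat_any(a), a)
any_shares_strided = np.shares_memory(flat_any(s), a)

try:
    flat_view(s)
    view_failed = False
except ValueError:
    view_failed = True
```

The hazard with (3) is visible in the last two checks: whether writes to 
the result reach the original depends on the memory layout of the input, 
which the caller often cannot see at the call site.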

> While I lean towards making the syntaxes for the first two The Most 
> Obvious Ways To Do It, I think it may be rather important to keep the 
> syntax of the third convenient and short, particularly since it is 
> that case that usually occurs in the middle of already-complicated 
> expressions.

Personally, I don't find this all that compelling, primarily because the 
blanket statement that (3) is more efficient than (1) is extremely 
suspect. In many, if not most, cases (1) [as clarified above] will 
be more efficient than (3) anyway. Our disagreement here may stem from 
having different understandings of the proposed behaviour for flat.

There are cases where you probably *do* want to copy if discontiguous, 
but not in expressions. I'm thinking of cases where you are going to be 
reusing some flattened slice of a larger array multiple times, and you 
plan to use it read only (already that's sounding pretty rare though). 
In that case, my preferred spelling is:

a_contig_flat_array = ascontiguous(an_array).flat

In other words, I'd like to segregate all of the sometimes-copy 
behaviour into functions called asXXX, so that it's easy to see and 
easier to debug when it goes wrong. Of course, ascontiguous doesn't 
exist at present, as far as I can tell, but it'd be easy enough to add:

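def ascontiguous(a):
    # asarray returns its argument unchanged if it is already an array.
    a = asarray(a)
    # Copy only when needed, so the result may or may not share data
    # with the input.
    if not a.iscontiguous():
        a = a.copy()
    return a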
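
As a quick illustration of the sometimes-copy behaviour this isolates, 
here is the same function written against present-day NumPy. That is an 
assumption on my part: the np.asarray and flags.c_contiguous spellings 
postdate this thread and stand in for asarray and iscontiguous() above.

```python
import numpy as np


def ascontiguous(a):
    # Same logic as above, using present-day NumPy spellings (assumed).
    a = np.asarray(a)
    if not a.flags.c_contiguous:
        a = a.copy()
    return a


big = np.arange(12).reshape(3, 4)
row = big[1]       # a contiguous slice
col = big[:, 1]    # a strided, discontiguous slice

same_row = ascontiguous(row)   # no copy: the very same object comes back
new_col = ascontiguous(col)    # copy: independent of big

new_col[0] = 100               # does not touch big
```

Because the copy can only happen inside a function whose name starts 
with "as", a surprising aliasing bug has one obvious place to look.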


Regards,

-tim