[Numpy-discussion] Re: Numeric3 PEP
Tim Hochberg
tim.hochberg at cox.net
Mon Feb 21 07:20:02 EST 2005
Robert Kern wrote:
> Tim Hochberg wrote:
>
>>
>> Hi Travis,
>>
>> First off, let me say that I'm encouraged to see some action towards
>> unifying Numeric/Numarray; the split has been somewhat dismaying.
>> Thank you for your efforts in this regard.
>>
>> I'd like to lobby against flatten(), r() and i(). To a large extent,
>> these duplicate the functionality of flat, real and imag. And, these
>> three methods are defined to sometimes return copies and sometimes
>> return views. That type of interface is a recipe for errors and
>> should only be used as a last resort.
>
>
> There is, however, a blisteringly common use case for such an
> interface: you are using the result directly in an expression such
> that it is only going to be read and never written to. In that case,
> you want it to never fail (except in truly pathological cases like
> being out of memory), and you want it to be as efficient as possible
> and so never produce a copy where you can produce a view.
>
> So, I think we need three interfaces for each of this kind of attribute:
>
> 1) Getting a view. If a view cannot be obtained, raise an error. Never
> copy.
The proposal for flat is to always return a view; if the array is not
contiguous, a special view-of-a-discontiguous-array will be returned.
This special object would obviously be less efficient to index than a
contiguous array, but if implemented carefully could probably be made
more efficient than a copy plus indexing for one-shot uses (i.e., in an
expression).
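As a rough illustration of the "always a view" idea, here is a minimal sketch using present-day NumPy, which is an assumption on my part since it is not the Numeric3 code under discussion. There, flat on a non-contiguous array yields a numpy.flatiter that indexes the original data without copying it. The names base and strided are made up for the example.

```python
import numpy as np

base = np.arange(12).reshape(3, 4)
strided = base[:, ::2]        # non-contiguous view onto base

flat_view = strided.flat      # numpy.flatiter; the data is not copied
flat_view[1] = -1             # element 1 of strided is base[0, 2]

print(strided.flags.c_contiguous)
print(base[0, 2])
```

The write lands in base, so nothing was copied even though strided is not contiguous.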
>
> 2) Getting a copy. Never return a view.
>
> 3) Getting *something* the most efficient way possible. Caller beware.
By this you mean return a view if contiguous, a copy if not, correct?
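To make the hazard of that reading concrete, here is a sketch in present-day NumPy (an assumption; it is not the proposed Numeric3 interface). numpy.ravel behaves in the "view if contiguous, copy if not" way, so whether a write reaches the original depends on the memory layout of the input. The variable names are hypothetical.

```python
import numpy as np

base = np.arange(12).reshape(3, 4)

flat_c = np.ravel(base)            # base is contiguous: a view
flat_s = np.ravel(base[:, ::2])    # strided input: a copy

flat_c[0] = -1                     # changes base[0, 0]
flat_s[1] = -2                     # changes only the copy, not base[0, 2]

print(np.shares_memory(flat_c, base))
print(np.shares_memory(flat_s, base))
```

The same call writes through in one case and silently does not in the other, which is the kind of error I am worried about.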
> While I lean towards making the syntaxes for the first two The Most
> Obvious Ways To Do It, I think it may be rather important to keep the
> syntax of the third convenient and short, particularly since it is
> that case that usually occurs in the middle of already-complicated
> expressions.
Personally I don't find this all that compelling, primarily because the
blanket statement that (3) is more efficient than (1) is extremely
suspect. In many, if not most, cases (1) [as clarified above] will
be more efficient than (3) anyway. Our disagreement here may stem from
having different understandings of the proposed behaviour for flat.
There are cases where you probably *do* want to copy if discontiguous,
but not in expressions. I'm thinking of cases where you are going to be
reusing some flattened slice of a larger array multiple times, and you
plan to use it read only (already that's sounding pretty rare though).
In that case, my preferred spelling is:
    a_contig_flat_array = ascontiguous(an_array).flat
In other words, I'd like to segregate all of the sometimes-copy
behaviour into functions called asXXX, so that it's easy to see and
easier to debug when it goes wrong. Of course, ascontiguous doesn't
exist at present, as far as I can tell, but it'd be easy enough to add:
    def ascontiguous(a):
        a = asarray(a)
        if not a.iscontiguous():
            a = a.copy()
        return a
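For anyone who wants to try the idea, here is a runnable sketch of the same helper written against present-day NumPy. That is an assumption: the flags.c_contiguous attribute and numpy.shares_memory are NumPy spellings, not part of the Numeric API used above.

```python
import numpy as np

def ascontiguous(a):
    """Return a itself if it is C-contiguous, otherwise a contiguous copy."""
    a = np.asarray(a)
    if not a.flags.c_contiguous:
        a = a.copy()   # copy() defaults to C order, so the result is contiguous
    return a

base = np.arange(12).reshape(3, 4)
strided = base[:, ::2]             # non-contiguous view

same = ascontiguous(base)          # no copy needed
copied = ascontiguous(strided)     # copy made here, and only here

print(same is base)
print(np.shares_memory(copied, base))
```

The point of the asXXX spelling is that the possible copy is visible at the call site, so it is easy to find when debugging.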
Regards,
-tim