[Numpy-discussion] Add guaranteed no-copy to array creation and reshape?

Sebastian Berg sebastian at sipsolutions.net
Sun Dec 30 11:23:39 EST 2018


On Sun, 2018-12-30 at 16:03 +0100, Matthias Geier wrote:
> On Sat, Dec 29, 2018 at 6:00 PM Sebastian Berg wrote:
> > On Sat, 2018-12-29 at 17:16 +0100, Matthias Geier wrote:
> > > Hi Sebastian.
> > > 
> > > I don't have an opinion (yet) about this matter, but I have a
> > > question:
> > > 
> > > On Thu, Dec 27, 2018 at 12:30 AM Sebastian Berg wrote:
> > > 
> > > [...]
> > > 
> > > > new_arr = arr.reshape(new_shape)
> > > > assert np.may_share_memory(arr, new_arr)
> > > > 
> > > > # Which is sometimes -- but should not be -- written as:
> > > > arr.shape = new_shape  # unnecessary container modification
> > > 
> > > [...]
> > > 
> > > Why is this discouraged?
> > > 
> > > Why do you call this "unnecessary container modification"?
> > > 
> > > I've used this idiom in the past for exactly those cases where I
> > > wanted to make sure no copy is made.
> > > 
> > > And if we are not supposed to assign to arr.shape, why is it
> > > allowed
> > > in the first place?
> > 
> > Well, this may be a matter of taste, but say you have an object
> > that
> > stores an array:
> > 
> > class MyObject:
> >     def __init__(self):
> >         self.myarr = some_array
> > 
> > 
> > Now, let's say I do:
> > 
> > def some_func(arr):
> >     # Do something with the array:
> >     arr.shape = -1
> > 
> > myobject = MyObject()
> > some_func(myobject)
> > 
> > then myobject will suddenly have the wrong shape stored. In most
> > cases this is harmless, but I truly believe this is exactly why we
> > have views and why they are so awesome.
> > The content of arrays is mutable, but the array object itself
> > should not be mutated normally.
> 
> Thanks for the example! I don't understand its point, though.
> Also, it's not working since MyObject doesn't have a .shape
> attribute.
> 

The example should have called `some_func(myobject.myarr)`. The thing is
that if you have more references to the same array around, you change
all their shapes. And if those other references are there for a reason,
that is not what you want.

That does not matter much in most cases, but it could change the shape
of an array in a completely different place than intended. Creating a
new view is cheap, so I think such things should be avoided.
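
To make that concrete, here is a minimal sketch of the corrected example.
The constructor argument and the `(2, 3)` array are assumptions added only
so that the snippet runs on its own; they were not part of the original
example.

```python
import numpy as np


class MyObject:
    def __init__(self, some_array):
        # The object keeps a reference to the array, not a copy.
        self.myarr = some_array


def some_func(arr):
    # Do something with the array: this rewrites the shape of the
    # array object the caller passed in, not of a local copy.
    arr.shape = -1


myobject = MyObject(np.zeros((2, 3)))  # illustrative array (assumption)
some_func(myobject.myarr)

print(myobject.myarr.shape)  # (6,): the stored array lost its 2-D shape
```

Nothing in `some_func` looks like it touches `myobject`, which is what
makes this kind of change hard to track down.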

I admit, most code will effectively do:
arr = input_arr[...]  # create a new view
arr.shape = ...

so that there is no danger. But conceptually, I do not think there
should be a danger of magically changing the shape of a stored array in
a different part of the code.
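
A small sketch of that safe pattern, for comparison. The names `stored` and
`flat` are made up for illustration:

```python
import numpy as np


def some_func(input_arr):
    arr = input_arr[...]  # create a new view on the same data
    arr.shape = -1        # only the new view object is modified
    return arr


stored = np.zeros((2, 3))  # illustrative array (assumption)
flat = some_func(stored)

print(stored.shape)                    # (2, 3): caller's array untouched
print(flat.shape)                      # (6,)
print(np.shares_memory(stored, flat))  # True: still no copy of the data
```

The shape assignment still guarantees that no copy is made, but it only
affects the freshly created view.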

Does that make some sense? Maybe a shorter example:

arr = np.arange(10)
arr2 = arr
arr2.shape = (5, 2)

print(arr.shape)  # also (5, 2)

so the arr container (shape, dtype) is changed/mutated. I think we expect
that for content here, but not for the shape.
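
For contrast, a sketch of the same steps using `reshape`, which returns a
new array object and leaves `arr` alone. For a contiguous array like this
one the result is a view; in general `reshape` may fall back to a copy,
hence the explicit check:

```python
import numpy as np

arr = np.arange(10)
arr2 = arr.reshape(5, 2)  # new array object, arr itself is not modified

print(arr.shape)   # (10,)
print(arr2.shape)  # (5, 2)

# The data is shared, so changes to the content are visible through both.
assert np.shares_memory(arr, arr2)
arr2[0, 0] = 99
print(arr[0])  # 99
```

So the content is still shared and mutable, but the shape of `arr` stays
what every other reference to it expects.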

- Sebastian


> > There may be some corner cases, but a lot of the
> > "then why is it allowed" questions are answered with: for history
> > reasons.
> 
> OK, that's a good point.
> 
> > By the way, on error the `arr.shape = ...` code currently creates
> > the
> > copy temporarily.
> 
> That's interesting and it should probably be fixed.
> 
> But it is not reason enough for me not to use it.
> I find it important that it doesn't make a copy in the success case,
> I don't care very much for the error case.
> 
> Would you mind elaborating on the real reasons why I shouldn't use
> it?
> 
> cheers,
> Matthias
> 
> > - Sebastian
> > 
> > 
> > > cheers,
> > > Matthias
> > > _______________________________________________
> > > NumPy-Discussion mailing list
> > > NumPy-Discussion at python.org
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > > 