[Numpy-discussion] deprecate updateifcopy in nditer operand, flags?

Fri Nov 10 05:25:19 EST 2017

On Wed, Nov 8, 2017 at 2:13 PM, Allan Haldane <allanhaldane at gmail.com> wrote:
> On 11/08/2017 03:12 PM, Nathaniel Smith wrote:
>> - We could adjust the API so that there's some explicit operation to
>> trigger the final writeback. At the Python level this would probably
>> mean that we start supporting the use of nditer as a context manager,
>> and eventually start raising an error if you're in one of the "unsafe"
>> cases and not using the context manager form. At the C level we
>> probably need some explicit "I'm done with this iterator now" call.
>>
>> One question is which cases exactly should produce warnings/eventually
>> errors. At the Python level, I guess the simplest rule would be that
>> if you have any write/readwrite arrays in your iterator, then you have
>> to use a 'with' block. At the C level, it's a little trickier, because
>> it's hard to tell up-front whether someone has updated their code to
>> call a final cleanup function, and it's hard to emit a warning/error
>> on something that *doesn't* happen. (You could print a warning when
>> the nditer object is GCed if the cleanup function wasn't called, but
>> you can't raise an error there.) I guess the only reasonable option is
>> to deprecate NPY_ITER_READWRITE and NPY_ITER_WRITEONLY, and make people
>> switch to passing new flags that have the same semantics but also
>> promise that the user has updated their code to call the new cleanup
>> function.
> Seems reasonable.
>
> When people use the nditer C API, they (almost?) always call
> NpyIter_Deallocate when they're done. Maybe that's a place to put a
> warning for C API users. I think you can emit a warning there since that
> function calls the GC, not the other way around.
>
> It looks like you've already discussed the possibilities of putting
> things in NpyIter_Deallocate though, and it could be tricky, but if we
> only need a warning maybe there's a way.
> https://github.com/numpy/numpy/pull/9269/files/6dc0c65e4b2ea67688d6b617da3a175cd603fc18#r127707149

Oh, hmm, yeah, on further examination there are some more options here.

I had missed that NpyIter, for some reason, isn't a Python object, so
it's never subject to GC and you always need to call
NpyIter_Deallocate when you are finished with it. So that's a
natural place to perform writebacks. We don't even need a warning.
(Which is good, because warnings can be set to raise errors, and while
the docs say that NpyIter_Deallocate can fail, in fact it has never
been able to, and none of the code in numpy or the examples in the
docs actually checks the return value. Though in theory writeback can
also fail, so I suppose we need to start returning NPY_FAIL in that
case. But it should be vanishingly rare in practice, and it's not
clear if anyone is even using this API outside of numpy.)

And for the Python-level API, there is the option of performing the
final writeback when the iterator is exhausted. The downside to this
is that if someone only goes half-way through the iteration and then
aborts (e.g. by raising an exception), then the last round of
writeback won't happen. But maybe that's fine, or at least better than
forcing the use of 'with' blocks everywhere? If we do this then I
think we'd at least want to make sure that in the aborted case the
writeback really never happens, as opposed to happening at some random
later point when the Python iterator object is GCed. But I'd
appreciate it if anyone would express a preference between these :-)
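
To make the "writeback on exhaustion" option concrete, here is a rough
pure-Python sketch of the semantics I have in mind. It is only a toy
model: WritebackIter and double_all are hypothetical names invented for
this illustration, not part of numpy, and a plain list stands in for
the temporary copy that nditer would make.

```python
class WritebackIter:
    """Hypothetical toy model of an iterator that works on a temporary copy."""

    def __init__(self, target):
        self.target = target          # the caller's real data
        self.buffer = list(target)    # temporary copy that the loop modifies
        self.written_back = False
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self.buffer):
            # Exhaustion is the only place where the copy is written back.
            self.target[:] = self.buffer
            self.written_back = True
            raise StopIteration
        index = self._pos
        self._pos += 1
        return index


def double_all(target, fail_at=None):
    """Double every element through the iterator, optionally aborting early."""
    it = WritebackIter(target)
    try:
        for i in it:
            if i == fail_at:
                raise RuntimeError("aborted mid-iteration")
            it.buffer[i] *= 2
    except RuntimeError:
        pass  # the loop was abandoned, so exhaustion was never reached
    return it.written_back


data = [1, 2, 3]
completed = double_all(data)               # runs to exhaustion

partial = [1, 2, 3]
aborted = double_all(partial, fail_at=1)   # stops after the first element
```

In this sketch a loop that runs to completion updates the caller's
data, while an aborted loop leaves it untouched, and nothing is written
back later when the iterator object is collected.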

-n

-- 
Nathaniel J. Smith -- https://vorpus.org