[Numpy-discussion] alterNEP - was: missing data discussion round 2

Charles R Harris charlesr.harris at gmail.com
Thu Jun 30 21:10:04 EDT 2011


On Thu, Jun 30, 2011 at 6:02 PM, Matthew Brett <matthew.brett at gmail.com> wrote:

> Hi,
>
> On Thu, Jun 30, 2011 at 9:01 PM, Lluís <xscript at gmx.net> wrote:
> > Matthew Brett writes:
> >
> >> Hi,
> >> On Thu, Jun 30, 2011 at 7:27 PM, Lluís <xscript at gmx.net> wrote:
> >>> Matthew Brett writes:
> >>> [...]
> >>>> I'm afraid, like you, I'm a little lost in the world of masking,
> >>>> because I only need the NAs.  I was trying to see if I could come up
> >>>> with an API that picked up some of the syntactic convenience of NAs,
> >>>> without conflating NAs with IGNOREs.   I guess we need some feedback
> >>>> from the 'NA & IGNORE Share the API' (NISA?) proponents to get an idea
> >>>> of what we've missed.  @Mark, @Chuck, guys - what have we lost here by
> >>>> separating the APIs?
> >>>
> >>> As I tried to convey in my other mail, separating both will force
> >>> you to either:
> >>>
> >>> * Make a copy of the array before passing it to another routine
> >>>  (because the routine will assign np.NA but you still want the
> >>>  original data)
> >
> >> You have an array ``arr``.   The array does support NAs, but it doesn't
> >> have a mask.  You want to pass ``arr`` to another routine ``func``.
> >> You expect ``func`` to set NAs into the data but you don't want
> >> ``func`` to modify ``arr`` and you don't want to copy ``arr`` either.
> >> You are saying the following:
> >
> >> "with the fused API, I can make ``arr`` be a masked array, and pass it
> >> into ``func``, and know that, when func sets elements of arr to NA, it
> >> will only modify the mask and not the underlying data in ``arr``."
> >
> > Yes.
> >
> >
> >> It does seem to me this is a very obscure case.  First, ``func`` is
> >> modifying the array but you want an unmodified array back.  Second,
> >> you'll have to do some view trick to recover the not-NA case to arr,
> >> when it comes back.
> >
> > I know, the example is just silly and convoluted.
> >
> >
> >> It seems to me, that what ``func`` should do, if it wants you to be
> >> able to unmask the NAs, is to make a masked array view of ``arr``, and
> >> return that.   And indeed the simplicity of the separated API
> >> immediately makes that clear - in my view at least.
> >
> > I agree on this example. My only concern is on the API's ability to
> > foresee as many future use-cases as possible, without impacting
> > performance.
>
> But, of course, there's a great danger in trying to cover every
> possible use-case.
>
> My argument is that the kind of cases that you are describing are - I
> believe - very rare and are even a little difficult to make up.  Is
> that fair?
>
> To my mind, the separate NA and IGNORE API is easier to understand and
> explain.   If that isn't true, please do say, and say why - because
> that point is key.
>
>
I think the main problem is that they aren't separate: one takes place in a
view of an unmasked array, the other starts with a masked array. These
aren't different in mechanism, they are just different in workflow. And I
think they fit in well with the view idea.
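
A rough sketch of the two workflows, using the existing numpy.ma module as
a stand-in. This is an assumption for illustration only: it is not the
proposed np.NA API, and the body of ``func`` is made up. It only shows that
a masked view of an unmasked array and an array that starts out masked go
through the same mask mechanism, and that masking through the view leaves
the original data alone.

```python
import numpy as np
import numpy.ma as ma


def func(arr):
    """Hypothetical routine: mark negative entries as missing, return a masked view."""
    view = ma.masked_array(arr, mask=False)  # copy=False by default, so data is shared
    view[arr < 0] = ma.masked                # only the mask changes, not arr's data
    return view


# Workflow 1: start from an unmasked array and take a masked view of it.
arr = np.array([1.0, -2.0, 3.0])
out = func(arr)

# Workflow 2: start with a masked array in the first place.
marr = ma.masked_array([1.0, -2.0, 3.0], mask=[False, True, False])
```

In both cases reductions such as ``out.sum()`` and ``marr.sum()`` skip the
masked element, while ``arr`` is unchanged and the hidden value is still
reachable through ``out.data``.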


> If it is true that the separate API is clearer, then the benefit in
> terms of power and extensibility has to be large, in order to go for
> the fused API.
>
>
Chuck