[Numpy-discussion] HPC missing data - was: NA/Missing Data Conference Call Summary

Mark Wiebe mwwiebe at gmail.com
Wed Jul 6 15:43:27 EDT 2011


On Wed, Jul 6, 2011 at 8:12 AM, Dag Sverre Seljebotn <
d.s.seljebotn at astro.uio.no> wrote:

> <snip>
> I just commented on the "prevent direct API access to the masking array"
> part -- I'm hoping direct access by external code to the underlying
> implementation details will be allowed, at some point.
>

I think direct or nearly direct access needs to be in right away, unless
we're fairly sure that we will change low level implementation details in
the near future. I've added "Python API" and "C API" definitions for us to
use to try and clear up this kind of potential confusion.

-Mark


> What I'm saying is that Mark's proposal is more flexible. Say for the
> sake of the argument that I have two codes I need to interface with:
>
>  - Library A is written in Fortran and uses a separate (explicit) mask
> array for NA
>
>  - Library B runs on a GPU and uses a bit pattern for NA
>
> Mark's proposal then comes closer to allowing me to wrap both codes
> using NumPy, since it supports both implementation mechanisms. Sure, it
> would need a separate NEP down the road to extend it, but it goes in the
> right direction for this to happen.
>
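
For reference, the two storage mechanisms described above can be roughly
approximated today with numpy.ma. This is only a sketch, not the API from the
NEP: the arrays stand in for the buffers the two hypothetical libraries would
return, and the "nonzero mask byte means missing" convention for library A is
an assumption. The 0xFFFFCACA pattern is the one used later in this message.

```python
import numpy as np

# Stand-in for "library A": data plus a separate, explicit mask buffer.
# Assumption: a nonzero mask byte marks a missing element.
data_a = np.array([1.0, 2.0, 3.0, 4.0])
mask_a = np.array([0, 1, 0, 0], dtype=np.uint8)
wrapped_a = np.ma.masked_array(data_a, mask=mask_a.astype(bool))

# Stand-in for "library B": a reserved bit pattern stored in the data
# buffer itself marks a missing element.
NA_PATTERN = np.uint32(0xFFFFCACA)
data_b = np.array([10, 0xFFFFCACA, 30, 40], dtype=np.uint32)
wrapped_b = np.ma.masked_equal(data_b, NA_PATTERN)

# Both wrappers now expose the same masked-array interface, so reductions
# skip the missing element regardless of how it was originally stored.
sum_a = wrapped_a.sum()
sum_b = wrapped_b.sum()
```

Note that numpy.ma always ends up with a separate boolean mask, so the
bit-pattern case is converted on the way in rather than supported natively.
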
> As for NA vs. IGNORE, I still think two types are too few. One should
> allow for 255 different NA-values, each with user-defined behaviour.
> Again, Mark's proposal then makes a good start on that, even if more
> work would be needed to make it happen.
>
> I.e., in my perfect world I'd do this to wrap library A (Cythonish
> pseudo-code):
>
> def call_lib_A():
>     ...
>     lib_A_function(arraybuf, maskbuf, ...)
>     DOG_ATE_IT = np.NA("DOG_ATE_IT", value=42, behaviour="raise")
>     # behaviour could also be "zero", "invalid"
>     missing_value_map = {0xAF: np.NA, 0x43: np.IGNORE, 0xF0: DOG_ATE_IT}
>     result = np.PyArray_CreateArrayFromBufferWithMaskBuffer(
>         arraybuf, maskbuf, missing_value_map, ...)
>     return result
>
> def call_lib_B():
>     lib_B_function(arraybuf, ...)
>     missing_value_patterns = {0xFFFFCACA: np.NA}
>     result = np.PyArray_CreateArrayFromBufferWithBitPattern(
>         arraybuf, missing_value_patterns, ...)
>     return result
>
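
To make the "user-defined behaviour" idea in the pseudo-code above a bit more
concrete, here is a minimal pure-Python sketch. Everything in it is
hypothetical: NACode, MISSING_VALUE_MAP and na_sum are illustration names, not
NumPy API, and the meanings given to "raise", "zero" and "invalid" (error out,
contribute nothing, poison the result with NaN) are one possible
interpretation. Treating NA as "invalid" and IGNORE as "zero" is likewise an
assumption.

```python
import math
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class NACode:
    """Hypothetical descriptor for one kind of missing value."""
    name: str
    behaviour: str  # one of "raise", "zero", "invalid"


# Hypothetical mapping from a library's mask byte to a kind of NA.
# The byte values are the ones used in the pseudo-code above.
MISSING_VALUE_MAP = {
    0xAF: NACode("NA", "invalid"),
    0x43: NACode("IGNORE", "zero"),
    0xF0: NACode("DOG_ATE_IT", "raise"),
}


def na_sum(data, codes, value_map=MISSING_VALUE_MAP):
    """Sum `data`, handling each nonzero mask byte per its mapped behaviour."""
    total = 0.0
    for x, code in zip(data.tolist(), codes.tolist()):
        if code == 0:  # assumption: 0 means "value present"
            total += x
            continue
        na = value_map[code]
        if na.behaviour == "raise":
            raise ValueError(f"encountered missing value {na.name}")
        if na.behaviour == "invalid":
            return math.nan  # NaN stands in for a propagated NA result
        # "zero": the element contributes nothing to the sum
    return total


data = np.array([1.0, 2.0, 3.0])
ignored = na_sum(data, np.array([0, 0x43, 0], dtype=np.uint8))
poisoned = na_sum(data, np.array([0, 0xAF, 0], dtype=np.uint8))
```

A real implementation would do this per-ufunc in C rather than in a Python
loop; the point is only that one mask byte per element leaves room for up to
255 distinct codes, each looked up in a table like the one above.
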
> Hope that is clearer. Again, my intention is not to suggest even more
> work at the present stage, just to state some advantages with the
> general direction of Mark's proposal.
>
> Dag Sverre