[Numpy-discussion] Scipy 2017 NumPy sprint

josef.pktd at gmail.com
Sat Jul 8 05:58:13 EDT 2017


On Fri, Jul 7, 2017 at 6:42 PM, Ryan May <rmay31 at gmail.com> wrote:

> On Fri, Jul 7, 2017 at 4:27 PM, Marten van Kerkwijk <
> m.h.vankerkwijk at gmail.com> wrote:
>
>> Hi All,
>>
>> I doubt I'm really the last one thinking ndarray subclassing is a good
>> idea, but as that was stated, I feel I should at least pipe in. It
>> seems to me there is both a perceived problem -- with the two
>> subclasses that numpy provides -- `matrix` and `MaskedArray` -- both
>> being problematic in ways that seem to me to have very little to do
>> with subclassing being a bad idea, and a real one following from the
>> fact that numpy was written at a time when python's inheritance system
>> was not as well developed as it is now.
>>
>> Though based on my experience with Quantity, I'd also argue that the
>> more annoying problems are not so much with `ndarray` itself, but
>> rather with the helper functions.  Ufuncs were not so bad -- they
>> really just needed a better override mechanism, which __array_ufunc__
>> now provides -- but for quite a few of the other functions subclassing
>> was clearly an afterthought. Indeed, `MaskedArray` provides a nice
>> example of this, with its many special `np.ma.<function>` routines,
>> providing huge duplication and thus lots of duplicated bugs (which
>> Eric has been patiently fixing...). Indeed, `MaskedArray` is also a
>> much better example than `ndarray` of a class that is really hard to
>> subclass (even though, conceptually, it should be a far easier one).
>>
>> All that said, duck-type arrays make a lot of sense, and e.g. the
>> slicing and shaping methods are easily emulated, especially if one's
>> underlying data are stored in `ndarray`. For astropy's version of a
>> relevant mixin, see
>> http://docs.astropy.org/en/stable/api/astropy.utils.misc.ShapedLikeNDArray.html
>
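To make the duck-typing point above concrete, here is a minimal sketch of a container that stores its data in an `ndarray`, emulates slicing and reshaping by delegation, and hooks into ufuncs through `__array_ufunc__`. The class name `Wrapped` and its `label` attribute are hypothetical and exist only for illustration; this is not astropy's mixin, and it handles only plain single-output ufunc calls.

```python
import numpy as np

class Wrapped:
    """Hypothetical duck-type container that keeps its data in an ndarray."""

    def __init__(self, data, label=""):
        self.data = np.asarray(data)
        self.label = label

    @property
    def shape(self):
        return self.data.shape

    def __getitem__(self, item):
        # Slicing is delegated to the underlying ndarray.
        return Wrapped(self.data[item], self.label)

    def reshape(self, *shape):
        # Shape manipulation is delegated in the same way.
        return Wrapped(self.data.reshape(*shape), self.label)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Only plain calls of single-output ufuncs without `out` are handled here.
        if method != "__call__" or "out" in kwargs or ufunc.nout != 1:
            return NotImplemented
        raw = [x.data if isinstance(x, Wrapped) else x for x in inputs]
        return Wrapped(ufunc(*raw, **kwargs), self.label)

w = Wrapped(np.arange(6.0), label="flux")
r = np.sqrt(w.reshape(2, 3)[1])   # ufunc call is routed through __array_ufunc__
print(type(r).__name__, r.shape, r.label)
```

Reductions, `out=` arguments and non-ufunc helpers such as `np.concatenate` are deliberately left out; those are the functions for which an override is harder to provide.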
>
> My biggest problem with subclasses as they exist now is that they don't
> survive the first encounter with np.asarray (or np.array). So much code
> written to work with numpy uses that as a bandaid (for e.g. handling lists)
> that in my experience it's 50/50 whether passing a subclass to a function
> will actually behave as expected--even if there's no good reason it
> shouldn't.
>
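A small sketch of the effect described above, assuming a hypothetical subclass `Tagged` that carries one extra attribute. A function that coerces its input with `np.asarray` returns a base `ndarray`, while the same function written with `np.asanyarray` lets the subclass through.

```python
import numpy as np

class Tagged(np.ndarray):
    """Hypothetical ndarray subclass that carries a `tag` attribute."""

    def __new__(cls, data, tag=None):
        obj = np.asarray(data).view(cls)
        obj.tag = tag
        return obj

    def __array_finalize__(self, obj):
        # Called for views and results; inherit the tag from the parent if any.
        self.tag = getattr(obj, "tag", None)

def center_strict(x):
    # Common defensive idiom: coerce lists (and everything else) to ndarray.
    x = np.asarray(x)
    return x - x.mean()

def center_lenient(x):
    # asanyarray still converts lists but passes ndarray subclasses through.
    x = np.asanyarray(x)
    return x - x.mean()

t = Tagged([1.0, 2.0, 3.0], tag="metres")
strict = center_strict(t)
lenient = center_lenient(t)
print(type(strict).__name__)                 # base class, tag is gone
print(type(lenient).__name__, lenient.tag)   # subclass and tag survive
```

Whether passing the subclass through is safe depends on how closely it follows `ndarray` semantics, which is the concern raised in the reply below.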

As a downstream developer:
The problem is that we cannot trust any array subclass or anything that
pretends to be like an array. Even asarray already lets too many
things through.
We would need an indication or guarantee that an object quacks in the
correct way; otherwise it is very difficult to write code that works
for various subclasses.

(Even in the simplest case, writing code beyond a few lines that works
for both matrix and arrays gets difficult.)
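A two-line illustration of why this is hard, using a hypothetical helper written with plain 2-D arrays in mind. The same code silently computes something different for `np.matrix`, because `*` and integer indexing have different semantics there.

```python
import warnings
import numpy as np

def square_and_first_row(x):
    """Hypothetical helper written with plain 2-D arrays in mind."""
    squared = x * x      # elementwise for ndarray, matrix product for np.matrix
    first = x[0]         # 1-D row for ndarray, still 2-D (1, n) for np.matrix
    return squared, first

a = np.array([[1, 2], [3, 4]])
with warnings.catch_warnings():
    # np.matrix emits a PendingDeprecationWarning in recent NumPy releases.
    warnings.simplefilter("ignore", PendingDeprecationWarning)
    m = np.matrix(a)

sq_a, row_a = square_and_first_row(a)
sq_m, row_m = square_and_first_row(m)
print(sq_a.tolist(), row_a.shape)   # [[1, 4], [9, 16]] (2,)
print(sq_m.tolist(), row_m.shape)   # [[7, 10], [15, 22]] (1, 2)
```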

scipy.stats.mstats is largely not code duplication; it needs to handle the
mask (although the nan versions in scipy.stats are catching up).
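As a rough NumPy-only sketch of the distinction (not the scipy.stats code itself): a mask-aware reduction and a nan-aware reduction agree on floating-point data, but only the mask works for integer data, which cannot hold NaN. The sentinel value -999.0 is a made-up example.

```python
import numpy as np

raw = np.array([1.0, 2.0, -999.0, 4.0])   # -999.0 is a hypothetical "missing" sentinel

# Mask-based route: values stay in place, a boolean mask marks what to skip.
masked = np.ma.masked_array(raw, mask=(raw == -999.0))
mean_masked = masked.mean()

# NaN-based route: missing entries become NaN and a nan-aware reduction is used.
with_nan = np.where(raw == -999.0, np.nan, raw)
mean_nan = np.nanmean(with_nan)

# The mask also works for integer data, where NaN is not representable.
counts = np.ma.masked_array([3, 5, 7], mask=[False, True, False])
mean_counts = counts.mean()

print(mean_masked, mean_nan, mean_counts)
```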

Josef





>
>
> Ryan
>
> --
> Ryan May
>
>