[SciPy-Dev] RFC: sparse DOK array
Evgeni Burovski
evgeny.burovskiy at gmail.com
Mon Mar 28 17:37:51 EDT 2016
Thanks Stephan,
> A few other small things I'd like to see:
> - Support for slicing, even if it's expensive.
Slicing is on the TODO list. Only needs a bit of plumbing work.
> - A strict way to set the shape without automatic expansion, if desired
> (e.g., if shape is provided in the constructor).
You can set the initial shape in the constructor. Or do you mean a flag
to freeze the shape once it's set, so that
`__setitem__(out-of-bounds)` raises an error?
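
To make the two behaviours concrete, here is a minimal sketch of what such a flag could look like. The `ToyDOK` class and its `strict` argument are hypothetical names made up for this illustration; this is not the sparr API, just a plain dict-of-keys container under those assumptions.

```python
class ToyDOK:
    """Hypothetical 2-D dict-of-keys container; NOT the sparr API."""

    def __init__(self, shape=(0, 0), fill_value=0, strict=False):
        self.shape = tuple(shape)
        self.fill_value = fill_value
        self.strict = strict  # assumption: True freezes the shape
        self._data = {}

    def __setitem__(self, key, value):
        i, j = key
        if i < 0 or j < 0:
            raise IndexError("negative indices are not supported in this sketch")
        rows, cols = self.shape
        if i >= rows or j >= cols:
            if self.strict:
                raise IndexError(
                    f"index {key} is out of bounds for shape {self.shape}"
                )
            # Non-strict mode: grow the shape to fit the new element.
            self.shape = (max(rows, i + 1), max(cols, j + 1))
        self._data[(i, j)] = value

    def __getitem__(self, key):
        i, j = key
        if not (0 <= i < self.shape[0] and 0 <= j < self.shape[1]):
            raise IndexError(f"index {key} is out of bounds for shape {self.shape}")
        # Missing entries read back as the fill value.
        return self._data.get((i, j), self.fill_value)


# Default mode: assigning out of bounds expands the shape.
growing = ToyDOK()
growing[2, 3] = 1.0

# Strict mode: the shape given in the constructor is frozen.
frozen = ToyDOK(shape=(2, 2), strict=True)
try:
    frozen[5, 5] = 1.0
    frozen_raised = False
except IndexError:
    frozen_raised = True
```

With the flag off, `growing.shape` ends up as `(3, 4)`; with it on, the out-of-bounds assignment raises and the shape stays `(2, 2)`.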
> - Default to the dtype of the fill_value. NumPy does this for np.full.
Thanks for the suggestion --- implemented!
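
For reference, this is the `np.full` behaviour being mirrored: when no `dtype` is given, the result takes the dtype that NumPy infers for the fill value. The snippet only uses plain NumPy and says nothing about how the sparse array does it internally.

```python
import numpy as np

# With no explicit dtype, np.full infers it from the fill value.
a = np.full((2, 2), 1.5)       # Python float   -> float64
b = np.full((2, 2), True)      # Python bool    -> bool
c = np.full((2, 2), 1 + 2j)    # Python complex -> complex128

# An explicit dtype still takes precedence over the fill value.
d = np.full((2, 2), 1.5, dtype=np.float32)
```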
>> * Data types and casting rules. For now, I basically piggy-back on
>> numpy's rules.
>> There are several slightly different ones (numba has one?), and there
>> might be an opportunity to simplify the rules. OTOH, inventing one more
>> subtly different set of rules might be a bad idea.
>
>
> Yes, please follow NumPy.
One thing I'm wondering is the numpy rule that scalars never upcast
arrays. Is it something people actually rely on? [In my experience, I
only had to work around it, but my experience might be singular.]
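
A small illustration of the rule in question, using plain NumPy. The cases below are ones where, as far as I know, the result dtype is the same across NumPy versions; the finer details of scalar promotion have changed between releases, so treat this as a sketch rather than a full statement of the rules.

```python
import numpy as np

x32 = np.ones(3, dtype=np.float32)

# A Python scalar of the same kind does not upcast the array.
same_kind = x32 * 2.0                        # stays float32
small_int = np.arange(3, dtype=np.int8) + 1  # stays int8

# A scalar of a "higher" kind (float vs. integer array) does change the dtype.
kind_change = np.arange(3, dtype=np.int32) + 1.5  # becomes float64

# Array-with-array arithmetic upcasts as usual.
array_array = x32 + np.ones(3, dtype=np.float64)  # becomes float64
```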
> You could actually use a mix of __array_prepare__ and __array_wrap__ to make
> (non-generalized) ufuncs work, e.g., for functions like np.sin:
>
> - In __array_prepare__, return the non-fill values of the array concatenated
> with the fill value.
> - In __array_wrap__, reshape all but the last element to build a new sparse
> array, using the last element for the new fill value.
>
> This would be a neat trick and get you most of what you could hope for from
> __numpy_ufunc__.
This is really neat indeed! I've flagged it in
https://github.com/ev-br/sparr/issues/35
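
The core of the trick can be sketched without the `__array_prepare__` / `__array_wrap__` hooks themselves: pack the stored values together with the fill value, apply the ufunc once, and split the result again. The helper name `apply_ufunc_sparse` is hypothetical and only covers elementwise, single-input ufuncs; wiring it into the hooks is left out.

```python
import numpy as np


def apply_ufunc_sparse(ufunc, values, fill_value):
    """Apply an elementwise ufunc to the stored values and the fill value.

    Hypothetical helper: returns (new_values, new_fill_value).
    """
    # Pack the non-fill values and the fill value into one dense 1-D array,
    # with the fill value as the last element.
    packed = np.concatenate(
        [np.asarray(values).ravel(), np.atleast_1d(fill_value)]
    )
    out = ufunc(packed)
    # All but the last element are the new stored values;
    # the last element is the new fill value.
    return out[:-1], out[-1]


stored = np.array([0.0, np.pi / 2])
new_values, new_fill = apply_ufunc_sparse(np.cos, stored, 0.0)
```

Note that with `np.cos` and a fill value of 0 the new fill value is 1, which is exactly why the fill value has to go through the ufunc as well.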
At the moment, I'm dealing with something much less cool, which I
suspect I'm not the first to run into: given m, a MapArray, and csr, a
scipy.sparse matrix,
- m * csr produces a MapArray holding the result of the elementwise
multiplication, but
- csr * m fails with a dimension mismatch error when the dimensions are
OK for elementwise multiplication but not for matrix multiplication. The
failure is somewhere in the scipy.sparse code.
I tried playing with __array_priority__, but so far I did not manage
to convince scipy.sparse matrices to defer cleanly to the right-hand
multiplier (left-hand multiplier is OK).
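
For context, this is the deferral behaviour one would hope for, shown with two toy classes. The names `Sparse` and `Mapped` and the priority check in `Sparse.__mul__` are assumptions for illustration only; this is not how scipy.sparse actually dispatches. It just shows the Python mechanism: if the left operand's `__mul__` returns `NotImplemented`, Python tries the right operand's `__rmul__`.

```python
class Sparse:
    """Toy stand-in for the left-hand operand; NOT scipy.sparse."""

    __array_priority__ = 10.0

    def __mul__(self, other):
        # Assumption: defer to operands that advertise a higher priority.
        if getattr(other, "__array_priority__", 0.0) > self.__array_priority__:
            return NotImplemented
        return "Sparse.__mul__"


class Mapped:
    """Toy stand-in for MapArray."""

    __array_priority__ = 20.0

    def __mul__(self, other):
        return "Mapped.__mul__"

    def __rmul__(self, other):
        return "Mapped.__rmul__"


left = Mapped() * Sparse()    # handled by Mapped.__mul__
right = Sparse() * Mapped()   # Sparse defers, so Mapped.__rmul__ runs
```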
Cheers,
Evgeni