[SciPy-Dev] Sparse Boolean Specification

Blake Griffith blake.a.griffith at gmail.com
Tue May 21 01:27:04 EDT 2013


Thanks for the feedback, Pauli. Item by item:

* I wasn't sure how much to include about the actual implementation, so I
left it out and instead tried to explain how interactions should appear to
the user. To answer the question, which is related to the fourth item: I'm
still thinking about how to reduce the code duplication we currently have.
For new functionality, though, I will use a few generalized binop functions
that work row by row or column by column for broadcasting, plus a binop
function for matrices of the same size.

Some of the code that currently interfaces with the sparsetools routines
can be replaced, and we can let SWIG typemaps handle more of the conversion
instead of doing it in Python. I'm still not 100% sure this will work,
though.

I still need to do a lot of thinking about this. I'll add what I have to
the spec.
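
As a rough illustration of the "binop function for matrices of the same
size" idea, here is a minimal pure-Python sketch. It is not the sparsetools
code; the function name and the (indptr, indices, data) triple layout are
assumptions made for the example. It only visits positions where at least
one operand has a stored entry, so it is only valid for operations where
op(0, 0) is false (e.g. !=, <, >).

```python
import operator


def csr_binop_same_shape(a, b, op):
    """Apply op elementwise to two CSR matrices of equal shape.

    a and b are (indptr, indices, data) triples whose column indices are
    sorted within each row. Missing entries are treated as 0, and results
    that are 0/False are not stored in the output.
    """
    a_ptr, a_idx, a_dat = a
    b_ptr, b_idx, b_dat = b
    out_ptr, out_idx, out_dat = [0], [], []
    for row in range(len(a_ptr) - 1):
        i, i_end = a_ptr[row], a_ptr[row + 1]
        j, j_end = b_ptr[row], b_ptr[row + 1]
        # Merge the two sorted lists of column indices for this row.
        while i < i_end or j < j_end:
            col_a = a_idx[i] if i < i_end else None
            col_b = b_idx[j] if j < j_end else None
            if col_b is None or (col_a is not None and col_a < col_b):
                col, x, y = col_a, a_dat[i], 0   # entry only in a
                i += 1
            elif col_a is None or col_b < col_a:
                col, x, y = col_b, 0, b_dat[j]   # entry only in b
                j += 1
            else:
                col, x, y = col_a, a_dat[i], b_dat[j]  # entry in both
                i += 1
                j += 1
            result = op(x, y)
            if result:
                out_idx.append(col)
                out_dat.append(result)
        out_ptr.append(len(out_idx))
    return out_ptr, out_idx, out_dat


# A = [[1, 0, 2], [0, 0, 3]] and B = [[1, 5, 0], [0, 0, 4]] in CSR form.
A = ([0, 2, 3], [0, 2, 2], [1, 2, 3])
B = ([0, 2, 3], [0, 1, 2], [1, 5, 4])
not_equal = csr_binop_same_shape(A, B, operator.ne)
```

Passing operator.eq here would give the wrong answer, because every
position where both matrices are zero would have to become True. That is
the "mostly true" case discussed further down.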

* In the spec I tried to define when a sparse matrix should be returned
and when a dense matrix should be returned, based on what I thought was
optimal. Basically, broadcasted operations and operations on sparse
matrices of the same size should return sparse matrices. This will be
similar to how the numpy/ndarray spec will be, but the binops are boolean
operations.

* I'm still playing with SWIG, and it is still a bit opaque to me, but it
seems possible that with numpy's typemaps I can get the desired behavior
(e.g. %numpy_typemaps(bool, NPY_UINT, int)). See the numpy.i SWIG docs here:
http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html#other-common-types-bool
I'll add this to the spec.

* We need row by row and column by column functions. Some of these exist
for things like multiplication (see csr_scale_rows
https://github.com/scipy/scipy/blob/master/scipy/sparse/sparsetools/csr.h#L88).
Working from these, I could make functions that would perform some
operation based on a string they are passed.
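
To make the "operation based on a string" idea concrete, here is a small
pure-Python sketch in the spirit of a row-scaling routine. The function
name, the operation table, and its keys are all hypothetical and are not
taken from sparsetools. It applies the chosen operation between each stored
entry and the value for that entry's row, leaving the sparsity structure
unchanged.

```python
import operator

# Hypothetical name-to-function table; the keys are illustrative only.
_ROW_OPS = {
    "mul": operator.mul,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
}


def csr_op_rows(indptr, data, row_values, op_name):
    """Return new CSR data with op(data[k], row_values[row]) per entry.

    Only stored entries are visited, so this is only correct for
    operations where op(0, v) is 0/False for the values v in use.
    """
    op = _ROW_OPS[op_name]
    out = []
    for row in range(len(indptr) - 1):
        for k in range(indptr[row], indptr[row + 1]):
            out.append(op(data[k], row_values[row]))
    return out


# Two rows: row 0 stores [1, 2], row 1 stores [3].
indptr = [0, 2, 3]
data = [1, 2, 3]
scaled = csr_op_rows(indptr, data, [10, 100], "mul")
greater = csr_op_rows(indptr, data, [1, 5], "gt")
```

Note that the result can contain explicit False entries, which a real
implementation would probably want to prune.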

* I have not thought about boolean indexing. I'll look at this tomorrow.
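
As a starting point, Pauli's suggestion of going through the nonzero
indices could look roughly like the sketch below. It is pure Python, with
the sparse boolean mask stored as a set of (row, col) coordinates; that
representation and both function names are assumptions made for the
example.

```python
def nonzero(mask_coords):
    """Return (rows, cols) of the True entries, in row-major order."""
    coords = sorted(mask_coords)
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return rows, cols


def index_with_sparse_mask(dense, mask_coords):
    """Select the entries of a list-of-lists matrix where the mask is True."""
    rows, cols = nonzero(mask_coords)
    return [dense[r][c] for r, c in zip(rows, cols)]


dense = [[1, 2], [3, 4]]
mask = {(1, 0), (0, 1)}   # True at (0, 1) and (1, 0)
selected = index_with_sparse_mask(dense, mask)
```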

* I was not planning on implementing the mostly-true matrix type, but
instead letting users shoot themselves in the foot if they really want to
have a mostly-true sparse matrix... This is not really a Pythonic
philosophy, though.
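
For reference, the mostly-true idea amounts to storing the False entries
instead of the True ones. The sketch below is purely hypothetical (the
class, its methods, and the dict-of-keys inputs are invented for
illustration, and none of it is planned for the spec); it only shows why
an == comparison of two sparse matrices fits that representation.

```python
class MostlyTrue:
    """Boolean matrix that stores only the coordinates that are False."""

    def __init__(self, shape, false_coords):
        self.shape = shape
        self.false_coords = set(false_coords)

    def __getitem__(self, key):
        # Any coordinate that is not stored is implicitly True.
        return key not in self.false_coords

    def count_true(self):
        return self.shape[0] * self.shape[1] - len(self.false_coords)


def sparse_eq(a, b, shape):
    """Compare two dict-of-keys matrices mapping (row, col) -> value."""
    differing = {k for k in set(a) | set(b) if a.get(k, 0) != b.get(k, 0)}
    return MostlyTrue(shape, differing)


a = {(0, 0): 1, (1, 2): 5}
b = {(0, 0): 1, (1, 2): 7, (2, 2): 3}
equal = sparse_eq(a, b, (3, 3))
```

Here only two coordinates are stored for a 3x3 result, whereas a regular
sparse boolean matrix would need to store the seven True entries.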


On Sun, May 19, 2013 at 12:25 PM, Pauli Virtanen <pav at iki.fi> wrote:

> Hi,
>
> 17.05.2013 07:45, Blake Griffith kirjoitti:
> > I've been writing up how I think adding support for boolean operations
> > and the bool dtype should work. You can read the document on my GitHub,
> > here
> https://github.com/cowlicks/scipy-sparse-boolean-spec/blob/master/spec.rst
>
> I think the following things need some thought:
>
> - How to avoid code duplication between the different operations?
>
> - Dense vs. sparse matrix as a return value. Another undecided issue,
>   but it might make sense to make the return type undefined in the
>   sense that the most efficient format is used in all cases.
>
> - Figuring out how to make SWIG automatically pick the right routines
>   if input is bool-type, or whether this needs special-casing logic
>   on the Python side.
>
> - Figuring out what sort of routines would be needed to be added to
>   sparsetools for broadcasting to work.
>
> Also:
>
> - Indexing with sparse boolean matrices? This probably boils down to
>   using .nonzero() to get the nonzero indices.
>
> - There was the suggestion to use a mostly-true matrix as an output
>   for the "mostly True" operations. I'm undecided on this idea.
>
>
>         Pauli
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>