[SciPy-dev] sparse comments and questions

Stefan van der Walt stefan at sun.ac.za
Sat Jul 14 11:10:16 EDT 2007


On Sat, Jul 07, 2007 at 06:49:43PM -0700, Nathan Bell wrote:
> 3) Should we allow modifications to CSR and CSC matrices that force
> O(N) updates/reallocations.  For instance, adding a new value into a
> CSR matrix is essentially as expensive as building a new one from
> scratch.  Could we instead prohibit such modifications and raise an
> exception informing the user that lil_matrix is the proper format for
> such operations?  Note that changing an existing matrix value is not
> so costly (typically O(1)), so not all calls to __setitem__ would be
> banned.

Instead of throwing an error, you could also make use of the warnings
module, i.e.

In [1]: import warnings

In [2]: class SparseOrderWarning(Warning): pass

In [3]: warnings.simplefilter('always', SparseOrderWarning)

In [4]: warnings.warn("Attempted O(n) operation, please use lil-format", SparseOrderWarning)
/home/stefan/lib/python2.5/site-packages/IPython/FakeModule.py:1: SparseOrderWarning: Attempted O(n) operation, please use lil-format

Then the user can always set these to errors by doing:

In [6]: warnings.simplefilter('error', SparseOrderWarning)

In [7]: warnings.warn("Attempted O(n) opreation, please use lil-format", SparseOrderWarning)
---------------------------------------------------------------------------
SparseOrderWarning                        Traceback (most recent call last)
...

> 6) When working with sparse matrices, do people expect the result to
> be of the same type?  For example, it's far more efficient to make
> transpose(csr_matrix) output a csc_matrix (as is currently done in
> SciPy) since that's a O(1) operation (as opposed to an O(n) cost
> otherwise).  So in general, should a method to return the result in
> most efficient output format, relying on the user to convert it to
> another format if so desired?  I.e. if you need a csr_matrix at a
> given point, then you should call .tocsr() to be sure.  If it's
> already a csr_matix, then no conversion will be done.

I don't think users should make any assumptions about the output type,
since we can't provide both optimal behaviour and identical
input/output types.  On the other hand, we should guarantee that an
operation is done in the fewest steps necessary.  Like you said, the
user is still free to explicitly force the output type.

> 7)  Where do functions that operate on sparse matrices belong?  For

Should we also look at adding iterative sparse solvers?

Regards
Stéfan



More information about the SciPy-Dev mailing list