[SciPy-user] arbitrary submatrices of a sparse matrix?

Nathan Bell wnbell at gmail.com
Wed Dec 26 00:42:21 EST 2007


David Warde-Farley <dwf <at> cs.toronto.edu> writes:

> > I've figured out how to do this for full matrices (either x
> > [[4,6,2,7],:][:,1,7,3]  or by playing with take()) but I'm at a loss
> > for how to do it with a sparse matrix, if it's even possible (not
> > sure how matlab pulls it off).
> 
> If there's some easy way to permute the rows and columns of a CSR/CSC  
> matrix then I would be able to get the job done with slices... Is  
> there? Or is there some way to do this that I'm not seeing?

You can use sparse matrix multiplication can to apply 
permutations and slice rows:

In [1]: from scipy import *
In [2]: from scipy.sparse import *
In [3]: A = csr_matrix(arange(12).reshape(4,3))
In [5]: A.todense()
Out[5]: 
matrix([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]])
In [7]: P = coo_matrix( ([1,1,1,1],[[0,1,2,3],[3,1,0,2]]) )
In [9]: (P*A).todense()
Out[9]: 
matrix([[ 9, 10, 11],
        [ 3,  4,  5],
        [ 0,  1,  2],
        [ 6,  7,  8]])
In [10]: P = coo_matrix( ([1,1,1,1],[[0,1,2,3],[0,0,3,3]]) )
In [11]: (P*A).todense()
Out[11]: 
matrix([[ 0,  1,  2],
        [ 0,  1,  2],
        [ 9, 10, 11],
        [ 9, 10, 11]])

The only downside of this approach is that it requires at least O(M) 
operations for an M,N sparse matrix in CSR format.  For permutations 
this is not a problem, and I believe you'll get near-optimal performance.
 However, if you only want to slice a few rows, then a special purpose 
approach is needed

I'm working on extending the work Robert has done to the sort of indexing 
you describe.  Keep in mind that the underlying matrix format limits the 
efficiency of slicing.  Slicing rows of a CSR or columns of a CSC is fast.
Slicing columns of a CSR is an O( A.nnz ) operation, which is essentially 
as bad as converting to CSC and then slicing.  I believe MATLAB uses CSC to
 represent sparse matrices, so one would expect A(:,10) to be faster than
A(10,:).  Here's a simple MATLAB example:

>> A = gallery('poisson',500);
>> tic; S = A(:,10); toc;
Elapsed time is 0.000060 seconds.
>> tic; S = A(10,:); toc;
Elapsed time is 0.063538 seconds.




More information about the SciPy-User mailing list