[SciPy-dev] feedback on scipy.sparse
Nathan Bell
wnbell at gmail.com
Thu Dec 13 11:23:23 EST 2007
On Dec 13, 2007 6:31 AM, Matthieu Brucher <matthieu.brucher at gmail.com> wrote:
> Not exactly.
> I have something like :
> a = [[0, 2, 5], [3, 4], [4, 2]]
> and then some data :
> data = [1, 2, 3, 4, 5, 6, 7] or [[1, 2, 3], [4, 5], [6, 7]]
> and then the matrix would be :
>
> [[1, 0, 2, 0, 0, 3]
> [0, 0, 0, 4, 5, 0]
> [0, 0, 7, 0, 6, 0]]
Your list of lists nearly matches the lil_matrix format (which is an
array of lists). Below is the code for lil_matrix.tocsr(), which is
the most efficient way I've found to convert that format to CSR:
http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/sparse.py
    def tocsr(self):
        """ Return Compressed Sparse Row format arrays for this matrix.
        """

        indptr = asarray([len(x) for x in self.rows], dtype=intc)
        indptr = concatenate( ( array([0],dtype=intc), cumsum(indptr) ) )

        nnz = indptr[-1]

        indices = []
        for x in self.rows:
            indices.extend(x)
        indices = asarray(indices,dtype=intc)

        data = []
        for x in self.data:
            data.extend(x)
        data = asarray(data,dtype=self.dtype)

        return csr_matrix((data, indices, indptr), dims=self.shape)

Essentially, it computes the row pointer (indptr) from the row lengths
first and then flattens the per-row index and data lists. If you find
something faster, let me know.
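
For the list-of-lists input in the quoted message, the same two steps
can be sketched as a standalone function. This is only a sketch under
some assumptions: lists_to_csr and its n_cols argument are hypothetical
names (not part of scipy), the values are given in the nested form, and
it targets a recent SciPy, where the constructor keyword is shape
rather than the dims used in the trunk code above.

```python
import numpy as np
from scipy.sparse import csr_matrix


def lists_to_csr(cols, vals, n_cols):
    """Build a CSR matrix from per-row column indices and per-row values.

    Hypothetical helper for illustration; not a scipy function.
    """
    # Row pointer: cumulative number of entries per row, with a leading 0.
    counts = np.array([len(row) for row in cols], dtype=np.intc)
    indptr = np.concatenate(
        (np.zeros(1, dtype=np.intc), np.cumsum(counts, dtype=np.intc))
    )

    # Flatten the nested lists into the CSR column-index and data arrays.
    indices = np.array([j for row in cols for j in row], dtype=np.intc)
    data = np.array([v for row in vals for v in row])

    return csr_matrix((data, indices, indptr), shape=(len(cols), n_cols))


a = [[0, 2, 5], [3, 4], [4, 2]]
values = [[1, 2, 3], [4, 5], [6, 7]]

m = lists_to_csr(a, values, 6)
dense = m.toarray()
print(dense)
# [[1 0 2 0 0 3]
#  [0 0 0 4 5 0]
#  [0 0 7 0 6 0]]
```

Note that the last row lists its columns out of order ([4, 2]). The
dense result is still right, but the CSR indices are then unsorted, so
calling m.sort_indices() afterwards may be worthwhile if later
operations expect sorted rows.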
--
Nathan Bell wnbell at gmail.com