[SciPy-dev] sparsetools - new and (hopefully) improved!

Nathan Bell wnbell at gmail.com
Tue Jan 9 11:54:17 EST 2007


On 1/9/07, Robert Cimrman <cimrman3 at ntc.zcu.cz> wrote:
> My matrices tend to be much denser (tens to hundreds nonzeros per row),
> but the python FE code is not working right now (new gcc + swig issues
> after an upgrade) so let's call it a success for now... :)

Well, for an NxM matrix with R nonzeros per row, the old method is
O(N*R + N + M),
while the row-by-row sort is
O(N*R log R + M),
so asymptotically the old method wins (for sufficiently large R).
However, even for R = 1000 my testing shows the row-by-row sort to
still be noticeably faster (see below).  For the common case
(R < 100), the row-by-row sort is ~5-8x faster.  While these tests
aren't comprehensive, I expect the row-by-row sort to maintain an
advantage in all practical cases of interest.
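For anyone curious what "row-by-row sorting" means concretely, here is a
minimal NumPy sketch: sort the column indices within each row of a CSR
matrix via a per-row argsort, permuting the data alongside. The function
name and loop are illustrative only, not the actual sparsetools
implementation (which does this in C++).

```python
import numpy as np
from scipy import sparse

def sort_csr_indices(A):
    """Return a copy of CSR matrix A with column indices sorted
    within each row (illustrative sketch, not the sparsetools code)."""
    A = A.tocsr()
    indices = A.indices.copy()
    data = A.data.copy()
    for i in range(A.shape[0]):
        start, end = A.indptr[i], A.indptr[i + 1]
        # argsort of one row's R indices: the O(R log R) step per row
        order = np.argsort(indices[start:end])
        indices[start:end] = indices[start:end][order]
        data[start:end] = data[start:end][order]
    return sparse.csr_matrix((data, indices, A.indptr), shape=A.shape)
```

The Python-level loop over rows is of course slow; the point is only the
per-row O(R log R) structure of the algorithm being timed below.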



from scipy import *
N = 10000

#R = 1000
A = sparse.spdiags(rand(1000,N),arange(1000)-500,N,N)

In [5]: time B = A.tocsr().tocsc()
CPU times: user 2.12 s, sys: 0.84 s, total: 2.96 s
Wall time: 2.98
In [6]: time A.ensure_sorted_indices(True)
CPU times: user 0.89 s, sys: 0.00 s, total: 0.89 s
Wall time: 0.89


#R = 50
A = sparse.spdiags(rand(50,N),arange(50)-25,N,N)

In [30]: time B = A.tocsr().tocsc()
CPU times: user 0.11 s, sys: 0.06 s, total: 0.17 s
Wall time: 0.18
In [31]: time A.ensure_sorted_indices(True)
CPU times: user 0.03 s, sys: 0.00 s, total: 0.03 s
Wall time: 0.03
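(For readers on later SciPy releases: ensure_sorted_indices() was
eventually replaced by the in-place sort_indices() method on CSR/CSC
matrices. A rough modern equivalent of the R = 50 timing above, with
explicit imports instead of the star import, would be:)

```python
import numpy as np
from scipy import sparse

N = 10000
# Same banded construction as above, ~50 nonzeros per row
A = sparse.spdiags(np.random.rand(50, N), np.arange(50) - 25, N, N).tocsr()

# In-place row-by-row sort of column indices
# (modern replacement for ensure_sorted_indices(True))
A.sort_indices()
```

After the call, A.has_sorted_indices is set, so subsequent operations
can skip re-sorting.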



-- 
Nathan Bell wnbell at gmail.com
