[SciPy-dev] Ideas for scipy.sparse?

Mon Apr 14 19:55:19 EDT 2008

On Mon, Apr 14, 2008 at 1:36 PM, Viral Shah
<vshah at interactivesupercomputing.com> wrote:
>  Some people pointed out that this tends to be slow when called in for
>  loops. I couldn't figure out what the cause of this was. Is it that
>  for loops in python are generally slow, or is it that indexing
>  individual elements in sparse data structures is slow or are python's
>  data structures slow, or some combination of all three ?

Currently all accesses of the form A[1,2] are implemented in pure Python
(even for CSR/CSC).  Looping in Python carries a certain amount of
overhead itself, but I'd wager most of the time is spent doing the
actual indexing.

We can speed up the case A[[1,2,3],[5,6,7]] by sending the index
arrays over to sparsetools.  However, as you point out, individual
lookups will always be slow.
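
To make the two access patterns concrete, here is a rough sketch that
compares one-at-a-time lookups with a single call that passes both index
arrays.  The helper names lookup_loop and lookup_batched are made up for
illustration, and the sketch assumes a SciPy version in which
A[rows, cols] accepts a pair of index arrays.  It only shows that both
forms return the same values; it makes no claim about how either one is
implemented internally or how fast it is.

```python
import numpy as np
from scipy import sparse


def lookup_loop(A, rows, cols):
    """Fetch A[i, j] one element at a time (hypothetical helper)."""
    return np.array([A[i, j] for i, j in zip(rows, cols)])


def lookup_batched(A, rows, cols):
    """Fetch all requested elements with one indexing call (hypothetical helper)."""
    out = A[np.asarray(rows), np.asarray(cols)]
    # Depending on the SciPy version and container type, the result may be
    # sparse, a numpy matrix, or an ndarray; normalise it to a flat ndarray.
    if sparse.issparse(out):
        out = out.toarray()
    return np.asarray(out).ravel()


# Small test matrix with two stored entries.
dense = np.zeros((4, 8))
dense[1, 5] = 1.0
dense[3, 7] = 3.0
A = sparse.csr_matrix(dense)

rows = [1, 2, 3]
cols = [5, 6, 7]
loop_vals = lookup_loop(A, rows, cols)
batch_vals = lookup_batched(A, rows, cols)
```

Both calls should agree element for element; the batched form simply
makes one indexing call instead of one per entry.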

>  I thought initially that dok may solve this for some kinds of sparse
>  matrix problems, but it seems that it's too slow for large problems.
>  Perhaps a little benchmark must be set up.

In my experience DOK/LIL are too slow for large matrices.  This is a
shame, because new users will want to construct matrices with one of
these classes.  More advanced users would probably build a COO
initially and then convert to CSR/CSC.
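
As an illustration of the COO-then-convert approach, here is a minimal
sketch that assembles a tridiagonal matrix from (row, column, value)
triplets and converts it to CSR.  The function name tridiagonal_via_coo
and the choice of matrix are just for the example; the point is that the
entries are collected in plain arrays first and the sparse matrix is
built in one step, rather than being filled element by element.

```python
import numpy as np
from scipy import sparse


def tridiagonal_via_coo(n):
    """Build an n-by-n tridiagonal matrix (2 on the diagonal, -1 beside it)."""
    i = np.arange(n)
    # Triplets for the main diagonal, the subdiagonal and the superdiagonal.
    rows = np.concatenate([i, i[1:], i[:-1]])
    cols = np.concatenate([i, i[:-1], i[1:]])
    vals = np.concatenate([
        np.full(n, 2.0),
        np.full(n - 1, -1.0),
        np.full(n - 1, -1.0),
    ])
    # Assemble once in COO format, then convert for arithmetic and slicing.
    coo = sparse.coo_matrix((vals, (rows, cols)), shape=(n, n))
    return coo.tocsr()


A = tridiagonal_via_coo(5)
```

Calling .tocsc() instead of .tocsr() gives the column-oriented
equivalent.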

-- 
Nathan Bell wnbell at gmail.com
http://graphics.cs.uiuc.edu/~wnbell/