[SciPy-dev] Ideas for scipy.sparse?

Tue Apr 15 05:43:01 EDT 2008

On Tue, Apr 15, 2008 at 1:55 AM, Nathan Bell <wnbell at gmail.com> wrote:
> On Mon, Apr 14, 2008 at 1:36 PM, Viral Shah
>  <vshah at interactivesupercomputing.com> wrote:
>  >  Some people pointed out that this tends to be slow when called in for
>  >  loops. I couldn't figure out what the cause of this was. Is it that
>  >  for loops in Python are generally slow, or is it that indexing
>  >  individual elements in sparse data structures is slow, or are Python's
>  >  data structures slow, or some combination of all three?
>
>  Currently all accesses of the form A[1,2] are implemented in pure Python
>  (even CSR/CSC).  Looping in Python comes with a certain amount of
>  overhead itself; however, I'd wager most time is spent doing the
>  actual indexing.
>
>  We can speed up the case A[[1,2,3],[5,6,7]] by sending the index
>  arrays over to sparsetools.  However, as you point out, individual
>  lookups will always be slow.
>
>
>  >  I thought initially that dok may solve this for some kinds of sparse
>  >  matrix problems, but it seems that it's too slow for large problems.
>  >  Perhaps a little benchmark should be set up.
>
>  In my experience DOK/LIL are too slow for large matrices.  This is a
>  shame, because new users will want to construct matrices with one of
>  these classes.  More advanced users would probably build a COO
>  initially and then convert to CSR/CSC.

All formats are nicely documented in the sources (see the docstrings
for usage examples).
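
A rough sketch of the two patterns Nathan mentions above: building in COO
and converting once to CSR, and passing index arrays in a single call
rather than looping over scalar lookups. The data and variable names are
made up for illustration, and the batched lookup assumes a SciPy version
in which CSR accepts index arrays.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical example data: entries of a 4x4 matrix as (row, col, value) triplets.
rows = np.array([0, 1, 2, 3, 1])
cols = np.array([0, 1, 2, 3, 3])
vals = np.array([4.0, 5.0, 6.0, 7.0, 1.5])

# Build in COO format, which simply stores the three arrays ...
A_coo = coo_matrix((vals, (rows, cols)), shape=(4, 4))

# ... then convert once to CSR for arithmetic and row access.
A = A_coo.tocsr()

# Batched lookup: one call with index arrays instead of a Python loop
# over scalar A[i, j] accesses.
i = np.array([0, 1, 3])
j = np.array([0, 3, 3])
picked = np.asarray(A[i, j]).ravel()
```

Whether this is actually faster than DOK/LIL for a given problem would
still need the benchmark Viral suggests.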

Ondrej