[SciPy-dev] Ideas for scipy.sparse?
Ondrej Certik
ondrej at certik.cz
Tue Apr 15 05:43:01 EDT 2008
On Tue, Apr 15, 2008 at 1:55 AM, Nathan Bell <wnbell at gmail.com> wrote:
> On Mon, Apr 14, 2008 at 1:36 PM, Viral Shah
> <vshah at interactivesupercomputing.com> wrote:
> > Some people pointed out that this tends to be slow when called in for
> > loops. I couldn't figure out the cause. Is it that for loops in
> > Python are generally slow, that indexing individual elements in
> > sparse data structures is slow, that Python's data structures are
> > slow, or some combination of all three?
>
> Currently all accesses of the form A[1,2] are implemented in pure Python
> (even for CSR/CSC). Looping in Python carries some overhead of its
> own; however, I'd wager most of the time is spent doing the actual
> indexing.
>
> We can speed up the case A[[1,2,3],[5,6,7]] by sending the index
> arrays over to sparsetools. However, as you point out, individual
> lookups will always be slow.
>
>
> > I thought initially that DOK might solve this for some kinds of sparse
> > matrix problems, but it seems that it's too slow for large problems.
> > Perhaps a little benchmark should be set up.
>
> In my experience DOK/LIL are too slow for large matrices. This is a
> shame, because new users will want to construct matrices with one of
> these classes. More advanced users would probably build a COO
> initially and then convert to CSR/CSC.
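
To make the COO-then-convert route concrete, here is a minimal sketch. It assumes a SciPy version where coo_matrix accepts the (data, (row, col)) form and tocsr() is available; the tridiagonal matrix and the names n, rows, cols, vals are only illustrative, not taken from the thread.

```python
from scipy.sparse import coo_matrix

n = 5  # illustrative size, chosen small so the result is easy to inspect

# Collect (row, col, value) triplets in plain Python lists.
# Appending to lists is cheap; nothing sparse is touched inside the loop.
rows, cols, vals = [], [], []
for i in range(n):
    rows.append(i); cols.append(i); vals.append(2.0)            # diagonal
    if i + 1 < n:
        rows.append(i); cols.append(i + 1); vals.append(-1.0)   # upper
        rows.append(i + 1); cols.append(i); vals.append(-1.0)   # lower

# Build the COO matrix once from the triplets, then convert to CSR
# for arithmetic and row slicing.
A = coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()
```

The point of the design is that the sparse structure is assembled in a single call after the loop, instead of being updated element by element as with DOK/LIL.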
All formats are nicely documented in the sources (see the docstrings
for usage examples).
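
For the A[[1,2,3],[5,6,7]] case Nathan mentions, a small sketch comparing one-at-a-time lookups with a single batched lookup. It assumes a reasonably recent SciPy in which indexing a csr_matrix with two index arrays returns the selected values as a dense result; the 8x8 test matrix is made up for illustration. No timings are claimed here, so measure on your own matrices if speed matters.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative matrix: entries 0..63 laid out row by row.
dense = np.arange(64, dtype=float).reshape(8, 8)
A = csr_matrix(dense)

rows = np.array([1, 2, 3])
cols = np.array([5, 6, 7])

# One Python-level __getitem__ call per element.
one_by_one = np.array([A[i, j] for i, j in zip(rows, cols)])

# One call with both index arrays; np.asarray(...).ravel() flattens
# whatever dense container comes back into a 1-D array.
batched = np.asarray(A[rows, cols]).ravel()
```

Both forms select the same three entries; the batched form just hands the whole index set over in one call.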
Ondrej