[SciPy-Dev] scipy.sparse versus pysparse

Pauli Virtanen pav at iki.fi
Wed Jul 23 17:09:55 EDT 2014


23.07.2014, 23:08, nicky van foreest kirjoitti:
> Sure. Please see below.   I included an extra time stamp to analyse
> the results in slightly more detail. It turns out that the
> matrix-vector multiplications are roughly the same in scipy.sparse
> and pysparse, but that building the matrices in pysparse is way
> faster.

The benchmark is mainly measuring the speed of dok_matrix.__setitem__
for scalars (dok_matrix.setdiag is naive and just sets items in a for
loop).

Neither dok_matrix nor lil_matrix is very fast. This is largely
because they use Python dicts and Python lists as data structures,
which have non-negligible overheads.

lil_matrix was optimized in Scipy 0.14.0, so you may get better
results using it (with Scipy 0.14.0 or later). Additionally,
vectorized assignment into sparse matrices is now supported, so a
further performance improvement can be obtained by replacing the for
loops in fillOffDiagonal.
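
As a rough sketch of what vectorized assignment could look like (the
matrix size, the values and the tridiagonal pattern below are made up
for illustration, and this is not the benchmark's actual
fillOffDiagonal), each per-element loop is replaced by one assignment
with index arrays:

```python
import numpy as np
from scipy import sparse

n = 1000                       # illustrative size, not from the benchmark
A = sparse.lil_matrix((n, n))

d = np.arange(n)               # indices of the main diagonal
i = np.arange(n - 1)           # row indices of the superdiagonal

# One vectorized assignment per diagonal instead of a Python for loop
# doing A[k, k + 1] = ... element by element.
A[d, d] = np.full(n, 4.0)
A[i, i + 1] = np.full(n - 1, 2.0)
A[i + 1, i] = np.full(n - 1, -1.0)
```

Whether this helps, and by how much, depends on the Scipy version and
on the fill pattern, so it is worth timing against the loop version.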

There may be some room for optimization in dok_matrix for scalar
assignment, but this is probably not more than 2x. The remaining 10x
factor vs. pysparse requires pretty much not using Python data
structures for storing the numbers.

csr, csc, bsr, and dia are OK, but their data structures are not
well-suited for matrix assembly.
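
One common way around this (a sketch only, not something taken from
the benchmark; the sizes and values are again illustrative) is to
collect the entries as plain (row, column, value) arrays, build a
coo_matrix from them, and convert to csr once assembly is finished:

```python
import numpy as np
from scipy import sparse

n = 1000                       # illustrative size
d = np.arange(n)               # main diagonal indices
i = np.arange(n - 1)           # off-diagonal indices

# Collect all entries as triplets in NumPy arrays; no per-element
# Python-level insertion into the sparse matrix takes place.
rows = np.concatenate([d, i, i + 1])
cols = np.concatenate([d, i + 1, i])
vals = np.concatenate([np.full(n, 4.0),
                       np.full(n - 1, 2.0),
                       np.full(n - 1, -1.0)])

# Assemble in COO format, then convert once to CSR for fast products.
A = sparse.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()

x = np.ones(n)
y = A.dot(x)                   # matrix-vector product on the CSR matrix
```

The conversion cost is paid once, after which the matrix-vector
products run on the csr structure.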

-- 
Pauli Virtanen



