[SciPy-Dev] Fastest way to multiply a sparse matrix with another numpy array

Manoj Kumar manojkumarsivaraj334 at gmail.com
Mon Aug 11 11:04:27 EDT 2014


Hello,

I was wondering which sparse format is the fastest for multiplying a sparse
matrix by a dense numpy array. Intuitively, a CSR matrix multiplied by a
Fortran-contiguous numpy array should be the fastest, but I have run a few
benchmarks and they suggest otherwise. It is also mentioned here
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.csc_matrix.html
that using CSR matrices "may" be faster.


In [5]: X
Out[5]:
<11314x130107 sparse matrix of type '<type 'numpy.float64'>'
    with 1787565 stored elements in Compressed Sparse Row format>
In [6]: _, n_features = X.shape
In [9]: w_c = np.random.rand(n_features, 10)
In [10]: w_f = np.asarray(w_c, order='f')
In [13]: csc = sparse.csc_matrix(X)
In [30]: %timeit X * w_f
10 loops, best of 3: 40.5 ms per loop

In [31]: %timeit X * w_c
10 loops, best of 3: 37.3 ms per loop

In [32]: %timeit csc *  w_c
10 loops, best of 3: 24.3 ms per loop

In [33]: %timeit csc * w_f
10 loops, best of 3: 27.3 ms per loop


Here, the CSC matrix multiplied by a C-contiguous numpy array is the fastest
combination, which is completely non-intuitive to me. Are there any hard rules
for this, or is it data dependent?
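
For anyone who wants to try this without my dataset, here is a rough,
self-contained sketch of the same comparison. The matrix is a synthetic
stand-in: the shape, density, and the helper name bench are assumptions made
for illustration, not the X used above, so the timings (and possibly the
ranking) may differ from the ones I reported.

```python
import timeit

import numpy as np
from scipy import sparse

# Synthetic stand-in for X (assumed shape and density, not the original data).
rng = np.random.RandomState(0)
X_csr = sparse.random(2000, 5000, density=0.001, format="csr", random_state=rng)
X_csc = X_csr.tocsc()

# Same dense operand in C-contiguous and Fortran-contiguous memory layout.
w_c = np.ascontiguousarray(rng.rand(X_csr.shape[1], 10))
w_f = np.asfortranarray(w_c)


def bench(A, w, repeat=3, number=5):
    """Best-of-`repeat` time in seconds for one sparse-dense product A.dot(w)."""
    return min(timeit.repeat(lambda: A.dot(w), repeat=repeat, number=number)) / number


timings = {
    "csr * C-order": bench(X_csr, w_c),
    "csr * F-order": bench(X_csr, w_f),
    "csc * C-order": bench(X_csc, w_c),
    "csc * F-order": bench(X_csc, w_f),
}

for label, seconds in timings.items():
    print("%s: %.3f ms per product" % (label, seconds * 1e3))
```

All four combinations compute the same product, so only the running time
should differ between them.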

Sorry for my noobish questions!
-- 
Regards,
Manoj Kumar,
GSoC 2014, Scikit-learn
Mech Undergrad
http://manojbits.wordpress.com