[SciPy-user] fastest way to populate sparse matrix?

Nathan Bell wnbell at gmail.com
Wed Dec 10 16:46:51 EST 2008


On Wed, Dec 10, 2008 at 4:18 PM, Peter Skomoroch
<peter.skomoroch at gmail.com> wrote:
> Nathan,
>
> Thanks for the pointer, I had missed that wiki page.

It's fairly recent, so don't feel bad :)

>
> The bottleneck now seems to be this for-loop, which takes the majority
> of the remaining time (1.82258105278 seconds):
>
>    for index, (i,j) in enumerate(nonzero_indices):
>        data[index] = dot(W[i,:],H[:,j])
>
> Is there a better approach for this assignment block?
>

You could vectorize the loop:

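from numpy import float32
from numpy.random import random
from scipy import sparse

# n, m, r and the sparse matrix V come from your existing code
W = random([n,r]).astype(float32)
H = random([m,r]).astype(float32) # note, shape is (m,r), i.e. H is stored transposed

I,J = V.nonzero()                    # row/column indices of the nonzeros of V
X = (W[I,:] * H[J,:]).sum(axis=1)    # X[k] = dot(W[I[k],:], H[J[k],:])
V_approx = sparse.coo_matrix((X,(I,J)), shape=(n,m))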
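
As a quick sanity check, here is a minimal sketch comparing the vectorized
expression against the original per-entry loop. The sizes, the seed, and the
way V is built from a random mask are made up for illustration; only the
sparsity pattern of V matters here.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)       # arbitrary seed, assumption for the demo
n, m, r = 6, 5, 3                    # small, made-up sizes

# Illustrative sparse V: roughly 40% of the entries are nonzero.
mask = rng.random((n, m)) < 0.4
V = sparse.coo_matrix(mask.astype(np.float32))

W = rng.random((n, r)).astype(np.float32)
H = rng.random((m, r)).astype(np.float32)   # shape (m, r), as in the snippet above

I, J = V.nonzero()

# Reference: one dot product per nonzero entry, as in the original loop
# (H is indexed by row here because it is stored as (m, r)).
loop = np.empty(len(I), dtype=np.float32)
for index, (i, j) in enumerate(zip(I, J)):
    loop[index] = np.dot(W[i, :], H[j, :])

# Vectorized: gather the needed rows, multiply elementwise, sum over r.
X = (W[I, :] * H[J, :]).sum(axis=1)

V_approx = sparse.coo_matrix((X, (I, J)), shape=(n, m))
```

The two results should agree up to float32 rounding, so np.allclose(loop, X)
is a reasonable check before switching over.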


If the temporaries W[I,:] and H[J,:] (each of shape (nnz, r)) take too much
memory, you could use the same approach on fixed-size chunks of I and J.
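
The chunked variant could look roughly like the sketch below. The function
name chunked_rowwise_dot and the chunk_size default are hypothetical, chosen
only for illustration; tune the chunk size to whatever fits in memory.

```python
import numpy as np

def chunked_rowwise_dot(W, H, I, J, chunk_size=100_000):
    """Compute X[k] = dot(W[I[k], :], H[J[k], :]) in fixed-size chunks.

    Hypothetical helper: W has shape (n, r), H has shape (m, r), and
    I, J are equal-length integer index arrays.
    """
    X = np.empty(len(I), dtype=np.result_type(W, H))
    for start in range(0, len(I), chunk_size):
        stop = start + chunk_size    # slicing past the end is safe in NumPy
        # Only chunk_size rows of W and H are gathered at a time.
        X[start:stop] = (W[I[start:stop], :] * H[J[start:stop], :]).sum(axis=1)
    return X

# Small made-up example: 20 index pairs processed 7 at a time.
rng = np.random.default_rng(1)
W_demo = rng.random((8, 4)).astype(np.float32)
H_demo = rng.random((9, 4)).astype(np.float32)
I_demo = rng.integers(0, 8, size=20)
J_demo = rng.integers(0, 9, size=20)

X_chunked = chunked_rowwise_dot(W_demo, H_demo, I_demo, J_demo, chunk_size=7)
X_full = (W_demo[I_demo, :] * H_demo[J_demo, :]).sum(axis=1)
```

Peak temporary memory is then on the order of chunk_size * r values per
gathered array instead of nnz * r, at the cost of a short Python-level loop.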

-- 
Nathan Bell wnbell at gmail.com
http://graphics.cs.uiuc.edu/~wnbell/


