[SciPy-User] Fwd: scipy.cluster.spectral_clustering numerical issues

antonio vergari arranger1044 at gmail.com
Wed Oct 1 03:26:52 EDT 2014


Hi everyone,

I'm trying to use the implementation of the spectral clustering algorithm
found in the scipy.cluster package on a fairly large data matrix (~16000
rows and 16 columns, binary data).

I am also passing the algorithm a precomputed affinity matrix built from a
Gaussian kernel (as in Andrew Ng's paper
http://ai.stanford.edu/~ang/papers/nips01-spectral.pdf).

I am not able to use the default eigensolver, 'arpack', since I do not have
enough memory and the machine starts swapping constantly (I have 8 GB, but the
matrix alone is almost 2 GB). Turning to the alternative, 'lobpcg', I get
numerical errors, ranging from a 'Not Xth minor semi definite' error to NaNs
and 0s appearing in the computation, as I vary the number of clusters k and
the sigma value of the kernel (the variance).
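
As a rough check on the memory figure, here is a small sketch of the arithmetic. It assumes the affinity matrix is stored densely with 64-bit float entries, which is an assumption and not something stated above:

```python
n_samples = 16000        # approximate number of rows in the data matrix
bytes_per_entry = 8      # assumption: dense float64 storage

# A dense affinity matrix has one entry per pair of samples.
n_bytes = n_samples * n_samples * bytes_per_entry

print(f"{n_bytes / 1e9:.2f} GB")      # 2.05 GB
print(f"{n_bytes / 2**30:.2f} GiB")   # 1.91 GiB
```

Any temporary copy of the matrix made by an eigensolver would add a further multiple of this figure.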

I am wondering whether I am missing some piece of theory and/or computing
the similarity matrix incorrectly.
I do not appear to get these errors on smaller, randomly generated binary
matrices.
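
One way to rule out a badly formed similarity matrix is to run a few basic checks on it before clustering. The sketch below is only illustrative: `check_affinity` is a hypothetical helper name, and the properties it tests (finite entries, symmetry, unit diagonal, values in [0, 1], positive row sums) are what one would expect of a Gaussian-kernel affinity, not a list of requirements taken from any library.

```python
import numpy as np

def check_affinity(A, atol=1e-10):
    """Return basic sanity checks for a dense, square affinity matrix.

    Hypothetical helper for illustration only.
    """
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        raise ValueError("affinity matrix must be square")
    return {
        # NaNs or infs here would propagate into the eigensolver.
        "finite": bool(np.isfinite(A).all()),
        "symmetric": bool(np.allclose(A, A.T, atol=atol)),
        # A Gaussian kernel gives exp(0) = 1 on the diagonal.
        "unit_diagonal": bool(np.allclose(np.diag(A), 1.0, atol=atol)),
        "in_unit_interval": bool(((A >= 0.0) & (A <= 1.0)).all()),
        # Zero row sums would cause divisions by zero when normalizing.
        "positive_degrees": bool((A.sum(axis=1) > 0.0).all()),
    }

if __name__ == "__main__":
    A = np.array([[1.0, 0.5], [0.5, 1.0]])
    print(check_affinity(A))
```

If all checks pass on the full matrix, the problem is more likely to lie in the eigensolver step than in the construction of the kernel.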
Here is a printed matrix for sigma=3.0:

    [[ 1.          0.60653066  0.94595947 ...,  0.71653131  0.8007374
       0.57375342]
     [ 0.60653066  1.          0.57375342 ...,  0.67780958  0.67780958
       0.84648172]
     [ 0.94595947  0.57375342  1.         ...,  0.75746513  0.75746513
       0.60653066]
     ...,
     [ 0.71653131  0.67780958  0.75746513 ...,  1.          0.71653131
       0.71653131]
     [ 0.8007374   0.67780958  0.75746513 ...,  0.71653131  1.
       0.64118039]
     [ 0.57375342  0.84648172  0.60653066 ...,  0.71653131  0.64118039
       1.        ]]
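
The printed entries are consistent with a kernel of the form exp(-d^2 / (2 * sigma^2)) with sigma = 3.0, where d^2 is the squared Euclidean distance between two binary rows (for example, 0.94595947 = exp(-1/18) and 0.60653066 = exp(-9/18)). A minimal sketch of such a computation follows. The function name `gaussian_affinity` is hypothetical, and the code is an assumption about how the matrix could be built, not the code from the attachment:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def gaussian_affinity(X, sigma):
    """Dense Gaussian-kernel affinity matrix (hypothetical helper).

    Assumes the kernel exp(-||x_i - x_j||^2 / (2 * sigma^2)).
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances, expanded to a square matrix.
    sq_dists = squareform(pdist(X, metric="sqeuclidean"))
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

if __name__ == "__main__":
    # Two binary rows that differ in exactly one position (d^2 = 1).
    X = np.array([[0, 1, 0, 1],
                  [0, 1, 1, 1]])
    print(gaussian_affinity(X, sigma=3.0))
```

For binary rows the squared Euclidean distance equals the number of positions in which the rows differ, so every off-diagonal entry is exp(-m/18) for some integer m when sigma = 3.0.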

I am also attaching a piece of code and the data as a minimal working
example.

Thanks in advance
-------------- next part --------------
A non-text attachment was scrubbed...
Name: spec_mwe.zip
Type: application/zip
Size: 48174 bytes
Desc: not available
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20141001/c1dcf8eb/attachment.zip>

