[SciPy-user] kmeans2 random initialization

David Warde-Farley dwf at cs.toronto.edu
Tue Apr 1 04:41:27 EDT 2008


On 31-Mar-08, at 11:17 PM, Robert Kern wrote:

> The relevant function is scipy/cluster/vq.py:_krandinit(). It is
> finding the covariance matrix and manually doing a multivariate normal
> sampling. Your data is most likely degenerate and not of full rank.
> It's arguable whether or not this should fail, but
> numpy.random.multivariate_normal() uses the SVD instead of a Cholesky
> decomposition to find the matrix square root, so it sort of ignores
> non-positive definiteness.


This might not be relevant, depending on how the covariance is
computed, but one 'gotcha' I've seen with numerical algorithms that
assume positive-definiteness is that floating-point round-off will
occasionally make the input matrix (very slightly) non-symmetric,
and the algorithm will then choke. It's easily solved by averaging the
matrix with its transpose (though there are probably more efficient
ways).
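
For illustration, here is a minimal sketch of that averaging trick in
NumPy. The helper name symmetrize() and the 1e-12 perturbation are made
up for the example and are not taken from scipy/cluster/vq.py; the
perturbation just stands in for whatever round-off a real covariance
computation might introduce.

```python
import numpy as np

def symmetrize(a):
    """Return the symmetric part of a square matrix, (a + a.T) / 2."""
    a = np.asarray(a, dtype=float)
    return (a + a.T) / 2.0

# Build a small sample covariance matrix (hypothetical data).
rng = np.random.default_rng(0)
x = rng.standard_normal((50, 3))
cov = np.cov(x, rowvar=False)

# Simulate a tiny floating-point asymmetry in one off-diagonal entry.
perturbed = cov.copy()
perturbed[0, 1] += 1e-12

# Averaging with the transpose restores exact symmetry.
sym = symmetrize(perturbed)
print(np.array_equal(perturbed, perturbed.T))
print(np.array_equal(sym, sym.T))
```

The result is exactly symmetric because each pair of mirrored entries is
computed from the same two numbers, and floating-point addition is
commutative. Note that this only repairs asymmetry; it does nothing for a
covariance matrix that is genuinely rank-deficient.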

David


