[SciPy-User] code for multidimensional scaling?
denis
denis-bz-gg at t-online.de
Sun Jan 9 10:54:31 EST 2011
Brian, folks,
here's a simple mdscale in 10 lines, using np.linalg.eigh: fast
enough for N up to 1000 or so.
What are your N, dim, p ?
(Two threads, one on MDS and another on Min-cost Path, got merged here
in google groups ??)
cheers
-- denis
# Multidim scaling using np.linalg.eigh
from __future__ import division
import numpy as np
__date__ = "2011-01-09 Jan denis"
def mdscale_eigh( D, p=2 ):
""" in: D pairwise distances, |Xi - Xj|
out: M pvecs N x p with |Mi - Mj| ~ D (mod flip, rotate)
uses np.linalg.eigh
"""
D2 = D**2
av = D2.mean( axis=0 )
B = -.5 * ((D2 - av) .T - av + D2.mean())
evals, evecs = np.linalg.eigh( B ) # few biggest evals / evecs ?
pvals = evals[-p:] ** .5
pvecs = evecs[:,-p:]
M = pvecs * pvals
return M - M.mean( axis=0 )
#...............................................................................
if __name__ == "__main__":
import sys
import scipy.spatial.distance as ssdist # $scipy/spatial/
distance.py
N = 1000
dim = 3
p = 2
plot = 0
seed = 1
exec "\n".join( sys.argv[1:] ) # run this.py N=1000 dim=3 p=2:
19 sec
np.set_printoptions( 1, threshold=100, edgeitems=5,
suppress=True )
np.random.seed(seed)
print "\nN=%d dim=%d p=%d" % (N, dim, p)
X = np.random.uniform( -10, 10, size=(N,dim) )
X -= X.mean( axis=0 )
D = ssdist.squareform( ssdist.pdist( X, "euclidean" ))
def pa( A, name="" ):
""" print A summary: mean histo etc. """
histo = np.histogram( A, 5 )
print "%s av %.2g histo %s edges %s" % (
name, A.mean(), histo[0], histo[1])
# print A.round().astype(int)
pa( D, "D" )
#...............................................................................
M = mdscale_eigh( D, p )
M -= M.mean( axis=0 ) # ?
print "M eigenvectors:\n", M.T.round().astype(int)
mdist = ssdist.squareform( ssdist.pdist( M, "euclidean" ))
pa( mdist, "mdist" )
On Jan 7, 12:30 pm, Brian Murphy <brian.mur... at unitn.it> wrote:
> Hi,
>
> I'm new to the list, so I hope my question is appropriate. I'm looking
> for code that implements multi-dimensional scaling (e.g. like Matlab's
> mdscale command) in Python.
More information about the SciPy-User
mailing list