PCA principal component analysis

Giorgi lekishvili at python.qartu.com
Thu Apr 10 13:01:12 EDT 2003


myk <mykbourassa at cogeco.ca> wrote in message news:<hfYka.2414$h%2.256728 at read1.cgocable.net>...
> Don't really need a tool.  Just:
> 
> a)  "centre" your data (translate for zero mean and scale for unity 
> variance);
> b)  do svd (in NumPy I think) on the data set resulting from a);
> c)  eigenvectors are columns of U and eigenvalues are diagonal of S.  
> PCA scores are just the values in the columns of U i.e. 1st column is 
> first PC scores, etc.

Just a small correction:
The difference between SVD and PCA is that the column vectors of
U, unlike those of the PCA score matrix T, are normalized. In other
words, we have:

X = U*S*V (SVD)
X = T*P (PCA)

from which it follows that T = U*S.

P is the loading matrix, used for initial diagnostics of variable
importance.
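
For illustration, here is a minimal sketch of the relation T = U*S using
numpy.linalg.svd. It assumes a current NumPy, rows as samples and columns
as variables; the helper name pca_via_svd and the random test data are
made up for this example. Note that NumPy returns the right-hand factor
already in the form written above, so that X = U*S*Vh.

```python
import numpy as np

def pca_via_svd(data):
    """Hypothetical helper: PCA of a (samples x variables) array via SVD."""
    X = np.asarray(data, dtype=float)
    # Step a) of the quoted recipe: zero mean and unit variance per column.
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step b): thin SVD, so that X = U @ diag(s) @ Vh.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    T = U * s   # scores: each column of U scaled by its singular value
    P = Vh      # loadings: one principal direction per row
    return X, U, T, P, s

# Made-up data: 20 samples, 4 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 4))
X, U, T, P, s = pca_via_svd(data)

# Columns of U have unit length; columns of T have length s.
u_norms = np.linalg.norm(U, axis=0)
t_norms = np.linalg.norm(T, axis=0)
# The scores and loadings reproduce the standardized data: X = T*P.
reconstruction_ok = np.allclose(T @ P, X)
```

So the columns of U are the scores only up to scaling; multiplying by S
restores the spread along each component.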

GRTZ,
Giorgi

V and P are in fact the same, since the normalization takes place
during the decomposition.
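
A quick numerical check of this, again only a sketch with numpy.linalg.svd
and made-up random data: the rows of the right-hand factor come out of
the decomposition orthonormal, and they are eigenvectors of the covariance
matrix. The corresponding eigenvalues are s**2/(n-1), not the singular
values s themselves.

```python
import numpy as np

# Made-up data: 30 samples, 5 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
X = X - X.mean(axis=0)   # centre only; scaling is not needed for this check

U, s, Vh = np.linalg.svd(X, full_matrices=False)
P = Vh                   # loadings, one principal direction per row

# Rows of P are orthonormal, so P @ P.T is the identity matrix.
rows_orthonormal = np.allclose(P @ P.T, np.eye(P.shape[0]))

# Rows of P are eigenvectors of the covariance matrix C,
# with eigenvalues s**2 / (n - 1).
n = X.shape[0]
C = X.T @ X / (n - 1)
eigvals = s**2 / (n - 1)
is_eigenbasis = np.allclose(C @ P.T, P.T * eigvals)
```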


> 
> You have much more flexibility if you don't use a "package".  HTH
> 
> majb
> 
> sebastien wrote:
> 
> >Hi,
> >
> >Is there any PCA analysis tools for python ?
> >If it does, do you have any idea on how well it would scale ?
> >
> >I have already seen PyClimate (but it is not available for Windows
> >which will be one of the target). Is there some LAPACK like packages ?
> >
> >Thanks for reading this mail.
> >
> >Sebastien.
> >  
> >



