[Numpy-discussion] performance matrix multiplication vs. matlab

Tue Jun 9 03:54:36 EDT 2009

Robin wrote:
> On Mon, Jun 8, 2009 at 7:14 PM, David Warde-Farley<dwf at cs.toronto.edu> wrote:
>   
>> On 8-Jun-09, at 8:33 AM, Jason Rennie wrote:
>>
>>> Note that EM can be very slow to converge:
>>
>> That's absolutely true, but EM for PCA can be a life saver in cases where
>> diagonalizing (or even computing) the full covariance matrix is not a
>> realistic option. Diagonalization can be a lot of wasted effort if all you
>> care about are a few leading eigenvectors. EM also lets you deal with
>> missing values in a principled way, which I don't think you can do with
>> standard SVD.
>>
>> EM certainly isn't a magic bullet but there are circumstances where it's
>> appropriate. I'm a big fan of the ECG paper too. :)
>>     
>
> Hi,
>
> I've been following this with interest... although I'm not really
> familiar with the area. At the risk of drifting further off topic I
> wondered if anyone could recommend an accessible review of these kinds
> of dimensionality reduction techniques... I am familiar with PCA and
> know of diffusion maps and ICA and others, but I'd never heard of EM
> and I don't really have any idea how they relate to each other and
> which might be better for one job or the other... so some sort of
> primer would be really handy.
>   

I think the biggest problem is the 'Tower of Babel' aspect of machine
learning (the expression is from David H. Wolpert, I believe):
practitioners in different subfields often use totally different words
for more or less the same concepts (and many keep being rediscovered).
For example, what ML people call PCA is called the Karhunen-Loève
transform in signal processing, and the concepts are quite similar.
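
To make the "same concept, different names" point concrete, here is a
minimal NumPy sketch that reaches the same leading subspace by three
routes: an SVD of the centred data (the usual ML view of PCA), an
eigendecomposition of the covariance matrix (the Karhunen-Loève view),
and a simple EM-style iteration of the kind discussed in the quoted
text. The function names, the synthetic data and the iteration count are
all illustrative assumptions, and the EM loop is only one possible
variant, not necessarily the one meant earlier in the thread.

```python
import numpy as np


def pca_svd(Y, k):
    """Leading k principal directions from an SVD of the centred data."""
    Yc = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    return Vt[:k].T  # shape (d, k)


def pca_eig(Y, k):
    """Same directions from the covariance eigenvectors (Karhunen-Loeve view)."""
    Yc = Y - Y.mean(axis=0)
    cov = Yc.T @ Yc / (len(Yc) - 1)
    _, V = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return V[:, ::-1][:, :k]


def pca_em(Y, k, n_iter=100, seed=0):
    """EM-style iteration for the leading subspace (illustrative sketch).

    Only k x k systems are solved; the d x d covariance is never formed.
    """
    Yc = Y - Y.mean(axis=0)
    rng = np.random.default_rng(seed)
    C = rng.standard_normal((Yc.shape[1], k))  # random initial basis, (d, k)
    for _ in range(n_iter):
        # E-step: least-squares latent coordinates given the basis C
        X = np.linalg.solve(C.T @ C, C.T @ Yc.T).T  # shape (n, k)
        # M-step: least-squares basis given the latent coordinates X
        C = np.linalg.solve(X.T @ X, X.T @ Yc).T    # shape (d, k)
    Q, _ = np.linalg.qr(C)  # orthonormalise the converged basis
    return Q


def projector(B):
    """Orthogonal projector onto span(B), for B with orthonormal columns."""
    return B @ B.T


# Synthetic data (assumed for illustration): 2 strong directions in 6-D plus noise
rng = np.random.default_rng(0)
k = 2
latent = rng.standard_normal((500, k)) * np.array([5.0, 3.0])
W = np.linalg.qr(rng.standard_normal((6, k)))[0]  # orthonormal mixing, (6, 2)
Y = latent @ W.T + 0.1 * rng.standard_normal((500, 6))

P_svd = projector(pca_svd(Y, k))
P_eig = projector(pca_eig(Y, k))
P_em = projector(pca_em(Y, k))

# Individual basis vectors may differ by sign or rotation, so compare subspaces
print("max |P_svd - P_eig|:", np.abs(P_svd - P_eig).max())
print("max |P_svd - P_em| :", np.abs(P_svd - P_em).max())
```

The comparison is done on projectors rather than on the basis vectors
themselves, because each route is free to return a differently signed or
rotated basis for the same subspace. On data like this, with a large gap
between the retained and discarded variances, the EM loop settles
quickly; with a small gap it can be slow, which is the caveat raised in
the quoted text.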

Anyway, Bishop's book is a pretty good reference by one of the
leading researchers:

http://research.microsoft.com/en-us/um/people/cmbishop/prml/

It can be read without much background beyond basic first-year
calculus and linear algebra.

cheers,

David