[SciPy-user] Calculating a lot of (squared) Mahalanobis distances

Thu Nov 6 22:44:22 EST 2008

On Thu, Nov 6, 2008 at 21:36, David Warde-Farley <dwf at cs.toronto.edu> wrote:
> Hi folks,
>
> I'm trying to calculate a lot of Mahalanobis distances (in essence,
> applying a positive definite quadratic x.T * A * x to a lot of vectors
> x) and trying to think of the fastest way to do it with numpy.
>
> If I've got a single vector x and a 2D array sigmainv, then I've got
> something like this.
>
> import numpy as np
> ...
> xmmu = x - mu
> dist = np.dot(xmmu, np.dot(sigmainv, xmmu))
>
> However if I've got a DxN 2d array of N different vectors for which I
> want this quantity, it seems I can either use a loop or do something
> like
>
> xmmu = x - mu[:,np.newaxis]
> dist = np.diag(xmmu, np.dot(sigmainv, xmmu)))
>
> It seems like a lot of wasted computation to throw out the off-
> diagonals. One thought I've had would be to diagonalize sigmainv and
> then do something tricky with scalar products and broadcasting the
> diagonal, but I am not sure whether that would save me much.
>
> Does anyone have any other tricks up their sleeve?

(xmmu * np.dot(sigmainv, xmmu)).sum(axis=0)

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco