[Numpy-discussion] distance matrix and (weighted) p-norm

Damian Eads eads at soe.ucsc.edu
Mon Sep 8 11:15:24 EDT 2008


Emanuele Olivetti wrote:
> Hi,
> 
> I'm trying to compute the distance matrix (weighted p-norm [*])
> between two sets of vectors (data1 and data2). Example:
> 
> import numpy as N
> p = 3.0
> data1 = N.random.randn(100,20)
> data2 = N.random.randn(80,20)
> weight = N.random.rand(20)
> distance_matrix = N.zeros((data1.shape[0],data2.shape[0]))
> for d in range(data1.shape[1]):
>     distance_matrix += (N.abs(N.subtract.outer(data1[:,d],data2[:,d]))*weight[d])**p
>     pass
> distance_matrix = distance_matrix**(1.0/p)
> 
> 
> Is there a way to speed up the for loop? When the dimension
> of the vectors becomes big (e.g. >1000) the for loop
> becomes really annoying.
> 
> Thanks,
> 
> Emanuele
> 
> [*] : ||x - x'||_w = (\sum_{i=1...N} (w_i*|x_i - x'_i|)**p)**(1/p)

This feature could be implemented easily. However, I must admit I'm not
very familiar with weighted p-norms. What is the reason for raising w_i
to the p, i.e. (w_i*|x_i-x'_i|)**p, instead of using w_i*(|x_i-x'_i|)**p?
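
For reference, here is a minimal sketch of one way to drop the Python loop
with plain NumPy broadcasting. It assumes the weights are non-negative (so
the columns can be scaled by w up front) and that an (m, n, d) temporary
array fits in memory. The function name weighted_pnorm_distances is made up
for illustration; it is not an existing NumPy function. The last lines show
how the two weighting conventions relate: as far as I can tell, weighting
|x_i-x'_i|**p by w_i is the same as using weights w_i**(1/p) in the
definition from the original message.

```python
import numpy as np

def weighted_pnorm_distances(a, b, w, p):
    """Distance matrix with entries (sum_i (w_i*|a_i - b_i|)**p)**(1/p).

    Assumes non-negative weights, so scaling the columns by w first is
    equivalent to weighting each absolute difference.
    """
    aw = a * w                      # shape (m, d)
    bw = b * w                      # shape (n, d)
    # Broadcast to shape (m, n, d); needs m*n*d floats of temporary memory.
    diff = np.abs(aw[:, None, :] - bw[None, :, :])
    return (diff ** p).sum(axis=-1) ** (1.0 / p)

rng = np.random.default_rng(0)
p = 3.0
data1 = rng.standard_normal((100, 20))
data2 = rng.standard_normal((80, 20))
weight = rng.random(20)

# Convention from the original message: weights are raised to the p.
dist = weighted_pnorm_distances(data1, data2, weight, p)

# Other convention, (sum_i w_i*|a_i - b_i|**p)**(1/p): same computation
# with the weights replaced by w**(1/p).
dist_alt = weighted_pnorm_distances(data1, data2, weight ** (1.0 / p), p)
```

When d is large (e.g. >1000) the (m, n, d) temporary may get too big; in
that case processing the rows of the first set in chunks would keep the
memory bounded while still avoiding a loop over every dimension.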

Damian


