[Numpy-discussion] distance_matrix: how to speed up?

Vincent Schut schut at sarvision.nl
Thu May 22 03:45:32 EDT 2008


Emanuele Olivetti wrote:
<snip>
> 
> This solution is super-fast, stable and uses little memory.
> It is based on the fact that:
> (x-y)^2*w = x*x*w - 2*x*y*w + y*y*w
> 
> For size1=size2=dimensions=1000 it requires ~0.6 sec. to compute
> on my dual core duo. It is 2 orders of magnitude faster than my
> previous solution, but 1-2 orders of magnitude slower than using
> C with weave.inline.
> 
> Definitely good enough for me.
> 
> 
> Emanuele

Reading this thread, I remembered having tried scipy's sandbox.rbf 
(radial basis function) to interpolate a pretty large, multidimensional 
dataset, to fill in the missing data points. This, however, soon failed 
with out-of-memory errors, which, if I remember correctly, came from the 
pretty straightforward distance calculation between the data points 
that is used in this package. Being no math wonder, I assumed 
that there simply was no simple way to calculate distances without using 
much memory, and ended my rbf experiments.

To make a long story short: correct me if I am wrong, but might it be an idea 
to use the above solution in scipy.sandbox.rbf?
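
For reference, here is a rough sketch of how I understand the identity quoted 
above could be written with plain numpy. This is not Emanuele's actual code: 
the function name weighted_sq_dist, the argument layout (points as rows, w as 
one weight per dimension) and the clamping step are my own assumptions.

```python
import numpy as np

def weighted_sq_dist(x, y, w):
    """Weighted squared distances between the rows of x and the rows of y.

    x: (n1, d) array, y: (n2, d) array, w: (d,) array of weights.
    Returns an (n1, n2) array with entries sum_k w[k] * (x[i,k] - y[j,k])**2.
    (Hypothetical helper, written for illustration only.)
    """
    xw = x * w                                   # (n1, d)
    # x*x*w summed over the dimensions, as a column vector
    xx = (xw * x).sum(axis=1)[:, np.newaxis]     # (n1, 1)
    # y*y*w summed over the dimensions, as a row vector
    yy = (y * w * y).sum(axis=1)[np.newaxis, :]  # (1, n2)
    # -2*x*y*w summed over the dimensions is a single matrix product
    d2 = xx - 2.0 * np.dot(xw, y.T) + yy         # (n1, n2)
    # rounding can leave tiny negative values where x[i] ~= y[j]
    np.maximum(d2, 0.0, out=d2)
    return d2
```

The largest temporary here is the (n1, n2) result itself, whereas the 
straightforward broadcast of x[:, None, :] - y[None, :, :] builds an 
(n1, n2, d) array first, which is presumably where the memory goes. One 
caveat: the expanded form can lose some precision through cancellation when 
two points nearly coincide, hence the clamp at zero.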

Vincent.



