[SciPy-Dev] scipy.spatial comments

Gael Varoquaux gael.varoquaux at normalesup.org
Sun Mar 11 04:47:52 EDT 2012


On Sat, Mar 10, 2012 at 08:35:51PM -0500, David Warde-Farley wrote:
> On 2012-03-10, at 4:53 AM, Ralf Gommers wrote:

> > Second, squared Euclidean distance is computed by taking the square of
> > the Euclidean distance. I think it would make more sense to do it the
> > other way around: the Euclidean distance is computed by taking the
> > square root of the squared Euclidean distance.

> > Makes sense, should be a little faster.

> Actually, the current implementation is absolutely crazy, especially
> considering that SciPy has easy access to BLAS. One should never be
> computing Euclidean distances naively like is done in distance.c.

Actually, I think that we had this discussion a while ago on the
scikit-learn mailing list and it depends on the dimensionality of your
feature space. For a high-dimensional feature space, you are much better
off computing Euclidean distance as you suggest, with the dot product.
However, I think that for a low-dimensional feature space (say 3D),
scipy's current approach is better.
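
For reference, here is a minimal sketch of what the dot-product
formulation could look like, using the identity
||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y. It assumes NumPy arrays with one
sample per row; the function name euclidean_via_dot is made up for
illustration and is not part of scipy, and this is not the code from
distance.c or from scikit-learn.

```python
import numpy as np


def euclidean_via_dot(X, Y):
    """Pairwise Euclidean distances between rows of X and rows of Y.

    Hypothetical helper: expands ||x - y||^2 so that the expensive part
    is a single matrix product, which NumPy hands off to BLAS.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Squared norms of each row, shaped for broadcasting.
    XX = np.einsum("ij,ij->i", X, X)[:, np.newaxis]
    YY = np.einsum("ij,ij->i", Y, Y)[np.newaxis, :]
    # ||x||^2 + ||y||^2 - 2 x.y for every pair of rows.
    sq = XX + YY - 2.0 * np.dot(X, Y.T)
    # Rounding can leave tiny negative values; clip before the sqrt.
    np.maximum(sq, 0.0, out=sq)
    return np.sqrt(sq)


rng = np.random.RandomState(0)
X = rng.rand(200, 50)   # 200 samples, 50 features
Y = rng.rand(100, 50)   # 100 samples, 50 features
D = euclidean_via_dot(X, Y)

# Naive reference: explicit differences, O(n * m * d) temporary memory.
D_naive = np.sqrt(((X[:, np.newaxis, :] - Y[np.newaxis, :, :]) ** 2).sum(axis=2))
```

The result should agree with scipy.spatial.distance.cdist(X, Y) up to
rounding. Note that the expansion loses some precision for points that
are very close together, and in low dimension the matrix product has
little work to amortize its overhead.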

I can't really compare, because on my laptop I must have a crap BLAS, as
the dot product approach is only slightly faster than cdist with your
example.

Gaël


