[SciPy-dev] Fast (O(n log(n)) ) implementation of Kendall Tau

Charles R Harris charlesr.harris at gmail.com
Thu Oct 1 02:22:38 EDT 2009


On Wed, Sep 30, 2009 at 10:07 PM, Enzo Michelangeli <enzomich at gmail.com>wrote:

> From: "Sturla Molden" <sturla at molden.no>
> Sent: Thursday, October 01, 2009 6:09 AM
>
> > Enzo Michelangeli skrev:
> >> Dear all,
> >>
> >> A few days ago I posted to http://projects.scipy.org/scipy/ticket/999 a
> >> drop-in replacement for scipy.stats.kendalltau.
> > Why do you re-implement mergesort in pure Python?
> >
> > ndarrays have a sort method that can use mergesort.
> >
> > Python lists has the same (timsort is mergesort on steroids).
>
> Because Knight's algorithm needs to count the number of swaps (or, to be
> more precise, the number of swaps that would be performed by an equivalent
> bubblesort). In the code, that's the purpose of the variable exchcnt .
>
> An alternative algorithm for the Kendall Tau developed by David
> Christensen,
> and described at http://www.springerlink.com/content/p33qu44058984082/(with
> a Delphi implementation), replaces the mergesort step with one based on
> balanced binary trees (AVL in his case, but I guess RBT would also work).
> Unfortunately, neither the standard Python library nor NumPy/SciPy appear
> to
> implement such useful data structures, and what is available either doesn't
>

Yep, we could use some more of those useful data structures. A "computer
science" library would be useful. The scipy.spacial library is step in that
direction but it would be nice if there were more such.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20091001/55fa6452/attachment.html>


More information about the SciPy-Dev mailing list