[SciPy-dev] Sorting speed

Robert Kern robert.kern at gmail.com
Sat Dec 31 21:06:06 EST 2005


On 12/30/05, Fernando Perez <Fernando.Perez at colorado.edu> wrote:
> Travis Oliphant wrote:
>
> > There are many sorting algorithms in numarray.  I'm not sure which one
> > is being used regularly, but I'd like to see them brought over, though,
> > which shouldn't be too big of a deal and is on the radar.
>
> It may be worth noting that the sort code in python's core, for
> somelist.sort(), is to my understanding, state of the art.  Tim Peters a while
> ago (py2.2, I think) brought the full weight of his considerable algorithmic
> talent to bear on this problem, as well as some recent academic results.  I
> don't know how this compares to what numarray uses, but it may be worth
> investigating/copying.
>
> If Tim's work is as good as it is claimed to be (and I have much respect for
> his claims), it might be a good idea to use a stripped version of his code
> which doesn't have to deal with the generalities of python lists (and all the
> per-item typechecking needed there).

My (limited) understanding is that timsort is specifically optimized
for sorting Python lists as they are generally used. Thus, compare
operations are minimized because comparison operations are potentially
expensive for general Python objects where they are relatively very
cheap for numeric arrays. Also, timsort scans for runs, so sorting a
list that is entirely sorted except for the last item is quite fast.
Since one can append to lists, this is very useful for lists, but one
cannot append to arrays so it would be less useful to us. The benefits
of adopting timsort are probably less than they were for Python lists.

--
Robert Kern
robert.kern at gmail.com




More information about the SciPy-Dev mailing list