efficient partial sort in Python ?

Ian Kelly ian.g.kelly at gmail.com
Mon Aug 18 14:48:04 EDT 2014


On Mon, Aug 18, 2014 at 11:18 AM, Chiu Hsiang Hsu <wdv4758h at gmail.com> wrote:
> I know that Python use Timsort as default sorting algorithm and it is efficient,
> but I just wanna have a partial sorting (n-largest/smallest elements).
>
> In the current state, I can only find this kind of functions from heapq,
> but it's too slow for this problem,
> the pure c sorted function can easily beat heapq.nlargest even though it sort all the elements.
>
> I think it would be better if we have an effeicient partial sorting function,
> maybe it could be like this : partial_sorted(data, n), data.partial_sort(n)
>
> Does anyone have any more information for partial sorting ?
>
> Btw, C++ has a partial_sort function in <algorithm>, but I don't know what algorithm it really use.

According to http://www.sgi.com/tech/stl/partial_sort.html the STL
implementation also uses heapsort.

For better partial sorting algorithms, try Wikipedia:
http://en.wikipedia.org/wiki/Partial_sorting

However, I think you're still not likely to beat Timsort for smallish
data sets unless you also implement it in C.



More information about the Python-list mailing list