[issue34561] Replace list sorting merge_collapse()?

Tim Peters report at bugs.python.org
Mon Sep 3 15:08:12 EDT 2018


Tim Peters <tim at python.org> added the comment:

Looks like all sorts of academics are exercised over the run-merging order now.  Here's a paper that's unhappy because timsort's strategy (and 2-merge's too) isn't always near-optimal with respect to the entropy of the distribution of natural run lengths:

"Nearly-Optimal Mergesorts:  Fast, Practical Sorting Methods That Optimally Adapt to Existing Runs"
J. Ian Munro
Sebastian Wild
https://arxiv.org/abs/1805.04154

Their alternatives are worth looking at too.  Offhand, their "powersort" ordering looks more promising to me than their "peeksort".  It was very deliberate that timsort looks at whether to merge a run immediately after identifying one, for the "freshness in memory hierarchy" reason.  That peeksort does not do this is, I expect, more significant in real life than the "one little blemish" they call it.

In any case, don't flip out over this.  On randomly ordered input, timsort already strongly tends toward perfectly balanced merges, and there's no significant improvement to be had over that.  On the other hand, on inputs with significant order that _can_ be exploited by this approach, the gains are far more due to "galloping" than to the order in which runs are merged.  All these papers ignore galloping entirely.  Example:  take range(1_000_000), cut it in half, and riffle shuffle it.  So you get

    0 500000 1 500001 ... 499999 999999

or

    500000 0 500001 1 ... 999999 499999
Either way, there are half a million natural runs, each of length 2.  Any balanced way of merging them is as good as it gets.  It's galloping alone that buys huge improvements in cases like this.
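To make the run structure concrete, here is a small Python sketch (the names `riffle` and `count_natural_runs` are mine, not CPython's) that builds the second interleaving above and counts natural runs the way timsort identifies them: a maximal non-descending run, or a maximal strictly descending run (which timsort reverses in place):

```python
def riffle(n):
    """Interleave the two halves of range(n): n/2, 0, n/2+1, 1, ..."""
    half = n // 2
    out = []
    for hi, lo in zip(range(half, n), range(half)):
        out.extend((hi, lo))
    return out

def count_natural_runs(xs):
    """Count maximal runs as timsort sees them: non-descending,
    or strictly descending."""
    i, n, runs = 0, len(xs), 0
    while i < n:
        runs += 1
        j = i + 1
        if j < n and xs[j] < xs[i]:
            while j < n and xs[j] < xs[j - 1]:   # strictly descending run
                j += 1
        else:
            while j < n and xs[j] >= xs[j - 1]:  # non-descending run
                j += 1
        i = j
    return runs

print(count_natural_runs(riffle(1_000_000)))  # 500000 runs, each of length 2
```

Since every run has length 2, any balanced merge order performs essentially the same number of merges; the big win on this input comes from galloping within each merge, not from the merge schedule.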

Especially in the context of Python, where object comparisons are (in general) far more expensive than moving pointers around, some excess in worst-case memory copying is, well, "one little blemish" ;-)

But still worth addressing if feasible.  Now that sorting often also adapts to avoid the relatively massive expense of PyObject_RichCompareBool, memory-movement costs become more important.  Approaches like 2-merge also give simpler merge_collapse() code that's easier to reason about, and reduce the worst-case stack size (which becomes very easy to compute in advance, instead of the current puzzle).
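For reference, the stack invariants the current merge_collapse() enforces can be modeled in a few lines of Python (a simplified sketch of the stack logic, not CPython's C code; `run_lens` holds the lengths of the pending runs, newest last, and `merge_at(i)` is assumed to merge the runs at slots i and i+1, leaving the combined length at slot i):

```python
def merge_collapse(run_lens, merge_at):
    # Sketch of timsort's merge_collapse() (with the 2015 correctness fix):
    # merge until, for adjacent runs A, B, C on the stack (C newest),
    # A > B + C and B > C hold everywhere.
    while len(run_lens) > 1:
        n = len(run_lens) - 2
        if ((n > 0 and run_lens[n-1] <= run_lens[n] + run_lens[n+1]) or
                (n > 1 and run_lens[n-2] <= run_lens[n-1] + run_lens[n])):
            if run_lens[n-1] < run_lens[n+1]:
                n -= 1          # merge the smaller neighbor first
            merge_at(n)
        elif run_lens[n] <= run_lens[n+1]:
            merge_at(n)
        else:
            break
```

In the real sort this runs after each new run is pushed, so the invariants are restored incrementally; the two interacting conditions above are exactly what makes the worst-case stack depth a puzzle, and what a scheme like 2-merge simplifies.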

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue34561>
_______________________________________
