Large Dictionaries

Mon Jun 5 10:52:22 EDT 2006

[Scott David Daniels]
>> For example, time timsort (Python's internal sort) on pre-sorted
>> data; you'll find it is handled faster than random data.

O(N) vs O(N log N), in fact.

[Lawrence D'Oliveiro]
> But isn't that how a reasonable sorting algorithm should behave? Less
> work to do if the data is already sorted?

For example, the O(N log N) heapsort is unreasonable compared to the
O(N**2) bubblesort by that reasoning (pre-sorted is actually a bad
case for heapsort, but a good case for bubblesort)?  O(N log N)
sorting algorithms helped by pre-existing order are uncommon, unless
they do extra work to detect and exploit pre-existing order.

Python's current sorting algorithm exploits pre-existing order of many
kinds.  For example, take a sorted list, "cut it in half, and riffle
shuffle the two halves together"; e.g.,

[0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15]

That turns out to be a supernaturally good case too.