new sorting algorithm

Mon May 2 05:24:36 EDT 2022

Hi,

> The median-of-three partitioning technique makes that work reasonably
well, so it won't be pathologically slow

Just to be clear because I've wondered but haven't looked into it, we
know naive quicksorting of already-sorted data is pathalogical, but
median-of-3 is known to fix this pathology?

cheers

jan

On 02/05/2022, Chris Angelico <rosuav at gmail.com> wrote:
> On Mon, 2 May 2022 at 09:20, Dan Stromberg <drsalists at gmail.com> wrote:
>>
>>
>> On Sun, May 1, 2022 at 1:44 PM Chris Angelico <rosuav at gmail.com> wrote:
>>>
>>> On Mon, 2 May 2022 at 06:43, Dan Stromberg <drsalists at gmail.com> wrote:
>>> > On Sun, May 1, 2022 at 11:10 AM Chris Angelico <rosuav at gmail.com>
>>> > wrote:
>>> >>
>>> >> On Mon, 2 May 2022 at 01:53, Nas Bayedil <nbayedil at gmail.com> wrote:
>>> >> > We believe that using this method to develop completely new, fast
>>> >> > algorithms, approaching the speed of the famous *QuickSort*, the
>>> >> > speed of
>>> >> > which cannot be surpassed, but its drawback can be circumvented, in
>>> >> > the
>>> >> > sense of stack overflow, on some data.
>>> >>
>>> >> Hmm, actually TimSort *does* exceed the speed of quicksort for a lot
>>> >> of real-world data. For instance, if you take a large sorted list,
>>> >> append a handful of (unsorted) items to it, and then sort the list,
>>> >> TimSort can take advantage of the fact that the bulk of the list is
>>> >> sorted. It ends up significantly faster than re-sorting the entire
>>> >> list.
>>> >
>>> >
>>> > In fact, Timsort is O(n) for already-sorted data, while many quicksorts
>>> > are O(n^2) for already-sorted data.
>>> >
>>> > Quicksort can be salvaged by using a median-of-3 partitioning, but it's
>>> > still O(n^2) in the (less common) worst case.
>>> >
>>>
>>> This is true, but purely sorted data isn't a very practical case. The
>>> case of mostly-sorted data IS practical, though, so it's a quite big
>>> win that it can be close to O(n), and still faster than inserting each
>>> item individually.
>>
>>
>> You seem to be of the impression that nearly-sorted data isn't an uphill
>> battle with a straightforward quicksort.
>>
>> I'm having a hard time convincing myself of that.
>>
>
> The median-of-three partitioning technique makes that work reasonably
> well, so it won't be pathologically slow. It's hardly Quicksort's best
> feature, but it could easily be a lot worse. I'd have to check, but I
> think it still manages to be O(n log n). Merge sort, of course, is a
> lot more consistent, but the asymptotic cost is still broadly the
> same.
>
> But Timsort manages to be close to O(n) for sorted data, reversed
> data, nearly-sorted or nearly-reversed data, etc. Makes it very handy
> for jobs like "click a heading to sort by it", where you might add
> multiple sort keys.
>
> (Plus, Python's implementation has some cool tricks for small
> collections that make it quite efficient.)
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>