Surprising (for me) benchmark results...

Tim Peters tim.one at home.com
Tue May 1 21:00:29 EDT 2001


[E. Mark Ping]
> In Perl and Java, the builtin sort, and in C++ std::sort.  Why would
> you expect this to be best in Python?

[Michael Hudson]
> 'cause Tim Peters wrote the code that sorts lists in Python,
> basically.

I think that's an excellent reason <wink> -- but for believing a Python sort
written by C++, Java or Perl nerds would be slower than the one Python uses.

Python's sort is highly non-obvious, and beat a dozen variations of
quicksort, another dozen of mergesort, and a scattering of more esoteric
ones.  When you're just sorting strings, though, Python endures considerable
work on each comparison to figure out all over from scratch again *how* to
compare two strings.  It's still an O(N log N) method overall, and the
constant factor is boosted by the repeated "how to compare?" problem since
the number of comparisons is of the same order.

So while Python's sort wins by minimizing the total number of comparisons, it
loses by doing more work per comparison.  Especially since rich comparisons
got added, "a comparison" in Python is an all-singing all-dancing entire OO
subsystem of its own.

> ...
> Fair enough.  I wonder where the speed is then?  Might be in creating
> the list/vector before sorting?  The code in
> Objects/fileobject.c:file_readlines seems quite clever, so that might
> be it.

Gets my vote!  I bet the sort doesn't hurt it, though <wink>.

modestly y'rs  - tim





More information about the Python-list mailing list