[Python-Dev] Sorting
Tim Peters
tim.one@comcast.net
Sat, 27 Jul 2002 04:02:48 -0400
http://www.python.org/sf/587076
has collected timings on 5 boxes so far. I also noted that msort() gets a
32% speedup on my box when sorting a 1.33-million line snapshot of the
Python-Dev archive. This is a puzzler to account for, since you wouldn't
think there's significant pre-existing lexicographic order in a file like
that. McIlroy noted similar results from experiments on text, PostScript
and C source files in his adaptive mergesort (which is why I tried sorting
Python-Dev to begin with), but didn't offer a hypothesis.
Performance across platforms is a hoot so far, with Neal's box even seeing a
~6% speedup on *sort.
Skip's Pentium III acts most like my Pentium III, which shouldn't be
surprising. Ours are the only reports where !sort is faster than *sort for
samplesort, and also where ~sort under samplesort is faster than ~sort under
timsort.
~sort (only 4 distinct values, repeated N/4 times) remains the most puzzling
of the tests by far. Relative to its performance under samplesort,
sf userid ~sort speedup under timsort (negative means slower)
--------- ---------------------------------------------------
montanaro -23%
tim_one - 6%
jacobs99 +18%
lemburg +25%
nascheme +30%
Maybe it's a big win for AMD boxes, and a mixed bag for Intel boxes. Or
maybe it's a win for newer boxes, and a loss for older boxes. Or maybe it's
a bigger win the higher the clock rate (it hurt the most on the slowest box,
and helped the most on the fastest). Since it ends up doing a sequence of
perfectly balanced merges from start to finish, I thought perhaps it has to
do with OS and/or chip intelligence in read-ahead cache optimizations -- but
*sort also ends up doing a sequence of perfectly balanced merges, and
doesn't behave at all like ~sort across boxes. ~sort does exercise the
galloping code much more than other tests (*sort rarely gets into galloping
mode; ~sort never gets out of galloping mode), so maybe it really has most
to do with cache design.
Whatever, it's starting to look like a no-brainer -- except for the
extremely mixed ~sort results, the numbers so far are great.