[Python-Dev] Optimizing list.sort() by checking type in advance

Mon Oct 10 11:29:09 EDT 2016

On Mon, Oct 10, 2016 at 5:18 PM, Elliot Gorokhovsky
<elliot.gorokhovsky at gmail.com> wrote:
> First, some simple benchmark results (numbers are seconds, check out the
> extension module at https://github.com/embg/python-fast-listsort.git):
>
> *** 1e3 ints ***
> F.fastsort(): 0.00018930435180664062
> F.sort(): 0.0002830028533935547
> *** 1e3 strings ***
> F.fastsort(): 0.0003533363342285156
> F.sort(): 0.00044846534729003906
> *** 1e7 ints ***
> F.fastsort(): 5.479267358779907
> F.sort(): 8.063318014144897
> *** 1e7 strings ***
> F.fastsort(): 9.992833137512207
> F.sort(): 13.730914115905762
>
> The optimization uses the fact that, in practice, we are almost always
> sorting keys of the same type

(Might be a topic for python-ideas rather than python-dev.)

Can you pick up a standard benchmark suite and use that instead of
these simplistic operations? How much slower are sorts of
heterogeneous lists - what's the impact/cost on those? If a large test
suite ("make test" if you don't have a nice benchmark handy - the
CPython test suite is a solid mass of code) shows an overall slowdown,
this probably isn't worth pursuing; if it shows a minor but
insignificant increase, there might be something worth looking into;
and if it shows a huge increase... well, to be honest, if it shows a
huge increase, I'd suspect a fault in the measurements, because
sorting isn't a significant part of most test suites :)

ChrisA