Q: sort's key and cmp parameters

Raymond Hettinger python at rcn.com
Mon Oct 5 21:48:59 EDT 2009


> > >  Say you want to change the numeric comparisons so that
> > > even numbers always sort before odd numbers, ie.
> > >     -4 < -2 < 0 < 2 < 4 < ... < -999 < ... -1 < 1 < 3 ...
>
> > This is too easy:
>
> >     >>> s = [-2, 4, 2, -1, -3, 1, -4, 0, 3]
> >     >>> s.sort()
> >     >>> s.sort(key=lambda x: x%2)
> >     >>> s
> >     [-4, -2, 0, 2, 4, -3, -1, 1, 3]
>
> I don't think so:
>
>     >>> s = [0, 2, -2, 3, 1, -1, 4, -4, -3]
>     >>> s.sort(key=lambda x:x%2)
>     >>> s
>     [0, 2, -2, 4, -4, 3, 1, -1, -3]

There were two sorts in my post and only one in yours.
That's why you didn't get the same answer.


> s.sort(key=lambda x:(x%2,x)) might work but why are you so opposed
> to being able to write one's natural thought pattern if that pattern
> happens to be a comparison function?  Who wants to concoct these
> ad-hoc encodings for every ordering?

Your (x%2, x) idea seems like a good way to do it in a single pass.
Also, it seems to be a natural translation of the problem statement,
"even numbers always sort before odd numbers", into a primary key
and secondary sort key.


> How about this: gen_e, gen_pi, gen_sqrt2, etc. all generate
> representations of various real numbers as possibly infinite sequences
> of digits:
>
>    def gen_pi():
>       yield 3
>       yield 1
>       yield 4
>       # ... compute infinite stream one digit at a time
>
> Comparison is obvious if (hmmm....) you can guarantee that you're
> sorting unique elements and that the sort function won't compare an
> element with itself (otherwise I guess you could cut off comparison at
> a million digits or whatever):
>
>    def cmp(gen1, gen2):
>      for d,e in izip(gen1(), gen2()):
>         c = cmp(d,e)
>         if c: return c
>
> I don't see any clean way to handle that with key=, though of course
> maybe I'm missing something.  

Right.  I would be forced to use a Cmp2Key wrapper for this one.

If sorting lazily evaluated infinite sequences is a common use
case for you, then you're going to love Haskell ;-)


> I don't hate sorting in py3.x.

That's good to hear.


>  I understand the concept that py3.x
> introduces incompatibilities to make things better.  What I don't like
> is the approach of breaking stuff gratutiously with no benefit.  If
> the positional arg for cmp is a kludge, why not switch it to a keyword
> arg instead of eliminating it?

When Guido made the final call, I'm sure he was balancing
a number of goals including language simplification and
one-way-to-do-it.  I'm also sure that sorting lazily
evaluated infinite sequences was not on his radar screen :-)


>  key= is very useful but it is
> a derived operation, not the other way around.

As you showed with infinite sequence example, it may well
be the case that cmp is more general and can more readily
handle an exotic case.

Psychologically, the thing that I find to be interesting is
that beginners and intermediate users seem to take to key
functions more readily than old timers.  The key function seems
to be an easy thing to teach (perhaps because that's the
way Excel sorts and the way SQL sorts, or perhaps it is because
problem statements seem to arise in the form of "sort by this,
then by that" instead of "here's how two objects should be
compared").

In contrast, some people who have have had deep experience with
cmp functions may tend to first think of cmp solutions and then
have a harder time seeing solutions with key functions.  If you
grew-up on C's qsort() like I did, then a key function may not
be the first thing that pops into your head.

One other interesting data point:  Python uses key functions
in min(), max(), heapq.nsmallest(), heapq.nlargest, and
itertools.groupby().  Those functions never supported a
cmp function argument, nor has one ever been requested.


Raymond



More information about the Python-list mailing list