"Natural" use of cmp= in sort

Tue Nov 11 01:36:10 EST 2014

On Mon, Nov 10, 2014 at 8:09 PM, Paddy <paddy3118 at gmail.com> wrote:
> On Monday, 10 November 2014 18:45:15 UTC, Paddy  wrote:
>> Hi, I do agree with                                                                          Raymond H. about the relative merits of cmp= and key= in sort/sorted, but I decided to also not let natural uses of cmp= pass silently.
>>
>> In answering this question, http://stackoverflow.com/a/26850434/10562 about ordering subject to inequalities it seemed natural to use the cmp= argument of sort rather than key=.
>>
>> The question is about merging given inequalities to make 1 inequality such that the inequalities also stays true.
>>
>>
>
> Thanks Peter, Ian. I have modified my code to expand transitive relations and ask you to view it on stackoverflow via the original link (as posting code on newsgroups is an ugly hack).

You still run into trouble though if the given inequalities don't
provide enough information for a total ordering. E.g.:

>>> ' > '.join(extract_relations("""f4 > f1
... f2 > f3"""))
'f1 > f2 > f3 > f4'

By adding some debugging prints, we can see what cmp calls were made
by the sort routine and what the results were:

cmp('f2', 'f1') -> 1
cmp('f3', 'f2') -> 1
cmp('f4', 'f3') -> 1

There is no information about the relative order of f2 and f1, so the
cmp function just returns 1 there.
f2 is known to be greater than f3, so that call correctly returns 1.
There is again no information about the relative order of f4 and f3,
so it again just returns 1. However, this is inconsistent with the
first comparison that placed f1 > f2, because it implies that f1 > f4.

As you can see, giving an inconsistent cmp function to sort produces
bogus results. If you only have a partial ordering of the inputs, you
need to make sure that the cmp function you provide is consistent with
*some* total ordering.

Another issue is that your expand_transitive_relations function is I
think O(n**3 log n), which looks unattractive compared to the O(n**2)
topological sort given in the other answers. Another advantage of the
topological sort is that it will detect if the graph is cyclic (i.e.
the input data itself is inconsistent), rather than just return a
bogus output.

> My main reason for the post to c.l.p remains though; it seems like a *natural* use of the cmp= comparator function to sorted rather than using key= .

There are cases where a cmp function is more natural than a key
function, but for these we have the functools.cmp_to_key adapter.