"Natural" use of cmp= in sort

Tue Nov 11 02:44:56 EST 2014

On Tuesday, 11 November 2014 06:37:18 UTC, Ian  wrote:
> On Mon, Nov 10, 2014 at 8:09 PM, Paddy <paddyxxx-at-xmail.com> wrote:
> > On Monday, 10 November 2014 18:45:15 UTC, Paddy  wrote:
> >> Hi, I do agree with                                                                          Raymond H. about the relative merits of cmp= and key= in sort/sorted, but I decided to also not let natural uses of cmp= pass silently.
> >>
> >> In answering this question, http://stackoverflow.com/a/26850434/10562 about ordering subject to inequalities it seemed natural to use the cmp= argument of sort rather than key=.
> >>
> >> The question is about merging given inequalities to make 1 inequality such that the inequalities also stays true.
> >>
> >>
> >
> > Thanks Peter, Ian. I have modified my code to expand transitive relations and ask you to view it on stackoverflow via the original link (as posting code on newsgroups is an ugly hack).
> 
> You still run into trouble though if the given inequalities don't
> provide enough information for a total ordering. E.g.:
> 
> >>> ' > '.join(extract_relations("""f4 > f1
> ... f2 > f3"""))
> 'f1 > f2 > f3 > f4'
> 
> By adding some debugging prints, we can see what cmp calls were made
> by the sort routine and what the results were:
> 
> cmp('f2', 'f1') -> 1
> cmp('f3', 'f2') -> 1
> cmp('f4', 'f3') -> 1
> 
> There is no information about the relative order of f2 and f1, so the
> cmp function just returns 1 there.
> f2 is known to be greater than f3, so that call correctly returns 1.
> There is again no information about the relative order of f4 and f3,
> so it again just returns 1. However, this is inconsistent with the
> first comparison that placed f1 > f2, because it implies that f1 > f4.
> 
> As you can see, giving an inconsistent cmp function to sort produces
> bogus results. If you only have a partial ordering of the inputs, you
> need to make sure that the cmp function you provide is consistent with
> *some* total ordering.
> 
> Another issue is that your expand_transitive_relations function is I
> think O(n**3 log n), which looks unattractive compared to the O(n**2)
> topological sort given in the other answers. Another advantage of the
> topological sort is that it will detect if the graph is cyclic (i.e.
> the input data itself is inconsistent), rather than just return a
> bogus output.
> 
> > My main reason for the post to c.l.p remains though; it seems like a *natural* use of the cmp= comparator function to sorted rather than using key= .
> 
> There are cases where a cmp function is more natural than a key
> function, but for these we have the functools.cmp_to_key adapter.

Thanks Ian. The original author states "...and it is sure that the given inputs will give an output, i.e., the inputs will always be valid.", which could be taken as meaning that all inputs are sufficient, well formed, and contain all relations as their first example does.

In that case, expand_transitive_relations is not even needed. Lets say it isn't for the sake of argument, then we are left with the direct use of cmp= versus a conversion to a key= function.

It seems to me that *in this case* the cmp= function naturally flows from the solution algorithm and that cmp_to_key is less so.

Yes, I knew that there are cases where a cmp function is more natural than key; the idea is to squirrel out a few. We have already made the, (well reasoned in my opinion), decision to go down the key= route in Python 3. I also like to track where my algorithms might originally map to cmp=. (It is not often).

My only other case of this type is here: http://stackoverflow.com/questions/15797120/can-this-cmp-function-be-better-written-as-a-key-for-sorted.