Sorting with only a partial order definition

Lasse Vågsæther Karlsen lasse at vkarlsen.no
Thu Oct 27 04:57:26 EDT 2005


I have a list of items and a "rule" for ordering them.

Unfortunately, the rule is not complete so it won't define the correct 
order for any two items in that list.

In other words, if I pick two random items from the list I may or may 
not have a rule that dictates the order of those two items. The rule 
could be "implicit" in that I got rules for other items, for instance:

[a, b, c, d]
a < b
b < c

If I now pick a and c out of the list I would not know wether a or c 
comes first, unless I grab the rules for a<b and b<c and imply that a<c 
from those two.

As such, there could be potentially many correct results from ordering 
such a list. (d is unspecified above so any position for d is actually 
legal)

Is there anything in Python that will help me sort a list with something 
like this ?

For instance, given the following list of items:

items = ["a", "b", "c", "d"]

and the following two rules:

"a" comes before "d"
"d" comes before "b"

then the following is a list of correct results (could be more though):

["a", "d", "b", "c"]
["a", "c", "d", "b"]
["a", "d", "c", "b"]
...

Note that this is an arbitrary example. The real list is a list of lists 
containing database rows, ie. something like this:

[[10, "Column 2", "Column 3"],
  [15, "Column 2", "Column 3"],
  ...]

If there isn't anything built in, does anyone have an idea about how to 
go about creating such an ordering function ? Tweaking a cmp-like 
function to return 0 for "undefined" didn't seem to work so there must 
be a slightly more intelligent solution than that. Perhaps the rules 
would have to be checked in a specific order.

Doing a combinatorial solution and checking the rules aren't really an 
option as on last count there was close to 20K rows.

There's also a lot of rules (also coming from a database). I was 
thinking I could do a sort based on one of the rules and use a stable 
sort for the following rules but doing that sometimes rearranged the 
list so that it was now incorrect going by the previous rules.

Here's my initial test for doing a cmp-like solution:

order = set([("a", "d"), ("d", "b")])
def my_cmp(a, b):
     if (a, b) in order:
         print a, "<", b
         return -1
     elif (b, a) in order:
         print a, "<", b
         return +1
     else:
         print a, "?", b
         return 0

items = ["c", "b", "a", "d"]
items.sort(cmp=my_cmp)
print items

This prints out:

b ? c
a ? b
d < a
['c', 'b', 'a', 'd']

Which is obviously incorrect.

Any help or pointers would be most welcome.

-- 
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:lasse at vkarlsen.no
PGP KeyID: 0x2A42A1C2



More information about the Python-list mailing list