Perl-Python-a-Day: Sorting

Xah Lee xah at xahlee.org
Mon Oct 10 05:51:37 EDT 2005


Sort a List

Xah Lee, 200510

In this page, we show how to sort a list in Python & Perl and also
discuss some math of sort.

To sort a list in Python, use the “sort” method. For example:

li=[1,9,2,3];
li.sort();
print li;


Note that sort is a method, and the list is changed in place.

Suppose you have a matrix, and you want to sort by second column.
Example Solution:

li=[[2,6],[1,3],[5,4]]
li.sort(lambda x, y: cmp(x[1],y[1]))
print li; # prints [[1, 3], [5, 4], [2, 6]]


The line “li.sort(lambda x, y: cmp(x[1],y[1]))” can also be written
as “li.sort(cmp=lambda x, y: cmp(x[1],y[1]))”

The argument to sort is a function of two arguments, that returns -1,
0, 1. This function is a decision function that tells sort() how to
decide the order of any two elements in your list. If the first
argument is “less” then second argument, the function should return
-1. If equal, then 0. Else, 1.

Here's a more complex example. Suppose you have a list of strings.

'my283.jpg'
'my23i.jpg'
'web7-s.jpg'
'fris88large.jpg'
...


You want to sort them by the number embedded in them. What you have to
do, is to provide sort() method a function, that takes two strings, and
compares the integer inside the string. Here's the solution:

li=[
'my283.jpg',
'my23i.jpg',
'web7-s.jpg',
'fris88large.jpg',
]

def myComp (x,y):
    import re
    def getNum(str): return float(re.findall(r'\d+',str)[0])
    return cmp(getNum(x),getNum(y))

li.sort(myComp)
print li # returns ['web7-s.jpg', 'my23i.jpg', 'fris88large.jpg',
'my283.jpg']


Here, we defined a function myComp to tell sort about the ordering.
Normally, one would use the “lambda” construct, but Python's lambda
construct can only represent the simplest functions.

Some Math about Sorting

In general, the function f used to determine the order of any two
element must satisfy some constraints:

• f(a,a)==0
• if f(a,b)==0 then f(b,a)==0
• if f(a,b)==0 and f(b,c)==0, then f(a,c)==0.
• if f(a,b)==-1 and f(b,c)==-1, then f(a,c)==-1.
• if f(a,b)==-1, then f(b,a)==1.


If the comparison function does not behave as the above, then it is not
consistent, meaning that the result “ordered” list is may actually
be different depending how the language happens to implement sort.

The significance of all these is that in real software you may want to
sort a list of non-simple entities by a specialized ordering. For
example, you may want to sort a list of polygonal surfaces in 3D space,
for particular reasons in implementing some computer graphics features.
Say, you want to sort these polygons by their spacial orientations. It
is in advanced cases like these, understanding the basic math about
ordering is important. Otherwise, you might have a bewildering result
yet unable to locate any flaws in your code.

Python's “sort” method's optional parameters: “key” and
“reverse”

Most of the time, sorting is done for a list of atomic element such as
[3,2,4]. This is simply done by myList.sort() without any argument.
Other than simple list, sort is frequently used on matrixes (e.g.
[[2,6],[1,3],[5,4]]). For matrixes, almost always a particular column
is used for the basis of ordering. For example, if we want to sort by
second column, we do: “li.sort(lambda x, y: cmp(x[1],y[1]))”. Since
this is frequently used, Python provides a somewhat shorter syntax for
it, by specifying the column used as the ordering “key”. For
example:

li=[[2,6],[1,3],[5,4]]
li.sort(key=lambda x:x[1] ) # is equivalent to the following
#li.sort(lambda x, y: cmp(x[1],y[1]))
print li; # prints [[1, 3], [5, 4], [2, 6]]


Because Python's implementation is not very refined , this specialized
syntax is actually much speedier than the general form “lambda x, y:
cmp(x[1],y[1])”. It is a burden on the programer to always use the
“key” syntax idiosyncrasy if he is sorting a large matrix.

Another idiosyncratic provision is the optional “reverse” argument.
This parameter is somewhat necessary when using the “key”
parameter. One can reverse the ordering by using the “reverse”
keyword as a argument to sort. Example:

The following are equivalent:

li.sort(key=lambda x:x[1], reverse=True )
li.sort(lambda x, y: cmp(x[1],y[1]), reverse=True)
li.sort(lambda x, y: cmp(y[1],x[1]))


The official doc on Python's sort method is at (bottom):
http://python.org/doc/2.4/lib/typesseq-mutable.html

Sorting in Perl

(to be posted in a couple of days)

This post is archived at:
http://xahlee.org/perl-python/sort_list.html

 Xah
 xah at xahlee.orghttp://xahlee.org/




More information about the Python-list mailing list