UNIX-style sort in Python?

Alex Martelli aleaxit at yahoo.com
Mon Oct 18 03:27:44 EDT 2004


Andrew Dalke <adalke at mindspring.com> wrote:

> Kotlin Sam wrote:
> >   % sort -t, +2 +5 imputfilename <return>
> 
> >   So, is there a module or function already available that does this?
> 
> In newer Pythons (CVS and beta-1 for 2.4) you can do
> 
> def get_fields(line):
>    fields = line.split("\t")
>    return fields[1], fields[4]
> 
> sorted_lines = sorted(open("imputfilename"), key=get_fields)

Quite right -- and, of course, if Kotlin needs get_fields to depend on
the sys.argv parameters that's easy to arrange.
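
One way that could look, as a rough sketch rather than anything from the
thread: make_get_fields, sort_lines and main are hypothetical names, and the
command-line convention (a filename followed by 0-based field indices) is
an assumption. The separator defaults to tab, as in the quoted get_fields;
the original sort command used -t, so sep="," would match that instead.
The sketch assumes a current Python.

```python
import sys

def make_get_fields(indices, sep="\t"):
    """Build a key function returning the chosen 0-based fields (hypothetical helper)."""
    def get_fields(line):
        # Strip the trailing newline so it never becomes part of the last field.
        fields = line.rstrip("\n").split(sep)
        return tuple(fields[i] for i in indices)
    return get_fields

def sort_lines(lines, indices, sep="\t"):
    """Return the lines sorted by the selected fields; sorted() is stable."""
    return sorted(lines, key=make_get_fields(indices, sep))

def main(argv):
    # Assumed usage: script.py FILENAME INDEX [INDEX ...]
    filename = argv[1]
    indices = [int(arg) for arg in argv[2:]]
    with open(filename) as f:
        sys.stdout.writelines(sort_lines(f, indices))
```

Calling main(sys.argv) from the script's entry point would then sort by
whichever fields were named on the command line.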


> For older Pythons you'll need to do the "decorate-sort-undecorate"
> ("DSU") yourself, like this
> 
> lines = [get_fields(line), line for line in open("imputfilename")]

Wrong syntax -- needs to be:

lines = [(get_fields(line), line) for line in open("imputfilename")]

> lines.sort()
> sorted_lines = [x[1] for x in lines]
> 
> There is a slight difference between these two.  If fields[1]
> and fields[4] are the same between two lines in the comparison
> then the first of these sorts by position of each line (it's
> a "stable sort") while the latter sorts by the content of the
> line.

...and to get exactly the same stable-sort semantics in 2.3, just change
the first one of the three statements to:

lines = [ (get_fields(line), i, line)
          for i, line in enumerate(open("imputfilename")) ]
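
To make the difference concrete, here is a small sketch comparing the two
DSU variants with sorted(..., key=...). The helper names dsu_sort and
stable_dsu_sort and the sample data are made up for illustration; the two
sample lines have equal keys, so only the tie-breaking differs.

```python
def get_fields(line):
    fields = line.split("\t")
    return fields[1], fields[4]

def dsu_sort(lines):
    # Plain DSU: ties on the key fall through to comparing the whole line.
    decorated = [(get_fields(line), line) for line in lines]
    decorated.sort()
    return [item[-1] for item in decorated]

def stable_dsu_sort(lines):
    # DSU with the index: ties are broken by original position,
    # so the line text itself is never compared.
    decorated = [(get_fields(line), i, line) for i, line in enumerate(lines)]
    decorated.sort()
    return [item[-1] for item in decorated]

# Two lines whose fields[1] and fields[4] are identical.
lines = [
    "zebra\tk\t-\t-\tv\n",
    "apple\tk\t-\t-\tv\n",
]

by_content = dsu_sort(lines)          # "apple" line first
by_position = stable_dsu_sort(lines)  # input order kept
by_key = sorted(lines, key=get_fields)  # input order kept as well
```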


Alex


