UNIX-style sort in Python?
Alex Martelli
aleaxit at yahoo.com
Mon Oct 18 03:27:44 EDT 2004
Andrew Dalke <adalke at mindspring.com> wrote:
> Kotlin Sam wrote:
> > % sort -t, +2 +5 imputfilename <return>
>
> > So, is there a module or function already available that does this?
>
> In newer Pythons (CVS and beta-1 for 2.4) you can do
>
> def get_fields(line):
>     fields = line.split("\t")
>     return fields[1], fields[4]
>
> sorted_lines = sorted(open("imputfilename"), key=get_fields)
Quite right -- and, of course, if Kotlin needs get_fields to depend on
the sys.argv parameters that's easy to arrange.
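One way this might be arranged is with a small factory that builds the key function from a list of field indices. This is only a sketch: make_get_fields, indices_from_argv, the script name and the "every argument is a 0-based field index" convention are all made up for illustration, and it assumes a Python that has sorted() (2.4 or later).

```python
def make_get_fields(indices, sep="\t"):
    """Build a key function returning the fields at the given 0-based indices."""
    def get_fields(line):
        # Drop the trailing newline so the last field compares cleanly.
        fields = line.rstrip("\n").split(sep)
        return tuple(fields[i] for i in indices)
    return get_fields

def indices_from_argv(argv):
    """Hypothetical convention: each argument after the script name is a field index."""
    return [int(arg) for arg in argv[1:]]

# Stand-in for sys.argv; a real script would pass sys.argv here instead.
example_argv = ["sortfields.py", "1", "4"]
get_fields = make_get_fields(indices_from_argv(example_argv))

# In-memory lines standing in for open("imputfilename").
sample = ["x\tb\tx\tx\t2\n", "x\ta\tx\tx\t9\n", "x\tb\tx\tx\t1\n"]
sorted_lines = sorted(sample, key=get_fields)
```

Passing sep="," to make_get_fields would match the -t, separator of the original sort command.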
> For older Pythons you'll need to do the "decorate-sort-undecorate"
> ("DSU") yourself, like this
>
> lines = [get_fields(line), line for line in open("imputfilename")]
Wrong syntax -- needs to be:
lines = [(get_fields(line), line) for line in open("imputfilename")]
> lines.sort()
> sorted_lines = [x[1] for x in lines]
>
> There is a slight difference between these two. If fields[1]
> and fields[4] are the same between two lines in the comparison
> then the first of these sorts by position of each line (it's
> a "stable sort") while the latter sorts by the content of the
> line.
...and to get exactly the same stable-sort semantics in 2.3, just change
the first one of the three statements to:
lines = [ (get_fields(line), i, line)
          for i, line in enumerate(open("imputfilename")) ]
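To make the difference concrete, here is a small self-contained sketch that runs both DSU variants on in-memory lines instead of a file. The helper names dsu_sort_stable and dsu_sort_plain and the sample data are invented for the illustration; the final comparison uses sorted(), so it assumes 2.4 or later.

```python
def get_fields(line):
    fields = line.split("\t")
    return fields[1], fields[4]

def dsu_sort_stable(lines):
    # Decorate with (key, original position, line): ties on the key
    # are broken by position, so the result is a stable sort.
    decorated = [(get_fields(line), i, line) for i, line in enumerate(lines)]
    decorated.sort()
    return [item[-1] for item in decorated]

def dsu_sort_plain(lines):
    # Decorate with (key, line) only: ties on the key fall through
    # to comparing the full text of the lines.
    decorated = [(get_fields(line), line) for line in lines]
    decorated.sort()
    return [item[-1] for item in decorated]

# The first two lines have equal keys ("k", "v") but different content.
sample = [
    "z\tk\t0\t0\tv\tlast\n",
    "a\tk\t0\t0\tv\tfirst\n",
    "m\tj\t0\t0\tv\tmiddle\n",
]

stable = dsu_sort_stable(sample)  # keeps the "z" line ahead of the "a" line
plain = dsu_sort_plain(sample)    # puts the "a" line ahead of the "z" line
```

With this data the stable variant gives the same order as sorted(sample, key=get_fields), while the plain variant swaps the two tied lines.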
Alex