pythonic way to sort

Robert Kern robert.kern at gmail.com
Thu May 4 00:36:34 EDT 2006


micklee74 at hotmail.com wrote:
> hi
> I have a file with columns delimited by '~' like this:
> 
> 1SOME STRING      ~ABC~12311232432D~20060401~00000000
> 2SOME STRING      ~DEF~13534534543C~20060401~00000000
> 3SOME STRING      ~ACD~14353453554G~20060401~00000000
> 
> .....
> 
> What is the Pythonic way to sort this type of structured text file?
> Say I want to sort by the 2nd column, i.e. ABC, ACD, DEF, so that it becomes
> 
> 1SOME STRING      ~ABC~12311232432D~20060401~00000000
> 3SOME STRING      ~ACD~14353453554G~20060401~00000000
> 2SOME STRING      ~DEF~13534534543C~20060401~00000000
> ?
> I know, for a start, that I have to split on '~', then append all the
> second columns into a list, then sort the list using sort(), but I am
> stuck on how to get the rest of the corresponding columns after the
> sort...

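In Python 2.4 and up, you can pass the key= keyword argument to list.sort().
The key function is called once on each line, and the lines are ordered by the
values it returns, so every line stays intact and you never have to match the
other columns back up after the sort. E.g.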

In [2]: text = """1SOME STRING      ~ABC~12311232432D~20060401~00000000
   ...: 2SOME STRING      ~DEF~13534534543C~20060401~00000000
   ...: 3SOME STRING      ~ACD~14353453554G~20060401~00000000"""

In [3]: lines = text.split('\n')

In [4]: lines
Out[4]:
['1SOME STRING      ~ABC~12311232432D~20060401~00000000',
 '2SOME STRING      ~DEF~13534534543C~20060401~00000000',
 '3SOME STRING      ~ACD~14353453554G~20060401~00000000']

In [5]: lines.sort(key=lambda x: x.split('~')[1])

In [6]: lines
Out[6]:
['1SOME STRING      ~ABC~12311232432D~20060401~00000000',
 '3SOME STRING      ~ACD~14353453554G~20060401~00000000',
 '2SOME STRING      ~DEF~13534534543C~20060401~00000000']
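
If you want something reusable, the same idea can be wrapped in a small
function. This is only a sketch: the name sort_by_column and its column and
sep parameters are my own invention, not anything from the standard library.
It uses the builtin sorted(), which also accepts key= and returns a new list
instead of sorting in place.

```python
def sort_by_column(lines, column=1, sep='~'):
    """Return a new list of lines ordered by one separator-delimited column.

    sort_by_column, column and sep are illustrative names, not a library API.
    """
    # The key function extracts a single field. sorted() compares only that
    # field but returns the complete, unmodified lines.
    return sorted(lines, key=lambda line: line.split(sep)[column])


records = [
    '1SOME STRING      ~ABC~12311232432D~20060401~00000000',
    '2SOME STRING      ~DEF~13534534543C~20060401~00000000',
    '3SOME STRING      ~ACD~14353453554G~20060401~00000000',
]

# Sort on the second column (index 1); the input list is left untouched.
for line in sort_by_column(records):
    print(line)
```

For a real file you would build the list from the file's lines first, with
the trailing newlines stripped. Note that a line containing no '~' at all
(a blank line, say) would raise IndexError in the key function, so you may
want to filter those out beforehand.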

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



