Newbie with sort text file question

Bengt Richter bokr at oz.net
Mon Jul 14 19:01:43 EDT 2003


On 13 Jul 2003 15:30:58 -0700, stuart_clemons at us.ibm.com (stuartc) wrote:

>Hi Bengt:
>
>Thank you. Your code worked perfectly based on the text file I
>provided.
>
>Unfortunately for me, my real text file has one slight variation that
>I did not account for.  That is, the fruit name does not always have
>an "_" after its name.  For example, apple below does not have an "_"
>attached to it.
>
>banana_c \\yellow
>apple   \\green
>orange_b \\yellow
>
OK, try the changes below, which do the same thing but use the re module:
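
To see why the regular expression helps, here is a small stand-alone sketch
(not part of the original script) comparing the old split('_', 1) with the
pattern used in the changes below. The names lines, split_parts and
regex_parts are only for illustration.

```python
import re

# Same pattern as in the changes below: leading letters, then the rest of the line.
rxo = re.compile(r'^([A-Za-z]+)(.*)$')

# The three sample lines from the question (raw strings keep both backslashes).
lines = [r'banana_c \\yellow', r'apple   \\green', r'orange_b \\yellow']

# split('_', 1) returns a one-element list when the line has no underscore,
# so the fruit name is not separated from the rest of the line.
split_parts = [line.split('_', 1) for line in lines]

# The regex always returns a (name, rest) pair, with or without an underscore.
regex_parts = [rxo.search(line).groups() for line in lines]

print(split_parts)
print(regex_parts)
```

Since every line now yields a two-element tuple whose first element is just
the name, the later steps can rely on that shape.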

<snip>
>> 
>> ===< stuartc.py >========================================================
>> import StringIO
>> textf = StringIO.StringIO(r"""
>> banana_c \\yellow
>> apple_a \\green
>> orange_b \\yellow
>> banana_d \\green
>> orange_a \\orange
>> apple_w \\yellow
>> banana_e \\green
>> orange_x \\yellow
>> orange_y \\orange
>> """)
>> 
>> # I would like two output files:
>> # (actually two files ?? Ok)
>> 
>> # 1) Sorted like this, by the fruit name (the name before the dash)
>>
   import re
   rxo = re.compile(r'^([A-Za-z]+)(.*)$')
   #XXX#fruitlist = [line.split('_',1) for line in textf if line.strip()]
   fruitlist = [rxo.search(line).groups() for line in textf if line.strip()]
>> fruitlist.sort()
>> 
>> # apple_a \\green
>> # apple_w \\yellow
>> # banana_c \\yellow
>> # banana_d \\green
>> # banana_e \\green
>> # orange_a \\orange
>> # orange_b \\yellow
>> # orange_x \\yellow
>> # orange_y \\orange
>> 
>> outfile_1 = StringIO.StringIO()
   #XXX# >> outfile_1.write(''.join(['_'.join(pair) for pair in fruitlist]))
   outfile_1.write('\n'.join([''.join(pair) for pair in fruitlist]+['']))
>> 
>> # 2) Then summarized like this, ordered with the highest occurrences
>> # first:
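
For anyone who wants to paste and run the first step on its own, the sketch
below pulls the changed lines together. It is an adaptation, not the exact
script above: it assumes Python 3, so io.StringIO stands in for the StringIO
module, and it uses the three sample lines from the question as input.

```python
import io
import re

# Sample data from the question, including a name with no underscore.
textf = io.StringIO(r"""
banana_c \\yellow
apple   \\green
orange_b \\yellow
""")

rxo = re.compile(r'^([A-Za-z]+)(.*)$')

# Each non-blank line becomes a (name, rest) tuple. Neither '.' nor the
# position matched by '$' includes the trailing newline, so it is dropped.
fruitlist = [rxo.search(line).groups() for line in textf if line.strip()]

# Tuples sort by name first, then by the rest of the line.
fruitlist.sort()

outfile_1 = io.StringIO()
# Rejoin each pair with no separator, and put the newlines back,
# including one after the last line.
outfile_1.write('\n'.join([''.join(pair) for pair in fruitlist] + ['']))

result = outfile_1.getvalue()
print(result, end='')
```

A real file opened with open() can be used in place of either StringIO
object.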

HTH

Regards,
Bengt Richter



