sorting data
Ricardo Aráoz
ricaraoz at gmail.com
Mon Oct 29 12:32:55 EDT 2007
Beema shafreen wrote:
> hi all,
> I have a problem sorting my data. The file contains data as
> follows.
> file:
> chrX: 123343123 123343182 A_16_P41787782
> chrX: 123343417 123343476 A_16_P03762840
> chrX: 123343460 123343519 A_16_P41787783
> chrX: 12334336 12334395 A_16_P03655927
> chrX: 123343756 123343815 A_16_P03762841
> chrX: 123343807 123343866 A_16_P41787784
> chrX: 123343966 123344024 A_16_P21578670
> chrX: 123344059 123344118 A_16_P21578671
> chrX: 12334438 12334497 A_16_P21384637
> chrX: 123344776 123344828 A_16_P21578672
> chrX: 123344811 123344870 A_16_P03762842
> chrX: 123345165 123345224 A_16_P41787789
> chrX: 123345360 123345419 A_16_P41787790
> chrX: 123345380 123345439 A_16_P03762843
> chrX: 123345481 123345540 A_16_P41787792
> chrX: 123345873 123345928 A_16_P41787793
> chrX: 123345891 123345950 A_16_P03762844
>
>
> How do I sort the file based on the values in columns 1 and 2?
> Using the sort option works for only one column and not for the other. How do
> I sort on both the 1st and 2nd columns so that the third column does not change?
> my script:
> #sorting the file
> start_lis = []
> end_lis = []
> fh = open('chromosome_location_346010.bed','r')
> for line in fh.readlines():
>     data = line.strip().split('\t')
>     start = data[1].strip()
>     end = data[2].strip()
>     probe_id = data[3].strip()
>     start_lis.append(start)
>     end_lis.append(end)
> start_lis.sort()
> end_lis.sort()
> for k in start_lis:
>     for i in end_lis
>         print k , i , probe_id   # (this does not work)
> result = start#end#probe_id ------->this does not work...
> print result
>
> What is the error, and how do I sort a file based on the two columns so
> that I get the fourth column along with them?
> regards
> shafreen
>
Don't know if this is what you are looking for :
dataList = []
for line in open('chromosome_location_346010.bed','r') :
    data = line.strip().split('\t')
    start = data[1].strip()
    end = data[2].strip()
    probe_id = data[3].strip()
    dataList.append((start, end, probe_id))
# Right-justify the numeric strings so that they compare like numbers,
# then sort on start (x[0]) followed by end (x[1]).
dataList.sort(key=lambda x: x[0].rjust(20) + x[1].rjust(20))
for item in dataList:
    print 'Start :', item[0].rjust(11) \
        , ' - End :', item[1].rjust(11) \
        , ' - Probe :', item[2]
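
If padding the strings feels awkward, another option is to convert the positions to integers and sort on a (start, end) tuple. Each row stays together as one tuple, so the probe id travels with its coordinates. Plain string sorting is what goes wrong in the original script: as strings, '123343123' sorts before '12334336'. What follows is only a sketch, written for Python 3 (print as a function). The helper name parse_rows is made up for illustration, the sample rows are copied from the question, and it assumes the columns are separated by whitespace rather than strictly by tabs.

```python
def parse_rows(lines):
    """Turn each line into a (start, end, probe_id) tuple with integer positions."""
    rows = []
    for line in lines:
        fields = line.split()      # assumption: whitespace-separated columns
        if len(fields) < 4:
            continue               # skip blank or malformed lines
        rows.append((int(fields[1]), int(fields[2]), fields[3]))
    return rows


# A few rows taken from the question, deliberately out of order.
sample = [
    "chrX: 123343123 123343182 A_16_P41787782",
    "chrX: 12334336 12334395 A_16_P03655927",
    "chrX: 123343417 123343476 A_16_P03762840",
    "chrX: 12334438 12334497 A_16_P21384637",
]

rows = parse_rows(sample)

# Sort numerically on start, then on end; each tuple moves as a unit.
rows.sort(key=lambda row: (row[0], row[1]))

for start, end, probe_id in rows:
    print("Start : %11d - End : %11d - Probe : %s" % (start, end, probe_id))
```

To run it on the real file, something like parse_rows(open('chromosome_location_346010.bed')) should work in place of the sample list, provided the file has the four columns shown in the question.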