[Tutor] extracting a column from many files

Bala subramanian bala.biophysics at gmail.com
Thu Feb 19 12:58:37 CET 2009


Hi,

I have to extract, say, column 1, column 2, ..., column 6 (six different
columns) from 10 different input files. The function "extract" that extracts
the columns works fine. Each column extracted from the input files has to be
written to its own output file, so I have to make 6 output files in total.
How should I loop over the writing of the output files?
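
One possible shape for such a loop, as a minimal sketch rather than a definitive answer: pair each list of columns with an explicit file name and iterate over the pairs. The names "shear" and "stretch", the helper write_columns and the sample values are all made up for illustration, and the sketch is written for Python 3. Note that zip stops at the shortest column.

```python
import os
import tempfile

def write_columns(named_columns, out_dir):
    """Write one tab-separated file per named column.

    named_columns maps a (hypothetical) output name to a list of columns,
    one column per input file; each column is a list of strings.
    """
    paths = []
    for name, per_file_columns in named_columns.items():
        # zip(*...) turns the list of columns into a sequence of rows.
        rows = zip(*per_file_columns)
        path = os.path.join(out_dir, name + ".txt")
        with open(path, "w") as out:
            for row in rows:
                out.write("\t".join(row) + "\n")
        paths.append(path)
    return paths

# Made-up data: two columns of interest, each taken from three input files.
example = {
    "shear": [["0.1", "0.2"], ["0.3", "0.4"], ["0.5", "0.6"]],
    "stretch": [["1.1", "1.2"], ["1.3", "1.4"], ["1.5", "1.6"]],
}
out_dir = tempfile.mkdtemp()
written = write_columns(example, out_dir)
```

Keeping the file name next to the data means the name never has to be derived from the data itself.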

Also, you had previously suggested the following way of creating a list of
row lists from a list of column lists:

rows = map(None, *listOfColumns)

I do not understand how this works.
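
For reference, a short sketch of what that idiom does. In Python 2, map with None as the function groups the i-th items of all its arguments into tuples and pads shorter arguments with None; the * unpacks the outer list so that each column becomes a separate argument. That form of map no longer exists in Python 3, where itertools.zip_longest behaves the same way, so the sketch below uses it instead. The sample data is made up.

```python
from itertools import zip_longest

# Made-up data: three columns, the last one shorter than the others.
list_of_columns = [
    ["a1", "a2", "a3"],
    ["b1", "b2", "b3"],
    ["c1", "c2"],
]

# The * unpacks the outer list, so this call is the same as
# zip_longest(["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2"]).
rows = list(zip_longest(*list_of_columns))

for row in rows:
    print(row)
# ('a1', 'b1', 'c1')
# ('a2', 'b2', 'c2')
# ('a3', 'b3', None)
```

The padding with None is what distinguishes this from plain zip, which would stop after the second row here.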

Thanks,
Bala

On Thu, Feb 19, 2009 at 12:38 PM, Kent Johnson <kent37 at tds.net> wrote:

> On Thu, Feb 19, 2009 at 5:41 AM, Bala subramanian
> <bala.biophysics at gmail.com> wrote:
> > Dear friends,
> >
> > I want to extract 6 different columns from many files and write them
> > to 6 separate output files. I took some help from the following link
> >
> > http://mail.python.org/pipermail/tutor/2004-November/033475.html
> >
> > to write one column from many input files to a particular output file.
> > Since I have to extract 6 such columns, I wanted to loop the output file
> > writing part.
>
> Do you want the resulting files to have a single column, or one column
> per input file? The mail you cite has one column per file.
>
> > This block of the script is shown in bold below. I see some odd output
> > file names.
>
> You are using the string representation of the values as the file
> name! What do you want to call the files?
>
> > Kindly suggest i) how best I should do this loop part, and
> > ii) explain how row=map(None,*value) below works, which I adopted
> > from the above tutor mailing list link.
>
> Please clarify what you want to do first.
> Kent
>
> >
> > Thanks in advance,
> > Bala
> >
> > #!/usr/bin/env python
> > from sys import argv
> > lst_files=argv[1:]
> >
> > sh=[];st=[];sta=[];buc=[];pro=[];ope=[]
> >
> > def extract(fname):
> >     A=[];B=[];C=[];D=[];E=[];F=[]
> >     data=open(fname).readlines()
> >     for number, line in enumerate(data):
> >         # Both markers must be present in the header line.
> >         if "  Duplex" in line and " Shear" in line:
> >             # The data rows start three lines below the header.
> >             number=number+3
> >             for x in range(0,8):
> >                 new=data[number]
> >                 A.append(new[19:26])
> >                 B.append(new[27:34])
> >                 C.append(new[37:42])
> >                 D.append(new[44:54])
> >                 E.append(new[56:63])
> >                 F.append(new[69:75])
> >                 number = number + 1
> >     sh.append(A)
> >     st.append(B)
> >     sta.append(C)
> >     buc.append(D)
> >     pro.append(E)
> >     ope.append(F)
> >
> > for x in lst_files:
> >     extract(x)
> >
> > list=[sh,st,sta,buc,pro,ope]
> > for value in list:
> >     row=map(None,*value)
> >     out=open(str(value) + '.txt','w')
> >     for num in row:
> >         out.write('\t'.join(num))
> >         out.write('\n')
> >     out.close()