concatenate fasta file

Roy Smith roy at panix.com
Fri Feb 12 11:23:54 EST 2010


In article 
<62a50def-e391-4585-9a23-fb91f2e2edc8 at b9g2000pri.googlegroups.com>,
 PeroMHC <macmanes at gmail.com> wrote:

> Hi All, I have  a simple problem that I hope somebody can help with. I
> have an input file (a fasta file) that I need to edit..
> 
> Input file format
> 
> >name 1
> tactcatacatac
> >name 2
> acggtggcat
> >name 3
> gggtaccacgtt
> 
> I need to concatenate the sequences.. make them look like
> 
> >concatenated
> tactcatacatacacggtggcatgggtaccacgtt
> 
> thanks. Matt

Some quick ideas.  First, try something along the lines of (not tested):

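import sys

data = []
for line in sys.stdin:
    # skip the '>name' header lines; keep only the sequence lines
    if line.startswith('>'):
        continue
    data.append(line.strip())
# emit a single header, then all the sequences joined together
print('>concatenated')
print(''.join(data))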

Second, check out http://biopython.org/wiki/Main_Page.  Biopython includes 
a FASTA parser (Bio.SeqIO), so somebody has very likely solved this 
problem before.
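
For what it's worth, the first idea can also be wrapped in a small function, 
which makes it easier to try out on the sample input from the question.  
This is only a sketch: the function name concatenate_fasta and its name 
parameter are made up for illustration, and it assumes the input is plain 
FASTA text where every header line starts with '>'.

```python
def concatenate_fasta(lines, name="concatenated"):
    """Join the sequence lines of every FASTA record into one record.

    `lines` is any iterable of text lines (a list, an open file, sys.stdin).
    `name` is the header text to use for the single output record.
    """
    parts = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '>name' header of each input record.
        if not line or line.startswith(">"):
            continue
        parts.append(line)
    # One header line, then the joined sequence, then a trailing newline.
    return ">%s\n%s\n" % (name, "".join(parts))


# The sample input from the original question.
sample = [
    ">name 1",
    "tactcatacatac",
    ">name 2",
    "acggtggcat",
    ">name 3",
    "gggtaccacgtt",
]
print(concatenate_fasta(sample))
```

Since the function only iterates over lines, passing sys.stdin or an open 
file object instead of the list should work the same way.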


