[Tutor] Concatenating multiple lines into one
Peter Otten
__peter__ at web.de
Fri Feb 10 18:08:28 CET 2012
Spyros Charonis wrote:
> Dear python community,
>
> I have a file where I store sequences that each have a header. The
> structure of the file is as such:
>
>>sp|(some code) =>1st header
> ATTTTGGCGG
> MNKPLOI
> .....
> .....
>
>>sp|(some code) => 2nd header
> AAAAAA
> GGGG ...
> .........
>
> ......
>
> I am looking to implement a logical structure that would allow me to group
> each of the sequences (spread on multiple lines) into a single string. So
> instead of having the letters spread on multiple lines I would be able to
> have 'ATTTTGGCGGMNKP....' as a single string that could be indexed.
>
> This snippet is good for isolating the sequences (=stripping headers and
> skipping blank lines) but how could I concatenate each sequence in order
> to get one string per sequence?
>
> >>> for line in align_file:
> ...     if line.startswith('>sp'):
> ...         continue
> ...     elif not line.strip():
> ...         continue
> ...     else:
> ...         print line
>
> (... is just OS X terminal notation, nothing programmatic)
>
> Many thanks in advance.
Instead of printing the line directly, collect it in a list (without the
trailing "\n"). When you encounter a line starting with ">sp", check whether
that list is non-empty; if so, print "".join(parts), assuming the list is
called parts, and start over with a fresh list. Don't forget to print any
leftover data in the list once the for loop has terminated.
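
A minimal sketch of that approach, written as a generator so each joined
sequence can be printed or indexed. The function name join_sequences and the
sample lines are made up for illustration; in practice you would pass in the
open file (align_file in the question) instead of the list.

```python
def join_sequences(lines):
    """Yield one concatenated string per sequence (hypothetical helper)."""
    parts = []
    for line in lines:
        if line.startswith(">sp"):
            # New header: emit the sequence collected so far, if any,
            # and start over with a fresh list.
            if parts:
                yield "".join(parts)
                parts = []
        elif line.strip():
            # Sequence line: store it without the trailing newline.
            parts.append(line.strip())
        # Blank lines are skipped.
    # Leftover data once the loop has terminated.
    if parts:
        yield "".join(parts)


# Made-up sample data standing in for the real file.
sample = [
    ">sp|P1 first header\n",
    "ATTTTGGCGG\n",
    "MNKPLOI\n",
    "\n",
    ">sp|P2 second header\n",
    "AAAAAA\n",
    "GGGG\n",
]

sequences = list(join_sequences(sample))
for seq in sequences:
    print(seq)
```

Each item in sequences is an ordinary string, so sequences[0][3] gives the
fourth letter of the first sequence.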