[Tutor] Concatenating multiple lines into one

Spyros Charonis s.charonis at gmail.com
Sun Feb 12 20:38:51 CET 2012


Thanks for all the help. Peter's and Hugo's methods both worked well for
concatenating multiple lines into a single data structure!

S

On Fri, Feb 10, 2012 at 5:30 PM, Mark Lawrence <breamoreboy at yahoo.co.uk> wrote:

> On 10/02/2012 17:08, Peter Otten wrote:
>
>> Spyros Charonis wrote:
>>
>>> Dear Python community,
>>>
>>> I have a file where I store sequences that each have a header. The
>>> structure of the file is as follows:
>>>
>>> >sp|(some code) => 1st header
>>>
>>> ATTTTGGCGG
>>> MNKPLOI
>>> .....
>>> .....
>>>
>>> >sp|(some code) => 2nd header
>>>
>>> AAAAAA
>>> GGGG ...
>>> .........
>>>
>>> ......
>>>
>>> I am looking to implement a logical structure that would allow me to
>>> group each of the sequences (spread over multiple lines) into a single
>>> string. So instead of having the letters spread over multiple lines I
>>> would be able to have 'ATTTTGGCGGMNKP....' as a single string that could
>>> be indexed.
>>>
>>> This snippet is good for isolating the sequences (i.e. stripping headers
>>> and skipping blank lines), but how could I concatenate each sequence in
>>> order to get one string per sequence?
>>>
>>> >>> for line in align_file:
>>> ...     if line.startswith('>sp'):
>>> ...             continue
>>> ...     elif not line.strip():
>>> ...             continue
>>> ...     else:
>>> ...             print line
>>>
>>> (... is just OS X terminal notation, nothing programmatic)
>>>
>>> Many thanks in advance.
>>>
>>
>> Instead of printing the line directly, collect it in a list (without the
>> trailing "\n"). When you encounter a line starting with ">sp", check if
>> that list is non-empty, and if so print "".join(parts), assuming the list
>> is called parts, and start with a fresh list. Don't forget to print any
>> leftover data in the list once the for loop has terminated.
>>
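A minimal sketch of the approach Peter describes, written for Python 3 (the
thread itself uses Python 2's print statement). The function name
join_sequences, the header_prefix parameter and the sample data are
assumptions made for illustration; instead of printing each joined string, the
sketch collects them in a list so they can be indexed later.

```python
def join_sequences(lines, header_prefix=">sp"):
    """Return one concatenated string per sequence, in file order."""
    sequences = []
    parts = []  # lines of the sequence currently being read
    for line in lines:
        line = line.strip()  # drops the trailing "\n"
        if line.startswith(header_prefix):
            # A new header closes the previous sequence, if there was one.
            if parts:
                sequences.append("".join(parts))
            parts = []
        elif line:  # skip blank lines
            parts.append(line)
    # Don't forget the leftover data after the loop has terminated.
    if parts:
        sequences.append("".join(parts))
    return sequences


# Hypothetical sample data in the layout described in the question.
sample = """>sp|(some code)
ATTTTGGCGG
MNKPLOI

>sp|(some code)
AAAAAA
GGGG
""".splitlines()

print(join_sequences(sample))  # ['ATTTTGGCGGMNKPLOI', 'AAAAAAGGGG']
```

With a real file, an open file object can be passed in place of sample, since
iterating over it yields lines in the same way.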
>> _______________________________________________
>> Tutor maillist  -  Tutor at python.org
>> To unsubscribe or change subscription options:
>> http://mail.python.org/mailman/listinfo/tutor
>>
>>
> The advice from Peter is sound if the strings could grow very large but
> you can simply concatenate the parts if they are not.  For the indexing
> simply store your data in a dict.
>
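One way Mark's suggestion might look, again as a Python 3 sketch under
assumptions: the function name sequences_by_header and the headers ">sp|P1"
and ">sp|P2" are made up for illustration, and each header line is assumed to
be unique (a repeated header would overwrite the earlier entry). The parts are
concatenated directly with +=, which is fine when the sequences are not very
large.

```python
def sequences_by_header(lines, header_prefix=">sp"):
    """Map each header line to its concatenated sequence string."""
    records = {}
    header = None  # no header seen yet
    for line in lines:
        line = line.strip()
        if line.startswith(header_prefix):
            header = line
            records[header] = ""
        elif line and header is not None:
            # Plain concatenation; skips blank lines and any data
            # that appears before the first header.
            records[header] += line
    return records


# Hypothetical sample data; the header codes are invented.
sample = [">sp|P1", "ATTTTGGCGG", "MNKPLOI", "", ">sp|P2", "AAAAAA", "GGGG"]
records = sequences_by_header(sample)

print(records[">sp|P1"])       # ATTTTGGCGGMNKPLOI
print(records[">sp|P2"][0:4])  # AAAA
```

Looking a sequence up by its header and then slicing the string gives the
indexing asked for in the original question.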
> --
> Cheers.
>
> Mark Lawrence.
>
>
>