[Tutor] Selecting text
kumar s
ps_python at yahoo.com
Wed Jan 19 05:12:32 CET 2005
Dear group:
I have two lists:
1. Lseq:
>>> len(Lseq)
30673
>>> Lseq[20:25]
['NM_025164', 'NM_025164', 'NM_012384', 'NM_006380',
'NM_007032','NM_014332']
2. refseq:
>>> len(refseq)
1080945
>>> refseq[0:25]
['>gi|10047089|ref|NM_014332.1| Homo sapiens small
muscle protein, X-linked (SMPX), mRNA',
'GTTCTCAATACCGGGAGAGGCACAGAGCTATTTCAGCCACATGAAAAGCATCGGAATTGAGATCGCAGCT',
'CAGAGGACACCGGGCGCCCCTTCCACCTTCCAAGGAGCTTTGTATTCTTGCATCTGGCTGCCTGGGACTT',
'CCCTTAGGCAGTAAACAAATACATAAAGCAGGGATAAGACTGCATGAATATGTCGAAACAGCCAGTTTCC',
'AATGTTAGAGCCATCCAGGCAAATATCAATATTCCAATGGGAGCCTTTCGGCCAGGAGCAGGTCAACCCC',
'CCAGAAGAAAAGAATGTACTCCTGAAGTGGAGGAGGGTGTTCCTCCCACCTCGGATGAGGAGAAGAAGCC',
'AATTCCAGGAGCGAAGAAACTTCCAGGACCTGCAGTCAATCTATCGGAAATCCAGAATATTAAAAGTGAA',
'CTAAAATATGTCCCCAAAGCTGAACAGTAGTAGGAAGAAAAAAGGATTGATGTGAAGAAATAAAGAGGCA',
'GAAGATGGATTCAATAGCTCACTAAAATTTTATATATTTGTATGATGATTGTGAACCTCCTGAATGCCTG',
'AGACTCTAGCAGAAATGGCCTGTTTGTACATTTATATCTCTTCCTTCTAGTTGGCTGTATTTCTTACTTT',
'ATCTTCATTTTTGGCACCTCACAGAACAAATTAGCCCATAAATTCAACACCTGGAGGGTGTGGTTTTGAG',
'GAGGGATATGATTTTATGGAGAATGATATGGCAATGTGCCTAACGATTTTGATGAAAAGTTTCCCAAGCT',
'ACTTCCTACAGTATTTTGGTCAATATTTGGAATGCGTTTTAGTTCTTCACCTTTTAAATTATGTCACTAA',
'ACTTTGTATGAGTTCAAATAAATATTTGACTAAATGTAAAATGTGA',
'>gi|10047091|ref|NM_013259.1| Homo sapiens neuronal
protein (NP25), mRNA',
'TGTGCTGCTATTGTGTGGATGCCGCGCGTGTCTTCTCTTCTTTCCAGAGATGGCTAACAGGGGCCCGAGC',
'TATGGCTTAAGCCGAGAGGTGCAGGAGAAGATCGAGCAGAAGTATGATGCGGACCTGGAGAACAAGCTGG',
'TGGACTGGATCATCCTGCAGTGCGCCGAGGACATAGAGCACCCGCCCCCCGGCAGGGCCCATTTTCAGAA',
'ATGGTTAATGGACGGGACGGTCCTGTGCAAGCTGATAAATAGTTTATACCCACCAGGACAAGAGCCCATA',
'CCCAAGATCTCAGAGTCAAAGATGGCTTTTAAGCAGATGGAGCAAATCTCCCAGTTCCTAAAAGCTGCGG',
'AGACCTATGGTGTCAGAACCACCGACATCTTTCAGACGGTGGATCTATGGGAAGGGAAGGACATGGCAGC',
'TGTGCAGAGGACCCTGATGGCTTTAGGCAGCGTTGCAGTCACCAAGGATGATGGCTGCTATCGGGGAGAG',
'CCATCCTGGTTTCACAGGAAAGCCCAGCAGAATCGGAGAGGCTTTTCCGAGGAGCAGCTTCGCCAGGGAC',
'AGAACGTAATAGGCCTGCAGATGGGCAGCAACAAGGGAGCCTCCCAGGCGGGCATGACAGGGTACGGGAT',
'GCCCAGGCAGATCATGTTAGGACGCGGCATCCTGCCCCTGGTAGAGAGGACGAATGTTCCACACCATGGT']
If Lseq[i] is present in refseq[k], then I am
interested in printing starting from refseq[k] until
the element that starts with '>' sign.
my Lseq has NM_014332 element and this is also present
in second list refseq. I want to print starting from
element where NM_014332 is present until next element
that starts with '>' sign.
In this case, it would be:
'>gi|10047089|ref|NM_014332.1| Homo sapiens small
muscle protein, X-linked (SMPX), mRNA',
'GTTCTCAATACCGGGAGAGGCACAGAGCTATTTCAGCCACATGAAAAGCATCGGAATTGAGATCGCAGCT',
'CAGAGGACACCGGGCGCCCCTTCCACCTTCCAAGGAGCTTTGTATTCTTGCATCTGGCTGCCTGGGACTT',
'CCCTTAGGCAGTAAACAAATACATAAAGCAGGGATAAGACTGCATGAATATGTCGAAACAGCCAGTTTCC',
'AATGTTAGAGCCATCCAGGCAAATATCAATATTCCAATGGGAGCCTTTCGGCCAGGAGCAGGTCAACCCC',
'CCAGAAGAAAAGAATGTACTCCTGAAGTGGAGGAGGGTGTTCCTCCCACCTCGGATGAGGAGAAGAAGCC',
'AATTCCAGGAGCGAAGAAACTTCCAGGACCTGCAGTCAATCTATCGGAAATCCAGAATATTAAAAGTGAA',
'CTAAAATATGTCCCCAAAGCTGAACAGTAGTAGGAAGAAAAAAGGATTGATGTGAAGAAATAAAGAGGCA',
'GAAGATGGATTCAATAGCTCACTAAAATTTTATATATTTGTATGATGATTGTGAACCTCCTGAATGCCTG',
'AGACTCTAGCAGAAATGGCCTGTTTGTACATTTATATCTCTTCCTTCTAGTTGGCTGTATTTCTTACTTT',
'ATCTTCATTTTTGGCACCTCACAGAACAAATTAGCCCATAAATTCAACACCTGGAGGGTGTGGTTTTGAG',
'GAGGGATATGATTTTATGGAGAATGATATGGCAATGTGCCTAACGATTTTGATGAAAAGTTTCCCAAGCT',
'ACTTCCTACAGTATTTTGGTCAATATTTGGAATGCGTTTTAGTTCTTCACCTTTTAAATTATGTCACTAA',
'ACTTTGTATGAGTTCAAATAAATATTTGACTAAATGTAAAATGTGA'
I could not think of any smart way to do this,
although I have tried like this:
>>> for ele1 in Lseq:
for ele2 in refseq:
if ele1 in ele2:
k = ele2
s = refseq[ele2].startswith('>')
print k,s
Traceback (most recent call last):
File "<pyshell#261>", line 5, in -toplevel-
s = refseq[ele2].startswith('>')
TypeError: list indices must be integers
I do not know how to dictate to python to select lines
between two > symbols.
Could any one help me thanks.
K
__________________________________
Do you Yahoo!?
Yahoo! Mail - 250MB free storage. Do more. Manage less.
http://info.mail.yahoo.com/mail_250
More information about the Tutor
mailing list