[Tutor] Python script for fasta file (seek help)
michelle_low
michelle_low at zoho.com
Sun Mar 24 08:45:49 CET 2013
Hi everyone,
Can someone please help me with the following Python script? When I run it, I get this message:
DeprecationWarning: the sets module is deprecated
from sets import Set
After googling, I tried the fixes others suggest (changing sets to set, or deleting the from sets import Set line), but neither of them works.
Can someone also suggest how to modify the code below so that the input file is read from standard input?
I'd like to run it with the Unix command
script.py < sequence.fna
Thanks a bunch.

#!/usr/local/bin/python
import math
from sets import Set
line = file("sequence.fna", "r")
for x in line:
    if x [0] == ">" :
        #determine the length of sequences
        s=line.next()
        s=s.rstrip()
        length = len(s)
        # determine the GC content
        G = s.count('G')
        C = s.count('C')
        GC= 100 * (float(G + C) / length)
        stList = list(s)
        alphabet = list(Set(stList))
        freqList = []
        for symbol in alphabet:
            ctr = 0
            for sym in stList:
                if sym == symbol:
                    ctr += 1
            freqList.append(float(ctr)/len(stList))
        # Shannon entropy
        ent = 0.0
        for freq in freqList:
            ent = ent + freq * math.log(freq, 2)
        ent = -ent
        print x
        print "Length:" , length
        print "G+C:" ,round(GC),"%"
        print 'Shannon entropy:'
        print ent
        print 'Minimum number of bits required to encode each symbol:'
        print int(math.ceil(ent))
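
For reference, here is a rough sketch of one way the script above could be adapted. It is not a definitive fix. It makes two changes: the built-in set type (available since Python 2.4) replaces sets.Set, so the deprecated import is no longer needed, and the records are read from sys.stdin instead of a hard-coded file name. The helper names read_fasta, gc_percent, shannon_entropy and main are made up for this example. Unlike the original, which takes only the single line after each header, this version assumes a sequence may be wrapped over several lines and joins them. The __future__ import is there so the same file should run under Python 2.6+ and Python 3.

```python
#!/usr/bin/env python
from __future__ import division, print_function

import math
import sys


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle.

    Assumption: a sequence may span several lines, so all lines up to
    the next '>' header are joined together.
    """
    header, chunks = None, []
    for line in handle:
        line = line.rstrip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_percent(seq):
    """Percentage of G and C symbols (upper case only, as in the original)."""
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


def shannon_entropy(seq):
    """Shannon entropy of seq in bits per symbol."""
    entropy = 0.0
    for symbol in set(seq):  # built-in set, no 'from sets import Set'
        freq = seq.count(symbol) / len(seq)
        entropy -= freq * math.log(freq, 2)
    return entropy


def main(handle):
    for header, seq in read_fasta(handle):
        if not seq:
            continue  # skip empty records to avoid dividing by zero
        ent = shannon_entropy(seq)
        print(header)
        print("Length:", len(seq))
        print("G+C: %.0f %%" % gc_percent(seq))
        print("Shannon entropy:")
        print(ent)
        print("Minimum number of bits required to encode each symbol:")
        print(int(math.ceil(ent)))


if __name__ == "__main__":
    # Run as: python script.py < sequence.fna
    main(sys.stdin)
```

Because main() takes any iterable of lines, the same function also works on an opened file, for example main(open("sequence.fna")), which makes it easy to try out without shell redirection.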