[Tutor] Phyton script for fasta file (seek help)
Peter Otten
__peter__ at web.de
Sun Mar 24 20:24:48 CET 2013
michelle_low wrote:
> Hi Peter,
> Thanks for your help.
You're welcome. In the future, please send your mail to the list, not to an
individual poster. That way more people get a chance to read it and you are
more likely to get help.
> I have followed your advice:
>
>
> -removed the line
>
> from sets import Set
>
> completely and replace Set with set
>
>
> -Replace the line
>
> > line = file("sequence.fna", "r")
>
> with
>
> import sys
> line = sys.stdin
> However, when I tried to run the script using command ./script.py <
> sequence.fna . nothing came up. The code just does't run. I used to be
able
> to execute the code and it printed out the list. Now I'm clueless and
stuck again.
If it worked to your satisfaction before I suggest that you doublecheck you
didn't accidentally make changes other than those I suggested.
> Can you please look at my script below? Thanks a million
I'd rather not ;) Seriously, without a clear notion of what the script is
supposed to do it is hard to fix its behaviour...
> #!/usr/local/bin/python
>
>
> import math
> import sys
>
>
> line = sys.stdin
>
>
> for x in line:
> if x [0] == ">" :
I suppose that's a ">" in your actual code. Otherwise the condition is
always False -- which would explain why you see nothing.
> #determine the length of sequences
> s=line.next()
> s=s.rstrip()
> length = len(s)
>
>
> # determine the GC content
> G = s.count('G')
> C = s.count('C')
> GC= 100 * (float(G + C) / length)
>
>
>
>
> stList = list(s)
> alphabet = list(set(stList))
>
>
> freqList = []
> for symbol in alphabet:
> ctr = 0
> for sym in stList:
> if sym == symbol:
> ctr += 1
>
> freqList.append(float(ctr)/len(stList))
The above line gave me an IndentationError. It should probably be
> freqList.append(float(ctr)/len(stList))
Had you answered Alan (publicly) there'd be no need to guess...
> # Shannon entropy
> ent = 0.0
> for freq in freqList:
> ent = ent + freq * math.log(freq, 2)
> ent = -ent
>
>
> print x
> print "Length:" , length
> print "G+C:" ,round(GC),"%"
> print 'Shannon entropy:'
> print ent
> print 'Minimum number of bits required to encode each symbol:'
> print int(math.ceil(ent))
PS: if you reuse code from someone else:
<http://code.activestate.com/recipes/577476-shannon-entropy-calculation/>
it's a good idea to give credits.
More information about the Tutor
mailing list