[Tutor] Python script for FASTA file (seeking help)

michelle_low michelle_low at zoho.com
Sun Mar 24 07:50:05 CET 2013



Hi everyone,




Can someone please help me with the following Python script? When I run it, I get this message:

  DeprecationWarning: the sets module is deprecated
    from sets import Set


After googling, I tried the fixes others suggest (changing "sets" to "set", or deleting the "from sets import Set" line), but neither of them works.
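One possible reason neither fix works on its own is that the two changes have to be made together: the import is removed and every Set(...) call is renamed to the built-in set(...), which has needed no import since Python 2.4. A minimal sketch of that replacement, using a made-up example sequence that is not from the script:

```python
# The built-in set type needs no import (available since Python 2.4).
sequence = "GATTACA"  # hypothetical example sequence, not from the original file

# Same idea as list(Set(list(sequence))) in the script, but without the
# deprecated sets module. sorted() is used only to get a stable order.
alphabet = sorted(set(sequence))
print(alphabet)
```

Iterating over a string already yields its characters, so the intermediate list(s) step is optional here.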


Can someone also suggest how to modify the code below so that the input file is read from standard input?
I'd like to run it with the Unix command


script.py <  sequence.fna
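Reading from standard input usually comes down to iterating over sys.stdin instead of an opened file. The sketch below is one way to do it, under some assumptions: read_fasta and main are hypothetical helper names, sequences may span several lines, and an in-memory stream stands in for the redirected file so the example runs on its own.

```python
import io
import sys


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text stream."""
    header, chunks = None, []
    for raw in handle:
        text = raw.strip()
        if not text:
            continue  # skip blank lines
        if text.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = text[1:], []
        elif header is not None:
            chunks.append(text)  # sequence data may span several lines
    if header is not None:
        yield header, "".join(chunks)


def main(stream=None):
    """Print each record's name and length; defaults to standard input."""
    stream = sys.stdin if stream is None else stream
    for name, seq in read_fasta(stream):
        print("%s\t%d" % (name, len(seq)))


# Demo with an in-memory stream. In the real script, call main() with no
# argument so the data comes from: script.py < sequence.fna
main(io.StringIO(u">seq1\nGATT\nACA\n>seq2\nCCGG\n"))
```

Passing the stream in as a parameter keeps the parsing code independent of where the data comes from, so the same function works for a file, a pipe, or a test string.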




Thanks a bunch.





#!/usr/local/bin/python

import math
from sets import Set

# each record is a ">" header line followed by one sequence line
line = file("sequence.fna", "r")

for x in line:
    if x[0] == ">":

        # determine the length of the sequence
        s = line.next()
        s = s.rstrip()
        length = len(s)

        # determine the GC content
        G = s.count('G')
        C = s.count('C')
        GC = 100 * (float(G + C) / length)

        # relative frequency of each distinct symbol
        stList = list(s)
        alphabet = list(Set(stList))

        freqList = []
        for symbol in alphabet:
            ctr = 0
            for sym in stList:
                if sym == symbol:
                    ctr += 1
            # append once per symbol, inside the loop over the alphabet
            freqList.append(float(ctr) / len(stList))

        # Shannon entropy
        ent = 0.0
        for freq in freqList:
            ent = ent + freq * math.log(freq, 2)
        ent = -ent

        print x
        print "Length:", length
        print "G+C:", round(GC), "%"
        print 'Shannon entropy:'
        print ent
        print 'Minimum number of bits required to encode each symbol:'
        print int(math.ceil(ent))
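
The per-sequence statistics can also be written more compactly. The following is only a sketch, not the original author's method: gc_percent and shannon_entropy are hypothetical function names, collections.Counter replaces the nested counting loops, and the example sequence is made up.

```python
import math
from collections import Counter


def gc_percent(seq):
    """Percentage of G and C symbols in seq (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def shannon_entropy(seq):
    """Shannon entropy in bits per symbol: -sum(p * log2(p))."""
    total = float(len(seq))
    counts = Counter(seq)  # one pass instead of one pass per symbol
    return -sum((n / total) * math.log(n / total, 2) for n in counts.values())


seq = "GGCCAATT"  # hypothetical sequence, not from sequence.fna
entropy = shannon_entropy(seq)
print("Length: %d" % len(seq))
print("G+C: %d %%" % round(gc_percent(seq)))
print("Shannon entropy: %s" % entropy)
print("Minimum bits per symbol: %d" % int(math.ceil(entropy)))
```

Counting with Counter visits the sequence once, whereas the nested loops visit it once for every distinct symbol; for a four-letter DNA alphabet the difference is small, but the code is shorter and harder to mis-indent.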

