[Tutor] Python script for fasta file (seek help)

Danny Yoo dyoo at hashcollision.org
Wed Apr 10 17:32:55 CEST 2013


Hi Ali,

Again, I recommend not reinventing a FASTA parser unless you really
need something custom here.  In this particular case, the function
ReadFasta here is slow on large inputs.  The culprit is the set of
lines:

            if line[0]=='>':
                prevLine=line[1:]
                dictFasta[prevLine]=''
            else:
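                dictFasta[prevLine]=dictFasta[prevLine]+line

which looks innocent on its own, but it is an O(n^2) string-appending
algorithm when we walk across a very long sequence such as a
chromosome.  Python strings are immutable, so each concatenation can
copy everything accumulated so far into a new string, and the total
work grows with the square of the sequence length.  The folks who
wrote the Biopython FASTA parser have almost certainly already
considered this pitfall.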
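
If you do end up needing something custom, here is a minimal sketch of
the usual way around the quadratic behavior: collect the lines of each
record in a list and join them once at the end.  The names read_fasta,
chunks and current are made up for illustration and are not from your
script.  I am also assuming that you want line endings stripped and
blank lines skipped, which may not match what ReadFasta does.

```python
import io


def read_fasta(handle):
    """Return {header: sequence} from an open FASTA file-like object.

    Hypothetical helper for illustration, not the original ReadFasta.
    """
    chunks = {}     # header -> list of sequence lines seen so far
    current = None  # header of the record being read
    for line in handle:
        line = line.rstrip("\r\n")  # assumption: drop line endings
        if not line:
            continue                # assumption: ignore blank lines
        if line.startswith(">"):
            current = line[1:]
            chunks[current] = []
        elif current is not None:
            # Appending to a list is cheap; nothing gets recopied here.
            chunks[current].append(line)
    # One join per record, linear in the length of that record.
    return {header: "".join(parts) for header, parts in chunks.items()}


# Small in-memory example standing in for a real file.
sample = io.StringIO(">seq1\nACGT\nTTGA\n>seq2\nGGCC\n")
records = read_fasta(sample)
```

Like the dictionary version above, this keeps only the last record if
a header shows up twice.  With Biopython installed, iterating over
SeqIO.parse(handle, "fasta") from the Bio package gives you records
without writing any of this yourself.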

