Python module for DNA to amino acid and reverse complement translation.
David Mathog
mathog at seqaxp.bio.caltech.edu
Tue Sep 5 23:39:43 EDT 2000
In article <etdd7ioult0.fsf at w20-575-31.mit.edu>, Alex <cut_me_out at hotmail.com> writes:
>Hi. Here is a python wrapper around some simple C functions that
>translate DNA sequences into amino acid residues and give their reverse
>complements. I guess it's something like 10 times faster than the pure
>python versions, but I haven't done any benchmarking.
>
>
>No doubt there is already a module out there that does these things.
>Apologies to whoever's work I'm duplicating.
>
Just for kicks you might want to compare this one with yours (for
speed):
http://seqaxp.bio.caltech.edu/pub/SOFTWARE/FASTTRANS.C
I use this to translate genpept or nr on the fly into either full
translated frames or the set of ORFs greater than (for instance) 75 AA.
It will handle a single DNA sequence entry up to 1 MB (use another program
in the pipe to fragment larger sequences if you expect to encounter them).
$ fasttrans
usage (UNIX): fasttrans 123456 [minAA] <in.nfa >out.pfa
usage (OpenVMS): pipe fasttrans 123456 [minAA] <in.nfa >out.pfa
input is a fasta dna sequence via stdin
output is the translated protein sequence via stdout
Specify the set of frames to translate on command line
1,2,3 are the 3 forward frames
4,5,6 are the 3 reverse frames
minAA is an optional value. If present and greater than zero
it emits each ORF that has at least that many AA residues
in it as a separate fasta fragment. If not present or set to zero
or less the entire translated frame is emitted
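For comparison against a pure Python version, the frame numbering and the
minAA filter described in the usage text can be sketched roughly as below.
This is not taken from FASTTRANS.C. The function names (reverse_complement,
translate_frame, orfs) are made up for illustration, and two details are
assumptions: that frames 4,5,6 are offsets 0,1,2 into the reverse complement,
and that an "ORF" is simply a stretch between stop codons. The standard
genetic code is used, with incomplete or ambiguous codons shown as X.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Complement each base, then reverse the string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate_frame(dna, frame):
    """Translate one frame: 1-3 forward, 4-6 on the reverse complement.

    Assumption: frame 4 starts at the first base of the reverse
    complement, frame 5 at the second, frame 6 at the third.
    """
    if not 1 <= frame <= 6:
        raise ValueError("frame must be in 1..6")
    seq = dna.upper()
    if frame > 3:
        seq = reverse_complement(seq)
    start = (frame - 1) % 3
    # Trailing bases that do not fill a whole codon are dropped.
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")  # X for ambiguous codons
        for i in range(start, len(seq) - 2, 3)
    )


def orfs(protein, min_aa):
    """Return stop-delimited stretches with at least min_aa residues.

    Assumption: an ORF here is any run between '*' characters; it is
    not required to begin with a methionine.
    """
    return [seg for seg in protein.split("*") if len(seg) >= min_aa]


if __name__ == "__main__":
    dna = "ATGGCCTAA"
    for frame in range(1, 7):
        print(frame, translate_frame(dna, frame))
```

Splitting the translated frame on '*' keeps the ORF filter to a single line,
at the cost of building the whole frame in memory first.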
This is ANSI C code and compiles cleanly on VMS and Linux (and probably
everywhere else).
Regards,
David Mathog
mathog at seqaxp.bio.caltech.edu
Manager, sequence analysis facility, biology division, Caltech