[Chicago] Perl Follow-up

Clyde Forrester clydeforrester at gmail.com
Sat Mar 13 06:45:02 CET 2010


And the results are in. I have a Perl program and a Python program, each
of which reads a 60 MB human Y chromosome file and computes the reverse
complement. The Perl program takes about 15 seconds, and the Python
program about 3 seconds. The Python program also tends to be slightly
more compact.

> # revcomp.py
> 
> import string
> 
> chrY_line_list = []
> for line in open('chrY.fa','r'):
>   if line[0] == '>':
>     continue
>   else:
>     chrY_line_list.append(line.strip('\n'))
> chrY_string = ''.join(chrY_line_list)
> print len(chrY_string)
> print chrY_string[10000:10020]
> 
> chrY_revcomp = chrY_string[::-1]
> trans = string.maketrans("ACGTacgt", "TGCAtgca")
> chrY_revcomp = chrY_revcomp.translate(trans)
> print len(chrY_revcomp)
> print chrY_revcomp[-10020:-10000]
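
For anyone running Python 3, here is a minimal sketch of the same approach. It assumes Python 3, where string.maketrans is gone and str.maketrans takes its place. The function names read_sequence and reverse_complement are illustrative and not part of the program above.

```python
# Python 3 sketch; the names below are illustrative, not from the original program.

# Translation table mapping each base to its complement, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def read_sequence(lines):
    """Join the sequence lines of a FASTA record, skipping '>' header lines."""
    return "".join(line.rstrip("\n") for line in lines if not line.startswith(">"))


def reverse_complement(seq):
    """Reverse the sequence, then complement each base."""
    return seq[::-1].translate(COMPLEMENT)


# Tiny in-memory stand-in for a FASTA file (hypothetical data).
sample = [">demo record\n", "ACGTTgca\n", "ttAC\n"]
seq = read_sequence(sample)
print(len(seq))                  # 12
print(reverse_complement(seq))   # GTaatgcAACGT
```

With a real file, an open file handle can be passed straight to read_sequence, since it only needs an iterable of lines. The two slices printed by the program above are related in the same way: chrY_revcomp[-10020:-10000] should be the reverse complement of chrY_string[10000:10020], which makes a quick sanity check.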

> # revcomp.pl
> 
> use strict;
> use warnings;
> 
> my $infile = 'chrY.fa';
> open (my $in,'<',$infile) or die "Can't read $infile: $!\n";
> chomp(my @dna = <$in>);
> close ($in);
> 
> while(substr($dna[0],0,1) eq '>') {
>   shift(@dna);
> }
> 
> my $dna = join('',@dna);
> my $length = length($dna);
> print "DNA length: $length\n";
> my $substr = substr($dna,10000,20);
> print "$substr\n";
> 
> my $revcomp = reverse $dna;
> $revcomp =~ tr/ACGTacgt/TGCAtgca/;
> $length = length($revcomp);
> print "revcomp length: $length\n";
> $substr = substr($revcomp,-10020,20);
> print "$substr\n";


Alex Gaynor wrote:
> On Fri, Mar 12, 2010 at 12:54 PM, Clyde Forrester
> <clydeforrester at gmail.com> wrote:
>> I raised some issues about Perl vs. Python, and I'd like to invite some
>> comment and advice.
>>
>> First, can anyone recommend a properly Pythonic way of doing translations?
>>
>> One example of such translations would be complementing DNA sequences.
>> Translating T to A, A to T, C to G, and G to C.
>>
> 
> >>> import string
> >>> trans = string.maketrans("TACG", "ATGC")
> >>> my_dna = "agtcaagta".upper()
> >>> my_dna.translate(trans)
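
The session above does not show its result, so here is a minimal sketch of the same lookup, assuming Python 3 (str.maketrans instead of string.maketrans). The second input string is made up for illustration.

```python
# Python 3 sketch of the complement lookup from the reply above.
# str.maketrans replaces Python 2's string.maketrans.
trans = str.maketrans("TACG", "ATGC")

my_dna = "agtcaagta".upper()          # "AGTCAAGTA"
complement = my_dna.translate(trans)
print(complement)                     # TCAGTTCAT

# Characters absent from the table, such as the ambiguity code N,
# pass through unchanged.
print("ANT".translate(trans))         # TNA
```

Because this table covers uppercase letters only, lowercase input has to be uppercased first, as the reply does. Otherwise the lowercase bases would pass through untranslated.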

