code for Computer Language Shootout

Steven Bethard steven.bethard at gmail.com
Wed Mar 16 00:45:48 EST 2005


Jacob Lee wrote:
> There are a bunch of new tests up at shootout.alioth.debian.org for which
> Python does not yet have code. I've taken a crack at one of them, a task
> to print the reverse complement of a gene transcription. Since there are a
> lot of minds on this newsgroup that are much better at optimization than
> I, I'm posting the code I came up with to see if anyone sees any
> opportunities for substantial improvement. Without further ado:
> 
> table = string.maketrans('ACBDGHK\nMNSRUTWVY', 'TGVHCDM\nKNSYAAWBR')
> 
> def show(s):
>     i = 0
>     for char in s.upper().translate(table)[::-1]:
>         if i == 60:
>             print
>             i = 0
>         sys.stdout.write(char)
>         i += 1
>     print
> 
> def main():
>     seq = ''
>     for line in sys.stdin:
>         if line[0] == '>' or line[0] == ';':
>             if seq != '':
>                 show(seq)
>                 seq = ''
>             print line,
>         else:
>             seq += line[:-1]
>     show(seq)
> 
> main()

Don't know if this is faster for your data, but I think you could also 
write this as (untested):

import string
import sys

# table as default argument value so you don't have to do
# a global lookup each time it's used

def show(seq, table=string.maketrans('ACBDGHK\nMNSRUTWVY',
                                      'TGVHCDM\nKNSYAAWBR')):
     seq = seq.upper().translate(table)[::-1]
     # print string in slices of length 60
     for i in range(0, len(seq), 60):
         print seq[i:i+60]

def main():
     seq = []
     # alias methods to avoid repeated lookup
     join = ''.join
     append = seq.append
     for line in sys.stdin:
         # note only one "line[0]" by using "in" test
         if line[0] in ';>':
             # note no need to check if seq is empty; show now prints
             # nothing for an empty string
             show(join(seq))
             print line,
             del seq[:]
         else:
             append(line[:-1])
     # show the last sequence, which has no header line after it
     show(join(seq))

main()
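
For anyone who wants to try the slicing approach without a Python 2 interpreter, here is a rough Python 3 sketch of the same idea. It is not the shootout submission: str.maketrans stands in for string.maketrans, and the names reverse_complement, demo, the out parameter and the width parameter are made up for this illustration so the result can be captured in a string instead of going to stdout.

```python
import io

# Python 3 sketch: str.maketrans replaces Python 2's string.maketrans.
# The two alphabets are the ones used in the code above.
TABLE = str.maketrans('ACBDGHK\nMNSRUTWVY', 'TGVHCDM\nKNSYAAWBR')

def show(seq, out, table=TABLE, width=60):
    """Write the reverse complement of seq to out, width characters per line."""
    seq = seq.upper().translate(table)[::-1]
    # an empty seq gives an empty range, so nothing is written
    for i in range(0, len(seq), width):
        out.write(seq[i:i + width] + '\n')

def reverse_complement(infile, out):
    """Hypothetical helper: copy header lines, reverse-complement the rest."""
    seq = []
    for line in infile:
        if line[0] in ';>':
            show(''.join(seq), out)
            out.write(line)
            del seq[:]
        else:
            seq.append(line[:-1])
    # flush the last sequence, which has no header after it
    show(''.join(seq), out)

def demo():
    """Run the sketch on a tiny made-up record and return the output text."""
    src = io.StringIO('>ONE test\nACGT\nAAC\n')
    out = io.StringIO()
    reverse_complement(src, out)
    return out.getvalue()
```

Calling demo() feeds the two sequence lines through as the single string 'ACGTAAC' and returns the header followed by its reverse complement.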


> Making seq into a list instead of a string (and using .extend instead of
> the + operator) didn't give any speed improvements. Neither did using a
> dictionary instead of the translate function, or using reversed() instead
> of s[::-1]. The latter surprised me, since I would have guessed using an
> iterator to be more efficient. Since the shootout also tests memory usage,
> should I be using reversed for that reason?

reversed() won't save you any memory -- you're already loading the 
entire string into memory anyway.
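
A small sketch of why, in Python 3 syntax and with a made-up input string rather than the shootout data: reversed() hands back a lazy iterator, but the original string is still fully in memory, and turning the iterator back into a string builds the same full-length result that the slice does.

```python
# a 1,000,000-character stand-in for a sequence (illustrative only)
seq = 'ACGT' * 250000

# the slice builds the reversed string directly
sliced = seq[::-1]

# reversed() is lazy, but join() still has to produce a complete new string
joined = ''.join(reversed(seq))

same_result = (sliced == joined)
```

Either way the process holds both the input and a reversed copy of the same length, so the iterator only changes how the copy is produced.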


Interesting tidbit:
     del seq[:]
tests faster than
     seq = []

$ python -m timeit -s "lst = range(1000)" "lst = []"
10000000 loops, best of 3: 0.159 usec per loop

$ python -m timeit -s "lst = range(1000)" "del lst[:]"
10000000 loops, best of 3: 0.134 usec per loop

del seq[:] is probably the right way to go in this case anyway -- no need to 
create a new empty list each time.
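
Timing aside, clearing in place also matters whenever a bound method such as seq.append has been aliased, as in the main() above: the alias stays attached to the original list object. A minimal sketch of the difference (the variable names in_place and rebound are only for illustration):

```python
seq = []
append = seq.append    # bound method tied to this particular list object

append('ACGT')
del seq[:]             # empties the same list object in place
append('TTAA')
in_place = list(seq)   # the alias still feeds seq

seq = []               # rebinds the name to a brand-new list
append('GGCC')         # still appends to the old list object
rebound = list(seq)    # the new list never sees the append
```

After the rebinding, everything passed to append goes to a list that seq no longer refers to, so the aliased version of main() needs the in-place form.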

STeVe


