String Splitter Brain Teaser

Michael Spencer mahs at telcopartners.com
Mon Mar 28 12:18:38 EST 2005


Bill Mill wrote:

> for very long genomes he might want a generator:
> 
> def xgen(s):
>     l = len(s) - 1
>     e = enumerate(s)
>     for i,c in e:
>         if i < l and s[i+1] == '/':
>             e.next()
>             i2, c2 = e.next()
>             yield [c, c2]
>         else:
>             yield [c]
> 
> 
> >>> for g in xgen('ATT/GATA/G'): print g
> ...
> ['A']
> ['T']
> ['T', 'G']
> ['A']
> ['T']
> ['A', 'G']
> 
> Peace
> Bill Mill
> bill.mill at gmail.com

This works according to the original spec, but there are a couple of issues:

1. the output is specified to be a list, so delaying the creation of the list 
isn't a win

2. this version falls down in the presence of "double degeneracies" (if that's 
what they should be called), which were not in the OP's spec but cropped up 
in a later post:
  >>> list(xgen("AGC/C/TGA/T"))
  [['A'], ['G'], ['C', 'C'], ['/'], ['T'], ['G'], ['A', 'T']]
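
For what it's worth, here is a minimal sketch of one way to cope with them. It 
assumes a run like C/C/T should collapse into a single group, which is my 
reading of the later post rather than anything in the OP's spec. It also 
assumes a stray leading '/' can simply be ignored. The name split_degenerate 
is made up for illustration.

```python
def split_degenerate(s):
    """Return a list of groups; bases joined by '/' share one group."""
    groups = []
    extend = False  # True right after a '/': the next base joins the last group
    for c in s:
        if c == '/':
            extend = True
        elif extend and groups:
            # Assumption: chained alternatives (C/C/T) all land in the same group.
            groups[-1].append(c)
            extend = False
        else:
            # Ordinary base, or a base after a leading '/' with nothing to join.
            groups.append([c])
            extend = False
    return groups
```

This builds the list directly, since a list is what the spec asks for, and it 
keeps no lookahead state beyond a single flag.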

Michael



