String Splitter Brain Teaser

James Stroud jstroud at mbi.ucla.edu
Sun Mar 27 17:39:06 EST 2005


Hello,

I have strings represented as a combination of an alphabet (AGCT) and an 
operator "/" that signifies degeneracy. I want to split these strings into 
lists of lists, where the degeneracies are members of the same list and 
non-degenerates are members of single-item lists. An example will clarify 
this:

"ATT/GATA/G"

gets split to

[['A'], ['T'], ['T', 'G'], ['A'], ['T'], ['A', 'G']]

I have written a very ugly function to do this (listed below for the curious), 
but intuitively I think this should only take a couple of lines for one 
skilled in regex and/or listcomp. Any takers?

James

p.s. Here is the ugly function I wrote:

def build_consensus(astr):

  consensus = []       # the list of lists that will be returned
  possibilities = []   # one element of consensus
  consecutives = 0     # keeps track of how many in a row

  for achar in astr:
    if (achar == "/"):
      consecutives = 0
      continue
    else:
      consecutives += 1
    if (consecutives > 1):
      consensus.append(possibilities)
      possibilities = [achar]
    else:
      possibilities.append(achar)
  if possibilities:
    consensus.append(possibilities)
  return consensus
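
For comparison, a shorter regex-based version might look something like the
sketch below. It is only an illustration, not a tested replacement: the name
split_degenerate is made up here, and it assumes the input is well formed
(only the letters AGCT, with every "/" sitting between two letters).

```python
import re

def split_degenerate(astr):
    # Hypothetical helper: each match is one base followed by zero or
    # more "/base" alternatives, e.g. "T/G" or a lone "A".
    tokens = re.findall(r"[AGCT](?:/[AGCT])*", astr)
    # Splitting each match on "/" yields the inner lists.
    return [token.split("/") for token in tokens]

print(split_degenerate("ATT/GATA/G"))
```

For "ATT/GATA/G" this should give the same list of lists as shown in the
example above. Characters outside AGCT and "/" are silently skipped by
findall, so malformed input would need a separate check.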

--
James Stroud, Ph.D.
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/